Python pandas create additional dataframe columns by grouping on existing column -


trying create new dataframe columns contents of existing column. easier explain example. convert this:

.	yr	month	class	cost  1	2015	1	l	19.2361  2	2015	1	m	29.4723  3	2015	1	s	48.5980  4	2015	1	t	169.7630  5	2015	2	l	19.1506  6	2015	2	m	30.0886  7	2015	2	s	49.3765  8	2015	2	t	167.0000  9	2015	3	l	19.3465  10	2015	3	m	29.1991  11	2015	3	s	46.2580  12	2015	3	t	157.7916  13	2015	4	l	18.3165  14	2015	4	m	28.2314  15	2015	4	s	44.5844  16	2015	4	t	162.3241  17	2015	5	l	17.4556  18	2015	5	m	27.0434  19	2015	5	s	42.8841  20	2015	5	t	159.3457  21	2015	6	l	16.5343  22	2015	6	m	24.9853  23	2015	6	s	40.5612  24	2015	6	t	153.4902

...into following can plot 4 separate lines [l, m, s, t]:

.	 yr	  month	   l	   m	   s	   t  1	2015	1	19.2361	29.4723	48.5980	169.7630  2	2015	2	19.1506	30.0886	49.3765	167.0000  3	2015	3	19.3465	29.1991	46.2580	157.7916  4	2015	4	18.3165	28.2314	44.5844	162.3241  5	2015	5	17.4556	27.0434	42.8841	159.3457  6	2015	6	16.5343	24.9853	40.5612	153.4902

i able work through in feels clumsy way, filtering dataframe on 'class' column... , 3 separate merges.

list_class = ['l', 'm', 's', 't'] year = 'yr' month = 'month' df_class = pd.dataframe() df_class1 = pd.dataframe() df_class2 = pd.dataframe() df_class1 = merge(df[[month, year, 'class','cost']][df['class']==list_class[0]], df[[month, year, 'class','cost']][df['class']==list_class[1]], \             left_on=[month, year], right_on=[month, year]) df_class2 = merge(df[[month, year, 'class','cost']][df['class']==list_class[2]], df[[month, year, 'class','cost']][df['class']==list_class[3]], \             left_on=[month, year], right_on=[month, year]) df_class = merge(df_class1, df_class2, left_on=[month, year], right_on=[month, year]).groupby([year, month]).mean().plot(figsize(15,8)) 

there must more efficient way. feels should done groupby, couldn't nail down.

you can first convert df multi-level index type , unstack level class give want. suppose df original dataframe shown on beginning of post.

df.set_index(['yr', 'month', 'class'])['cost'].unstack('class')  out[29]:  class             l        m        s         t yr   month                                      2015 1      19.2361  29.4723  48.5980  169.7630      2      19.1506  30.0886  49.3765  167.0000      3      19.3465  29.1991  46.2580  157.7916      4      18.3165  28.2314  44.5844  162.3241      5      17.4556  27.0434  42.8841  159.3457      6      16.5343  24.9853  40.5612  153.4902 

Comments

Popular posts from this blog

OpenCV OpenCL: Convert Mat to Bitmap in JNI Layer for Android -

android - org.xmlpull.v1.XmlPullParserException: expected: START_TAG {http://schemas.xmlsoap.org/soap/envelope/}Envelope -

python - How to remove the Xframe Options header in django? -