Python - KeyError in a for loop over a pandas DataFrame
I am putting data into a Bokeh layout for a heat map, but I am getting KeyError: '1'. It occurs right at the line num_calls = pivot_table[m][y].
Does anyone know why this would be?
The pivot table I am using is below:
    pivot_table.head()
    Out[101]:
    month          1    2    3    4    5    6    7    8    9
    companyname
    company 1    182  270  278  314  180  152  110  127  129
    company 2    163  147  192  142  186  231  214  130  112
    company 3    126   88   99  139   97   97   96   37   79
    company 4     84   89   71   95   80   89   83   88  104
    company 5     91   96   94   66   81   77   87   83   68

    month         10   11   12
    companyname
    company 1    117  127   81
    company 2    117   93  101
    company 3    116  111   95
    company 4     93   78   64
    company 5     83   95   65
Below is the section of code leading to the error:
    pivot_table = pivot_table.reset_index()
    pivot_table['companyname'] = [str(x) for x in pivot_table['companyname']]
    companies = list(pivot_table['companyname'])
    months = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"]
    pivot_table = pivot_table.set_index('companyname')

    # this is the colormap from the original plot
    colors = [
        "#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
        "#ddb7b1", "#cc7878", "#933b41", "#550b1d"
    ]

    # Set up the data for plotting. We will need to have values for every
    # pair of year/month names. Map the rate to a color.
    month = []
    company = []
    color = []
    rate = []
    for y in companies:
        for m in months:
            month.append(m)
            company.append(y)
            num_calls = pivot_table[m][y]  # KeyError: '1' is raised here
            rate.append(num_calls)
            color.append(colors[min(int(num_calls)-2, 8)])
And, upon request:
    pivot_table.info()
    <class 'pandas.core.frame.DataFrame'>
    Index: 46 entries, company1 to lastcompany
    Data columns (total 12 columns):
    1.0     46 non-null float64
    2.0     46 non-null float64
    3.0     46 non-null float64
    4.0     46 non-null float64
    5.0     46 non-null float64
    6.0     46 non-null float64
    7.0     46 non-null float64
    8.0     46 non-null float64
    9.0     46 non-null float64
    10.0    46 non-null float64
    11.0    46 non-null float64
    12.0    46 non-null float64
    dtypes: float64(12)
    memory usage: 4.5+ KB
and
    pivot_table.columns
    Out[103]: Index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0], dtype='object')
Also, the Bokeh code I am following is here: http://bokeh.pydata.org/en/latest/docs/gallery/unemployment.html
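I've tried the following code and it works on my PC. The likely cause of the error is that your column labels are the numbers 1.0 to 12.0 (see your pivot_table.columns output), while your months list holds the strings "1" to "12", so pivot_table['1'] finds no matching column.
I loop over the labels the frame actually has and use .loc, with the aim of avoiding a potential key error.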
    import pandas as pd
    import numpy as np

    # just following your previous post to simulate your data
    np.random.seed(0)
    dates = np.random.choice(pd.date_range('2015-01-01 00:00:00', '2015-06-30 00:00:00', freq='1h'), 10000)
    company = np.random.choice(['company' + x for x in '1 2 3 4 5'.split()], 10000)
    df = pd.DataFrame(dict(recvd_dttm=dates, companyname=company)).set_index('recvd_dttm').sort_index()
    df['c'] = 1
    df.columns = ['companyname', '']
    result = df.groupby([lambda idx: idx.month, 'companyname']).agg({df.columns[1]: sum}).reset_index()
    result.columns = ['month', 'companyname', 'counts']
    pivot_table = result.pivot(index='companyname', columns='month', values='counts')

    colors = [
        "#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
        "#ddb7b1", "#cc7878", "#933b41", "#550b1d"
    ]

    month = []
    company = []
    color = []
    rate = []
    # iterate over the labels that actually exist in the pivot table
    for y in pivot_table.index:
        for m in pivot_table.columns:
            month.append(m)
            company.append(y)
            num_calls = pivot_table.loc[y, m]
            rate.append(num_calls)
            color.append(colors[min(int(num_calls)-2, 8)])
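To see the mismatch in isolation, here is a minimal sketch. It assumes the column labels really are floats stored in an object index, as the pivot_table.columns output in the question suggests. The tiny frame and the names lookup_failed, via_loc, relabelled and via_strings are made up for illustration and are not the real data.

```python
import pandas as pd

# Hypothetical miniature of the pivot table: float column labels held in an
# object index, mirroring the pivot_table.columns output from the question.
pivot_table = pd.DataFrame(
    [[182.0, 270.0, 278.0], [163.0, 147.0, 192.0]],
    index=pd.Index(["company 1", "company 2"], name="companyname"),
    columns=pd.Index([1.0, 2.0, 3.0], dtype="object", name="month"),
)

# The string "1" is not the same label as the float 1.0, so this lookup fails.
try:
    pivot_table["1"]["company 1"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Option 1: iterate over the labels the frame actually has and use .loc.
via_loc = [
    pivot_table.loc[y, m]
    for y in pivot_table.index
    for m in pivot_table.columns
]

# Option 2: relabel the columns as the strings "1", "2", ... so that a
# months list of strings, like the one in the question, matches.
relabelled = pivot_table.copy()
relabelled.columns = [str(int(c)) for c in relabelled.columns]
via_strings = relabelled["1"]["company 1"]
```

Option 1 is what the code above does. Option 2 may be the better fit if the month labels have to be strings later on, for example as categorical axis values in the plot, but that depends on how the rest of the Bokeh code uses them.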