python - Pandas Optimized Way to Create Dummy-Variable?

I am creating a new dummy variable based on a given column and a criterion. Below is the code I am working with. It works, but it is slow. Is there a faster, maybe vectorized, way to create dummies in pandas? Specifically, for this example?

I have looked at the get_dummies function in pandas, but it seems a little different from what I am doing here. I could be wrong, though; if someone has a way to make get_dummies work for this example, that would be an acceptable answer too.

def flagger(row, criteria, col):
    if row[col] <= criteria:
        return 1
    if row[col] > criteria:
        return 0

dstk['dropflag'] = dstk.apply(lambda row: flagger(row, criteria, col), axis=1)

Edit: There are 2 answers here. At a glance both are equally fast (at least the same order of magnitude), so I accepted one. If someone wants to do more serious profiling, I am happy to revise my answer choice.

Why not try np.where? It's a column-wise vectorized operation, and it is faster than a row-wise apply.

dstk['dropflag'] = np.where(dstk[col] <= criteria, 1, 0)
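
For anyone who wants to try this out, here is a minimal self-contained sketch of the vectorized approach. The sample data, the column name "price" and the threshold of 10 are made up for illustration and are not from the question. It also shows an equivalent alternative, casting the boolean mask to integers, which should give the same flags.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name and threshold are assumptions.
dstk = pd.DataFrame({"price": [5.0, 12.5, 8.0, 20.0, 10.0]})
col = "price"
criteria = 10

# Option 1: np.where evaluates the comparison on the whole column at once.
flag_where = np.where(dstk[col] <= criteria, 1, 0)

# Option 2: cast the boolean mask directly to integers.
flag_mask = (dstk[col] <= criteria).astype(int)

dstk["dropflag"] = flag_where
print(dstk)
```

One difference to keep in mind: for a missing value (NaN) the row-wise flagger returns None, because neither comparison is true, whereas both vectorized options above produce 0.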
