Python find-and-replace script doing an incomplete job
From an answer to a previous question (an efficient means of mass-converting >600,000 different find-and-replaces within one file), I received a bit of Python code for doing thousands of find/replaces en masse.
The code worked then, and I am now trying to use it to conduct ~20,000 find/replaces on the second column of a table that looks like this:
1 1:565596 0 565596
1 1:567137 0 567137
1 rs7419119 0 842013
1 rs13302957 0 891021
1 rs6696609 0 903426
1 rs8997 0 949654
There are 600,000 lines in the overall table. The values with colons (e.g., 1:565596) need to be replaced with other values. Each substitution is needed only once in the data table. I have a substitution table that looks similar to the one I used before:
coordinate affy_snp_id
17:4744823 affx-14001316
1:8362754 affx-14761301
3:128686912 affx-21139873
8:11830502 affx-31417216
12:23201352 affx-7455925
The code adapted from the answer above is implemented as such:
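import csv

# Lookup table: coordinate -> affy_snp_id (tab-delimited, two columns)
subs = dict(csv.reader(open('coordtoaffy.txt'), delimiter='\t'))
source = csv.reader(open('neanderthal.map'), delimiter='\t')
dest = csv.writer(open('neanderthal-affy.map', 'wb'), delimiter='\t')

for row in source:
    # Replace column 2 if it is a key in subs; otherwise leave it unchanged
    row[1] = subs.get(row[1], row[1])
    dest.writerow(row)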
The code worked before (in the question linked above), but in this case it is only completing a small subset of the required substitutions. Does anyone have input on how to fix this? Thanks!
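One way to narrow this down is to count how many colon-style IDs in column 2 actually appear as keys in the substitution table, and how many would match only after stripping stray whitespace. The sketch below is a diagnostic, not a fix, and it does not claim to identify the cause. The function name match_report is hypothetical, the sample rows are illustrative (the map rows for chromosomes 17 and 1:8362754 are made up, and the trailing space in one key is inserted on purpose), and it is written for Python 3, whereas the 'wb' mode above suggests Python 2.

```python
import csv

def match_report(subs_lines, map_lines, delimiter="\t"):
    """Return (exact matches, matches only after strip(), unmatched colon IDs)."""
    # Build the lookup table, keeping only rows that split into exactly two fields.
    subs = {}
    for row in csv.reader(subs_lines, delimiter=delimiter):
        if len(row) == 2:
            subs[row[0]] = row[1]
    stripped_keys = {k.strip() for k in subs}

    exact = 0
    after_strip = 0
    missing = []
    for row in csv.reader(map_lines, delimiter=delimiter):
        # Only colon-style IDs in column 2 are candidates for replacement.
        if len(row) < 2 or ":" not in row[1]:
            continue
        key = row[1]
        if key in subs:
            exact += 1
        elif key.strip() in stripped_keys:
            after_strip += 1  # would match if whitespace were removed
        else:
            missing.append(key)
    return exact, after_strip, missing

# Illustrative in-memory samples (assumed data, not the real files).
subs_sample = [
    "coordinate\taffy_snp_id",
    "17:4744823\taffx-14001316",
    "1:8362754 \taffx-14761301",  # note the deliberate trailing space in the key
]
map_sample = [
    "17\t17:4744823\t0\t4744823",
    "1\t1:8362754\t0\t8362754",
    "1\t1:565596\t0\t565596",
    "1\trs7419119\t0\t842013",
]
exact, after_strip, missing = match_report(subs_sample, map_sample)
print(exact, after_strip, missing)
```

To run it against the real files, open file objects for coordtoaffy.txt and neanderthal.map can be passed in place of the sample lists. A large after_strip count would point to whitespace differences between the two files, while a long missing list would suggest that those coordinates are simply absent from the substitution table or that the delimiter is not a tab.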