python - Reading a tar compressed file in pandas?


This code seems to work: it takes a list of files, compresses them in a format pandas can read, and combines them in one location.

Edit: I modified the code to add only new files (based on the file not already existing in the tar).

    import os
    import tarfile
    from glob import glob

    os.chdir(r'c:\\users\documents\ftp\\')
    saveloc = r'\\fnp\mydownloads\\'
    compression = "w:bz2"       # write mode, bzip2-compressed
    extension = '.tar.bz2'

    # archive 1: performance files
    filename = 'global_performance'
    filetype = 'performance_*.csv'
    tarname = saveloc+filename+extension

    files = glob(filetype)
    tar = tarfile.open(tarname, compression)
    for file in files:
        # intended to skip files already in the archive
        # (tarname is the archive path string)
        if file not in tarname:
            tar.add(file)
    tar.close()

    # archive 2: status files (same steps, different names)
    filename = 'global_status'
    filetype = 'status_*.csv'
    tarname = saveloc+filename+extension

    files = glob(filetype)
    tar = tarfile.open(tarname, compression)
    for file in files:
        if file not in tarname:
            tar.add(file)
    tar.close()

  1. Is there a way for pandas to read the tar file? Can I specify a file I know exists within the archive, or perhaps concat all of the files in one read?
  2. Being able to add new files is nice, but I assume the computer has to read all of the file names to determine whether each one exists or not. Is there a way to modify the code to add only the latest files, based on creation date or something? Can this be sped up to compress and read only the newest files, or perhaps those within a time range (30 days, maybe, instead of reading all files in a directory that goes back to 2010)?
  3. As you can see above, I am reading each file type within a directory (based on the filename) and adding it to a separate tar. Is there a way to optimize this a bit instead of pasting the same code over and over (there are 10+ files to do)?
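
For the first question, one possible approach is to open the archive with `tarfile` and hand the extracted member to `pd.read_csv` as a file object. The sketch below is only an illustration under a few assumptions: the members are plain CSV files with compatible columns, and the helper names `read_member` and `read_all_members` are hypothetical, not part of pandas or of the code above.

```python
import tarfile

import pandas as pd


def read_member(tar_path, member_name, **read_csv_kwargs):
    """Read one named CSV member of a tar archive into a DataFrame.

    Hypothetical helper: raises KeyError if the member is not in the archive.
    """
    # "r:*" lets tarfile detect the compression (bz2, gz, xz or none).
    with tarfile.open(tar_path, "r:*") as tar:
        fileobj = tar.extractfile(member_name)
        return pd.read_csv(fileobj, **read_csv_kwargs)


def read_all_members(tar_path, suffix=".csv", **read_csv_kwargs):
    """Concatenate every regular-file member whose name ends with `suffix`.

    Hypothetical helper: assumes all matching members share the same columns.
    """
    frames = []
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith(suffix):
                frames.append(pd.read_csv(tar.extractfile(member), **read_csv_kwargs))
    # pd.concat raises ValueError if no member matched.
    return pd.concat(frames, ignore_index=True)
```

With the archive from the question, a call might look like `read_member(tarname, 'performance_2015.csv')`, where the member name is made up for illustration. Any extra keyword arguments are passed straight through to `pd.read_csv`.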

Edit: This code seems to operate slowly. My intention is to find the newest files that are not yet within the tar, compress them, and add them to the existing tar. Based on the time it is taking, I think it is still compressing all of the files and replacing them. Can anyone help me make this a more efficient process?
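
A likely reason for the slowness is that opening an archive with mode `"w:bz2"` creates it from scratch each time, and the `tarfile` documentation states that appending to a compressed archive (`'a:gz'`, `'a:bz2'`, `'a:xz'`) is not possible. One workaround, sketched below, swaps in an uncompressed `.tar` opened in append mode `"a"`, which is a different trade-off from the original code (faster updates, larger file). The function names `add_new_files` and `update_archives`, the use of modification time as a stand-in for creation date, and storing members under their base names are all assumptions made for this illustration.

```python
import glob
import os
import tarfile
import time


def add_new_files(tar_path, pattern, max_age_days=None):
    """Append files matching `pattern` that the archive does not hold yet.

    Hypothetical helper. Uses an uncompressed tar because tarfile cannot
    append to a compressed one. Returns the member names that were added.
    """
    # Assumption: modification time stands in for "creation date".
    cutoff = None if max_age_days is None else time.time() - max_age_days * 86400
    added = []
    # Mode "a" appends to an existing archive, or creates it if missing.
    with tarfile.open(tar_path, "a") as tar:
        existing = set(tar.getnames())
        for path in sorted(glob.glob(pattern)):
            member_name = os.path.basename(path)
            if member_name in existing:
                continue  # already archived, skip without re-reading the file
            if cutoff is not None and os.path.getmtime(path) < cutoff:
                continue  # older than the requested time window
            tar.add(path, arcname=member_name)
            added.append(member_name)
    return added


def update_archives(save_dir, patterns, max_age_days=None):
    """Run add_new_files once per (archive name, glob pattern) pair."""
    return {
        archive: add_new_files(
            os.path.join(save_dir, archive + ".tar"), pattern, max_age_days
        )
        for archive, pattern in patterns.items()
    }
```

The repeated blocks then collapse into one call such as `update_archives(saveloc, {'global_performance': 'performance_*.csv', 'global_status': 'status_*.csv'}, max_age_days=30)`, and further file types only need another dictionary entry. If compression is still required, the finished `.tar` could be compressed in a separate step once the updates are done.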

