java - Sequence file created gives strange output in Hadoop
I want to combine several small bzip2 files into a sequence file. I saw some code that creates a sequence file and tried it, but it gives the strange output shown below. Is this because it is unable to read the bzip2 files?
seqorg.apache.hadoop.io.textorg.apache.hadoop.io.text �*org.apache.hadoop.io.compress.defaultcodec����gwŒ‚Êo≈îbº¡vœÖ��� ��� .ds_storexúÌò± ¬0eÔ4.s∫a�6∞¢0p∞=0ì·‡/d)Ädï˛ì¨w≈ù7÷ùØ›⁄Öüo;≥x¬`’∂µóÆ Æâ¡=Ñ b±lp6Û˛ÜbÅå˜c¢3}ª‘�lp¥oä"ùËl?jk�&:⁄”Åét¢3]Î º∑¿˘¸68§ÄÉùø:µ√™*é-¿fifi>!~¯·0Ùˆú ¶ eõ¯c‡ÍÉa◊':”ÍÑòù;i1•�∂©���00.json.bz2xúl\gwtk∞% ,y ä( hjfêúsŒ\prrrŒ9ÁcŒ9√0ÃzuÏÌÊΩÔ≤Ù‚Ãô”’uªvÌÍÓ3£oˆä2ä<˝”-”ãȧπË/d;u¥Û£üv;ÀÒÛ¯Ú˜ˇ˚…≥2¢5Í0‰˝8m⁄,s¸¢`f•†`o<ëüd£≈tÃ¥ó`•´d˚~aº˝«õ˜v'≠)(f|§fiÆÕ ?y¬àœtÒÊyåb…u%e?⁄§efiwˇÒy#üÛÓÓ‚ ⁄è„ÍåÚÊu5‡ æ‚Â?q‘°�À{©?íwyü÷Èûf<[˘éŒhãd>x_ÅÁ fiÒ_eâ5-—|-m)˙)¸r·ªcÆßs„f>uŒ©ß{o„uÔ&∫˚˚Ÿ?Ä©ßw,”◊Ê∫â«õxã¸[yûgÈñfmx|‡ªÍ¶”¶‡Óp-∆ú§ı <jn t «f4™@Àä¥jœ¥‰√|e„‘œ„&º§@g|ˆá{iõox
The code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.GenericOptionsParser;

public class cinput {

    /**
     * @param args
     * @throws IOException
     * @throws IllegalAccessException
     * @throws InstantiationException
     */
    public static void main(String[] args) throws IOException,
            InstantiationException, IllegalAccessException {
        // TODO Auto-generated method stub
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        String[] otherArgs = new GenericOptionsParser(conf, args)
                .getRemainingArgs();
        Path inputFile = new Path(otherArgs[0]);
        Path outputFile = new Path(otherArgs[1]);
        FSDataInputStream inputStream;
        Text key = new Text();
        Text value = new Text();
        SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf,
                outputFile, key.getClass(), value.getClass());
        FileStatus[] fstatus = fs.listStatus(inputFile);
        for (FileStatus fst : fstatus) {
            String str = "";
            System.out.println("Processing file : " + fst.getPath().getName()
                    + " , size : " + fst.getPath().getName().length());
            inputStream = fs.open(fst.getPath());
            key.set(fst.getPath().getName());
            while (inputStream.available() > 0) {
                str = str + inputStream.readLine();
                // System.out.println(str);
            }
            value.set(str);
            writer.append(key, value);
        }
        fs.close();
        IOUtils.closeStream(writer);
        System.out.println("Sequence file created successfully........");
    }
}
The input I am passing is JSON bzip2 files. Please point out why I am getting this strange output.
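A possible reading of the output, offered as a sketch rather than a confirmed diagnosis: part of it may be expected. A SequenceFile is a binary container that begins with the magic bytes SEQ, followed by the key and value class names and, when compression is enabled, the codec class name. That matches the start of the dump, so printing the file as plain text will always look garbled; `hadoop fs -text` is the usual way to view the records. Separately, the code copies the raw bytes of each .bz2 file without decompressing them, and the listing also picks up the .DS_Store file in the input directory. The plain-Java snippet below uses a hypothetical `MagicBytes` helper (not part of Hadoop) to show how the two formats can be told apart by their leading bytes, assuming the first few bytes of the file are available.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: identifies a format by its leading "magic" bytes.
final class MagicBytes {
    // A SequenceFile starts with the ASCII bytes 'S', 'E', 'Q', then a one-byte version.
    private static final byte[] SEQ = "SEQ".getBytes(StandardCharsets.US_ASCII);
    // A bzip2 stream starts with the ASCII bytes 'B', 'Z', 'h'.
    private static final byte[] BZIP2 = "BZh".getBytes(StandardCharsets.US_ASCII);

    private MagicBytes() {}

    // Returns true if data begins with every byte of magic, in order.
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    static boolean looksLikeSequenceFile(byte[] head) {
        return startsWith(head, SEQ);
    }

    static boolean looksLikeBzip2(byte[] head) {
        return startsWith(head, BZIP2);
    }

    public static void main(String[] args) {
        byte[] seqHead = {'S', 'E', 'Q', 6};   // made-up sample header bytes
        byte[] bz2Head = {'B', 'Z', 'h', '9'}; // made-up sample header bytes
        System.out.println(looksLikeSequenceFile(seqHead));
        System.out.println(looksLikeBzip2(bz2Head));
        System.out.println(looksLikeBzip2(seqHead));
    }
}
```

Running `main` prints true, true, false. If the values are meant to hold the JSON text itself, each input would need to be read through a decompressing stream first; Hadoop's `CompressionCodecFactory` can look up a codec from the file extension, though that step is not shown here.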