
Hadoop: custom InputFormat and OutputFormat

2013-02-19 

Hadoop's InputFormat and OutputFormat

The best example is the Vertica connector: although it is implemented as Pig UDFs, it is essentially just a Hadoop InputFormat and OutputFormat, so it can be reused as-is in Hive. Download link: http://blackproof.iteye.com/blog/1791995


Here is also a simple example of the InputFormat and OutputFormat used in a project that implements a Hadoop join (a sketch of the InputFormat side and the project's OutputFormat appear below):

The Hadoop join itself is described at http://blackproof.iteye.com/blog/1757530
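The custom InputFormat code from that project is not shown here; as a rough orientation only, a minimal sketch of the usual shape of a FileInputFormat subclass follows. The class name MyInputFormat and the delegation to the stock LineRecordReader are illustrative assumptions, not the project's actual code.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

// Sketch only: the generic parameters are the key/value types handed to the mapper.
public class MyInputFormat extends FileInputFormat<LongWritable, Text> {

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit split,
            TaskAttemptContext context) throws IOException, InterruptedException {
        // Delegate to the built-in line reader; a real join InputFormat would
        // parse each line into the project's own key/value classes here.
        return new LineRecordReader();
    }
}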

Custom OutputFormat (the generic parameters are the reducer's output key and value types):

import java.io.DataOutputStream;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Employee is the project's custom value type, defined elsewhere in the join example.
public class MyOutputFormat extends FileOutputFormat<Text, Employee> {

    @Override
    public RecordWriter<Text, Employee> getRecordWriter(TaskAttemptContext job)
            throws IOException, InterruptedException {
        Configuration conf = job.getConfiguration();
        // One output file per task attempt, under the job's work directory.
        Path file = getDefaultWorkFile(job, "");
        FileSystem fs = file.getFileSystem(conf);
        FSDataOutputStream fileOut = fs.create(file, false);
        return new MyRecordWriter(fileOut);
    }

    public static class MyRecordWriter extends RecordWriter<Text, Employee> {

        public static final String NEW_LINE = System.getProperty("line.separator");

        protected DataOutputStream out;
        private final byte[] keyValueSeparator;

        public MyRecordWriter(DataOutputStream out) {
            this(out, ":");
        }

        public MyRecordWriter(DataOutputStream out, String keyValueSeparator) {
            this.out = out;
            this.keyValueSeparator = keyValueSeparator.getBytes();
        }

        @Override
        public void write(Text key, Employee value) throws IOException, InterruptedException {
            // Emit "key:value" (the separator is configurable), one record per line.
            if (key != null) {
                out.write(key.toString().getBytes());
                out.write(keyValueSeparator);
            }
            out.write(value.toString().getBytes());
            out.write(NEW_LINE.getBytes());
        }

        @Override
        public void close(TaskAttemptContext context) throws IOException, InterruptedException {
            out.close();
        }
    }
}
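For context, a minimal driver sketch showing how such formats are plugged into a job. It assumes the Hadoop 2.x mapreduce API, the MyOutputFormat above, and the MyInputFormat sketch from earlier; the mapper and reducer class names are hypothetical and therefore commented out.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JoinDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "employee join");
        job.setJarByClass(JoinDriver.class);

        // Hypothetical mapper/reducer classes for the join, not shown here:
        // job.setMapperClass(EmployeeJoinMapper.class);
        // job.setReducerClass(EmployeeJoinReducer.class);

        // Custom formats are registered exactly like the built-in ones.
        job.setInputFormatClass(MyInputFormat.class);
        job.setOutputFormatClass(MyOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Employee.class);   // the project's custom value type

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With this wiring, each record the reducer emits ends up in the task's part file as a "key:value" line, exactly as written by MyRecordWriter.write() above.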
