Configuring LZO Support in Hadoop

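1. Build hadoop-lzo from source. The build artifact is named hadoop-lzo-0.4.21-SNAPSHOT.jar here, while the steps below refer to hadoop-lzo-0.4.20.jar; substitute the file name your build actually produces.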

2. Upload hadoop-lzo-0.4.20.jar to /opt/module/hadoop-3.1.3/share/hadoop/common

3. Edit core-site.xml to register the LZO codecs


<property>
    <name>io.compression.codecs</name>
    <value>
            org.apache.hadoop.io.compress.GzipCodec,
            org.apache.hadoop.io.compress.DefaultCodec,
            org.apache.hadoop.io.compress.BZip2Codec,
            org.apache.hadoop.io.compress.SnappyCodec,
            com.hadoop.compression.lzo.LzoCodec,
            com.hadoop.compression.lzo.LzopCodec
    </value>
</property>

<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
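Before distributing the file, it can help to confirm that both LZO codec classes actually made it into core-site.xml. The following is a minimal sketch, not part of the original walkthrough; the function name check_lzo_codecs is hypothetical, and it only does a plain text search rather than parsing the XML.

```shell
# Hypothetical helper: check that a core-site.xml mentions both LZO codec classes.
# Usage: check_lzo_codecs /path/to/core-site.xml
check_lzo_codecs() {
    conf_file="$1"
    for codec in com.hadoop.compression.lzo.LzoCodec \
                 com.hadoop.compression.lzo.LzopCodec; do
        # -F: match the class name literally, -q: no output, exit status only
        if ! grep -qF "$codec" "$conf_file"; then
            echo "missing codec: $codec"
            return 1
        fi
    done
    echo "LZO codecs configured"
}
```

On a running cluster, hdfs getconf -confKey io.compression.codecs prints the value Hadoop actually reads, which is a useful cross-check after the restart.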

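4. Distribute hadoop-lzo-0.4.20.jar and core-site.xml to the other nodes, then restart the cluster. Run each command from the directory that contains the file so that `pwd` resolves to the same path on the target node.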

scp ./hadoop-lzo-0.4.20.jar node02:`pwd`
scp ./core-site.xml node02:`pwd`
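With more than one worker node, the two scp commands can be wrapped in a loop. This is a rough sketch under assumptions: only node02 appears in the original, so any other host name is hypothetical, and the location of core-site.xml under etc/hadoop is assumed from the standard Hadoop 3 layout. The function name distribute and the DRY_RUN switch are illustrative as well.

```shell
# Hypothetical helper: copy the LZO jar and core-site.xml to every node in NODES.
# NODES defaults to node02 (the only host named in the walkthrough).
# With DRY_RUN=1 the scp commands are printed instead of executed.
distribute() {
    hadoop_dir=/opt/module/hadoop-3.1.3
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    for node in ${NODES:-node02}; do
        $run scp "$hadoop_dir/share/hadoop/common/hadoop-lzo-0.4.20.jar" \
            "$node:$hadoop_dir/share/hadoop/common/"
        # Assumption: configuration files live under etc/hadoop.
        $run scp "$hadoop_dir/etc/hadoop/core-site.xml" \
            "$node:$hadoop_dir/etc/hadoop/"
    done
}
```

Calling it as DRY_RUN=1 NODES="node02 node03" distribute shows the commands first, which makes it easy to review the target paths before anything is copied.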

5. Prepare the input data

hdfs dfs -mkdir /input
hadoop fs -put README.txt /input

6. Run wordcount with LZO-compressed output

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec  /input /output

7. The job writes its result to /output as .lzo files

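8. Split support. Upload a large .lzo file and run wordcount with LzoTextInputFormat. Without an index the file is not split and is read by a single map task. A fresh output directory is used because the job fails if the output directory already exists.

hadoop fs -put bigtable.lzo /input
hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount -Dmapreduce.job.inputformat.class=com.hadoop.mapreduce.LzoTextInputFormat /input /output1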

9. Build the index file

hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.20.jar  com.hadoop.compression.lzo.DistributedLzoIndexer /input/bigtable.lzo
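The indexer writes an index next to each .lzo file, using the same name with an .index suffix (bigtable.lzo.index here). A small filter over the output of hadoop fs -ls can show which .lzo files still lack one. This is a sketch only: the function name unindexed_lzo is hypothetical, and it assumes paths without spaces, since it takes the last whitespace-separated field of each listing line as the path.

```shell
# Hypothetical helper: read `hadoop fs -ls` output on stdin and print every
# .lzo path that has no matching .lzo.index path in the same listing.
unindexed_lzo() {
    awk '
        { path = $NF }                      # last field of a listing line is the path
        path ~ /\.lzo$/        { lzo[path] = 1 }
        path ~ /\.lzo\.index$/ { sub(/\.index$/, "", path); indexed[path] = 1 }
        END { for (p in lzo) if (!(p in indexed)) print p }
    ' | sort
}
```

Typical use would be hadoop fs -ls /input | unindexed_lzo; empty output means every .lzo file in the listing has an index.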

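10. Run the job again. With the index in place, the .lzo file can be split across several map tasks. A fresh output directory is used because the job fails if the output directory already exists.

hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount -Dmapreduce.job.inputformat.class=com.hadoop.mapreduce.LzoTextInputFormat /input /output2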
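One way to compare the runs before and after indexing is to look at the split count that the job client logs at submission. The sketch below assumes the client prints a line containing "number of splits:" followed by a number, as the MapReduce job submitter normally does; the function name split_count is hypothetical.

```shell
# Hypothetical helper: read job client log output on stdin and print the
# split count from the last "number of splits:N" line, if any.
split_count() {
    grep -o 'number of splits:[0-9][0-9]*' | tail -n 1 | cut -d: -f2
}
```

Because the client usually logs to stderr, the job output would be piped as hadoop jar ... 2>&1 | split_count. A larger number after indexing indicates that the .lzo file is now being split.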

 

Sharing is welcome; when reposting, please credit the source: 内存溢出

Original URL: https://outofmemory.cn/zaji/5705981.html
