Errors and Solutions When Submitting MapReduce Jobs from Eclipse on Windows to a Hadoop Cluster on Linux


Preface: I am using Hadoop 2.6.0. While writing MapReduce programs in Eclipse on Windows and submitting them to the cluster, I ran into several errors. Below are the solutions I found for each of them.

Error 1:

2021-11-29 16:28:48,544 ERROR [org.apache.hadoop.util.Shell] - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable G:\hadoop-2.6.0\hadoop-2.6.0\bin\winutils.exe in the Hadoop binaries.

Cause: the winutils.exe binary is missing. winutils.exe is a helper that Hadoop needs when running or being debugged on Windows; it bundles the basic native utilities that Hadoop (and Spark) rely on under Windows. Eclipse also needs it when debugging Hadoop programs, and the corresponding environment variable must be configured.

Solution: first extract a local copy of the exact Hadoop version you are using, then copy the matching winutils.exe into the bin directory of that Windows Hadoop installation, set the HADOOP_HOME environment variable to point at it, and restart Eclipse (the original post illustrated these steps with a screenshot).
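Alternatively, the Hadoop home can be set from code before the job is configured. A minimal sketch, assuming the installation path from the error message above:

    // Hadoop's Shell class checks the "hadoop.home.dir" system property
    // before the HADOOP_HOME environment variable, so this must run before
    // any Hadoop class is touched; the path below is illustrative.
    System.setProperty("hadoop.home.dir", "G:/hadoop-2.6.0/hadoop-2.6.0");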

Error 2:

2021-11-29 16:23:40,016 INFO [org.apache.hadoop.yarn.client.RMProxy] - Connecting to ResourceManager at master/192.168.128.161:8032
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=zzp28, access=EXECUTE, inode="/tmp":root:supergroup:drwx------


Cause: permission denied. The Windows user submitting the job (here zzp28) has no access to the HDFS path /tmp, which is owned by root.
Solution:

  • Method 1: add the line System.setProperty("HADOOP_USER_NAME", "root"); to make the JVM act as the root user, which resolves the permission problem (see the sketch after this list).
  • Method 2: as shown in the screenshot in the original post (figure not reproduced here).
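For Method 1, the call has to run before any Hadoop Configuration or FileSystem object is created, otherwise the client has already picked up the Windows login name (here zzp28). A minimal placement sketch:

    // must run before any Hadoop class reads the user name
    System.setProperty("HADOOP_USER_NAME", "root");
    Configuration conf = new Configuration();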

Error 3:

2021-11-29 16:05:16,942 INFO [org.apache.hadoop.mapreduce.Job] - Job job_1638172852270_0002 failed with state FAILED due to: Application application_1638172852270_0002 failed 2 times due to AM Container for appattempt_1638172852270_0002_000002 exited with exitCode: 1
For more detailed output, check application tracking page: http://master:18088/proxy/application_1638172852270_0002/ Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1638172852270_0002_02_000001
Exit code: 1
Exception message: /bin/bash: line 0: fg: no job control
Stack trace: ExitCodeException exitCode=1: /bin/bash: line 0: fg: no job control
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.

Cause: this comes from the differences between Linux and Windows. A job submitted from Windows generates its container launch script with Windows conventions, for example "\" as the path separator where a Linux script expects "/", so /bin/bash on the cluster cannot execute it.
Solution:

  • Method 1: package the program as a jar and run it on the Linux cluster directly.
  • Method 2: enable cross-platform submission in the code (a fuller sketch follows the line below):
conf.set("mapreduce.app-submission.cross-platform", "true");
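In context, this flag sits next to the usual client-side cluster settings, before the Job is created. A minimal sketch; the addresses are illustrative assumptions, not taken from this post:

    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://master:9000");               // NameNode address (assumed)
    conf.set("mapreduce.framework.name", "yarn");                 // run on YARN rather than locally
    conf.set("yarn.resourcemanager.hostname", "master");          // ResourceManager host (assumed)
    conf.set("mapreduce.app-submission.cross-platform", "true");  // the fix for this error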

Error 4:

Cannot delete /tmp/hadoop-yarn/staging/root/.staging/job_1638172852270_0001. Name node is in safe mode. The reported blocks 15 has reached the threshold 0.9990 of total blocks 15. The number of live datanodes 3 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 6 seconds.

Cause: the NameNode is in safe mode; the fix is simply to leave safe mode.
Solution: on the master node, run the command that turns safe mode off:

[root@master hadoop-2.6.0]# hadoop dfsadmin -safemode leave
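For reference, hadoop dfsadmin is deprecated in Hadoop 2.x in favor of the hdfs entry point, and you can query the state before leaving it:

    [root@master hadoop-2.6.0]# hdfs dfsadmin -safemode get
    [root@master hadoop-2.6.0]# hdfs dfsadmin -safemode leave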

Error 5:

2021-11-29 16:58:54,478 INFO [org.apache.hadoop.mapreduce.Job] - Running job: job_1638172852270_0008
2021-11-29 16:58:58,585 INFO [org.apache.hadoop.mapreduce.Job] - Job job_1638172852270_0008 running in uber mode : false
2021-11-29 16:58:58,585 INFO [org.apache.hadoop.mapreduce.Job] - map 0% reduce 0%
2021-11-29 16:59:00,612 INFO [org.apache.hadoop.mapreduce.Job] - Task Id : attempt_1638172852270_0008_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class cn.educ.WordCountMapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:742)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)

Cause: the jar was never shipped to YARN. The driver used job.setJarByClass(JobSubmitter.class);, which can only locate a jar when the program itself is running from one; launched from Eclipse's class files, no jar is found, so the cluster cannot load cn.educ.WordCountMapper.

Solution: replace it with an explicit local jar path:

job.setJar("d:/data/WordCount.jar");

Then package the Maven project (the original post illustrated the Eclipse packaging steps with screenshots).

Note that the jar Maven builds does not end up in the directory you specified; it lands in the project's target directory. Copy it to the path that job.setJar() points at and rename it accordingly.
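From the command line, the equivalent steps might look like this; the artifact name is taken from the command in Error 7 below, and the destination matches job.setJar() above:

    mvn clean package
    copy target\WordCount-0.0.1-SNAPSHOT.jar d:\data\WordCount.jar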


So the final form of the call is job.setJar("d:/data/WordCount.jar");. Run the program again and it should now succeed.
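Putting the fixes from Errors 2, 3 and 5 together, a minimal driver sketch could look like the following. The reducer class name, the key/value types and the HDFS paths are illustrative assumptions (the paths reuse those from Error 7); only the three marked lines come from this post:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class JobSubmitter {
        public static void main(String[] args) throws Exception {
            System.setProperty("HADOOP_USER_NAME", "root");               // Error 2 fix
            Configuration conf = new Configuration();
            conf.set("mapreduce.app-submission.cross-platform", "true");  // Error 3 fix
            Job job = Job.getInstance(conf, "wordcount");
            job.setJar("d:/data/WordCount.jar");                          // Error 5 fix
            job.setMapperClass(cn.educ.WordCountMapper.class);
            job.setReducerClass(cn.educ.WordCountReducer.class);          // assumed class name
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.setInputPaths(job, new Path("/data/input"));
            FileOutputFormat.setOutputPath(job, new Path("/data/output"));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }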

Error 6:

2021-11-30 18:55:17,287 WARN [org.apache.hadoop.mapred.MapTask] - Unable to initialize MapOutputCollector org.apache.hadoop.mapred.MapTask$MapOutputBuffer
java.lang.ClassCastException: class com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$Text


Cause: a wrong import statement. The IDE auto-imported a Text class from the Jersey library instead of Hadoop's.

Solution: change the import to import org.apache.hadoop.io.Text;, rebuild the jar, and run it again.
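Concretely, the fix in the Mapper/Reducer source is a one-line change; the commented-out import is the one the ClassCastException above points at:

    // wrong (auto-imported by the IDE, causes the ClassCastException):
    // import com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider.Text;
    // right:
    import org.apache.hadoop.io.Text;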


Error 7:

[root@master ~]# hadoop jar WordCount-0.0.1-SNAPSHOT.jar cn.educ.JiQunJobSubmisson /data/input /data/output
21/12/03 14:09:19 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.128.161:18040
21/12/03 14:09:20 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
21/12/03 14:09:22 INFO input.FileInputFormat: Total input paths to process : 1
21/12/03 14:09:22 INFO mapreduce.JobSubmitter: number of splits:1
21/12/03 14:09:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1638507905647_0001
21/12/03 14:09:23 INFO impl.YarnClientImpl: Submitted application application_1638507905647_0001
21/12/03 14:09:23 INFO mapreduce.Job: The url to track the job: http://master:18088/proxy/application_1638507905647_0001/
21/12/03 14:09:23 INFO mapreduce.Job: Running job: job_1638507905647_0001
21/12/03 14:09:34 INFO mapreduce.Job: Job job_1638507905647_0001 running in uber mode : false
21/12/03 14:09:34 INFO mapreduce.Job: map 0% reduce 0%
21/12/03 14:09:45 INFO mapreduce.Job: Task Id : attempt_1638507905647_0001_m_000000_0, Status : FAILED
Error: java.io.IOException: Unable to initialize any output collector
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:412)
at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:695)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
21/12/03 14:09:59 INFO mapreduce.Job: Task Id : attempt_1638507905647_0001_m_000000_1, Status : FAILED
Error: java.io.IOException: Unable to initialize any output collector

When running the jar on the cluster itself, the error Error: java.io.IOException: Unable to initialize any output collector usually means the same wrong import as in Error 6: the Text class was imported from the wrong package. Fix the import, rebuild the jar, upload it to the Linux system and run it again. (Note: before re-running, make sure the job's HDFS output path has been deleted; if it still exists, the job fails again!)
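Deleting the old output directory from the run in Error 7, for example:

    [root@master ~]# hadoop fs -rm -r /data/output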
