How to add nodes to a Hadoop system

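How to add a node to Hadoop

The steps I actually followed to add a node:

1. Set up the environment on the new slave first: ssh, the JDK, and copies of the relevant config, lib, and bin directories.

2. Add the new datanode's host entry to the namenode and to the other datanodes in the cluster.

3. Add the new datanode's IP to conf/slaves on the master.

4. Restart the cluster and confirm that the new datanode appears in it.

5. Run bin/start-balancer.sh. This can take a long time.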

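Notes:

1. If you do not rebalance, the cluster will store new data mainly on the new node, which lowers MapReduce efficiency.

2. bin/start-balancer.sh can be run as is, or with a parameter such as -threshold 5.

threshold is the balancing threshold, given as a percentage. The default is 10%. A lower value leaves the nodes more evenly balanced but takes longer.

3. The balancer can also run on a cluster that has MapReduce jobs running. The default dfs.balance.bandwidthPerSec is low, 1 MB/s. When no MapReduce jobs are running, you can raise this setting to shorten the balancing time.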
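To make the threshold and bandwidth notes concrete, here is a minimal Python sketch. It assumes the usual reading of the threshold: a node counts as balanced when its disk utilization is within threshold percentage points of the cluster-wide utilization. The node names and sizes are hypothetical, and the time estimate ignores replication traffic and parallel transfers.

```python
def utilization(used_bytes, capacity_bytes):
    """Return disk utilization of a node as a percentage."""
    return 100.0 * used_bytes / capacity_bytes


def nodes_outside_threshold(nodes, threshold=10.0):
    """Return the sorted names of nodes whose utilization differs from the
    cluster-wide utilization by more than `threshold` percentage points.

    `nodes` maps a node name to a (used_bytes, capacity_bytes) tuple.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(cap for _, cap in nodes.values())
    cluster_util = utilization(total_used, total_capacity)
    return sorted(
        name
        for name, (used, cap) in nodes.items()
        if abs(utilization(used, cap) - cluster_util) > threshold
    )


def hours_to_move(bytes_to_move, bandwidth_bytes_per_sec=1024 * 1024):
    """Rough time for one datanode to move data at the given bandwidth."""
    return bytes_to_move / bandwidth_bytes_per_sec / 3600.0


GB = 1024 ** 3
# Hypothetical cluster: three old nodes at 80% and one new, empty node.
cluster = {
    "dn1": (800 * GB, 1000 * GB),
    "dn2": (800 * GB, 1000 * GB),
    "dn3": (800 * GB, 1000 * GB),
    "dn4-new": (0, 1000 * GB),
}

print(nodes_outside_threshold(cluster, threshold=10.0))
print(round(hours_to_move(100 * GB), 1))
```

In this made-up cluster the overall utilization is 60%, so every node is more than 10 points away from it and the balancer has work to do on all of them. At the 1 MB/s default, moving 100 GB from a single node takes more than a day, which is why raising the bandwidth on an idle cluster helps.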

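Other notes:

1. Make sure the firewall on the slave is turned off.

2. Make sure the new slave's IP has been added to /etc/hosts on the master and on the other slaves, and conversely that the IPs of the master and the other slaves have been added to /etc/hosts on the new slave.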

Number of mappers and reducers

URL: http://wiki.apache.org/hadoop/HowManyMapsAndReduces

HowManyMapsAndReduces

Partitioning your job into maps and reduces

Picking the appropriate size for the tasks for your job can radically change the performance of Hadoop. Increasing the number of tasks increases the framework overhead, but increases load balancing and lowers the cost of failures. At one extreme is the 1 map/1 reduce case where nothing is distributed. The other extreme is to have 1,000,000 maps/ 1,000,000 reduces where the framework runs out of resources for the overhead.

Number of Maps

The number of maps is usually driven by the number of DFS blocks in the input files, which leads people to adjust their DFS block size in order to adjust the number of maps. The right level of parallelism for maps seems to be around 10-100 maps/node, although we have taken it up to 300 or so for very cpu-light map tasks. Task setup takes a while, so it is best if the maps take at least a minute to execute.

Actually controlling the number of maps is subtle. The mapred.map.tasks parameter is just a hint to the InputFormat for the number of maps. The default InputFormat behavior is to split the total number of bytes into the right number of fragments. However, in the default case the DFS block size of the input files is treated as an upper bound for input splits. A lower bound on the split size can be set via mapred.min.split.size. Thus, if you expect 10TB of input data and have 128MB DFS blocks, you'll end up with 82k maps, unless your mapred.map.tasks is even larger. Ultimately the InputFormat determines the number of maps.

The number of map tasks can also be increased manually using the JobConf's conf.setNumMapTasks(int num). This can be used to increase the number of map tasks, but will not set the number below that which Hadoop determines via splitting the input data.
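The split arithmetic described above can be sketched in Python. This is a simplified model, not Hadoop code: it assumes the classic FileInputFormat rule splitSize = max(minSize, min(goalSize, blockSize)), treats the whole input as one large file, and ignores per-file boundaries and the small slack Hadoop allows on the last split. The function names are made up for illustration.

```python
def compute_split_size(goal_size, min_size, block_size):
    """Block size caps the split; the minimum split size is a floor."""
    return max(min_size, min(goal_size, block_size))


def estimate_num_maps(total_bytes, block_size, map_tasks_hint=1, min_split_size=1):
    """Estimate the number of map tasks for `total_bytes` of input.

    `map_tasks_hint` plays the role of mapred.map.tasks and
    `min_split_size` the role of mapred.min.split.size.
    """
    goal_size = total_bytes // max(map_tasks_hint, 1)
    split_size = compute_split_size(goal_size, min_split_size, block_size)
    # Integer ceiling division: a partial last split still needs a map.
    return (total_bytes + split_size - 1) // split_size


MB = 1024 ** 2
TB = 1024 ** 4

# 10 TB of input with 128 MB blocks, as in the text above.
print(estimate_num_maps(10 * TB, 128 * MB))
# Raising the minimum split size to 256 MB halves the number of maps.
print(estimate_num_maps(10 * TB, 128 * MB, min_split_size=256 * MB))
# A hint larger than the block count makes splits smaller than a block.
print(estimate_num_maps(10 * TB, 128 * MB, map_tasks_hint=200000))
```

The first call gives 81920, which is the "82k maps" figure quoted above. The other two calls show the two knobs: a larger minimum split size lowers the count, and a larger hint raises it.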

Number of Reduces

The right number of reduces seems to be 0.95 or 1.75 * (nodes * mapred.tasktracker.tasks.maximum). At 0.95 all of the reduces can launch immediately and start transferring map outputs as the maps finish. At 1.75 the faster nodes will finish their first round of reduces and launch a second round of reduces doing a much better job of load balancing.

Currently the number of reduces is limited to roughly 1000 by the buffer size for the output files (io.buffer.size * 2 * numReduces << heapSize). This will be fixed at some point, but until it is it provides a pretty firm upper bound.

The number of reduces also controls the number of output files in the output directory, but usually that is not important because the next map/reduce step will split them into even smaller splits for the maps.

The number of reduce tasks can also be increased in the same way as the map tasks, via JobConf's conf.setNumReduceTasks(int num).

My own understanding:

The number of mappers depends on the input files and on the file splits. The upper bound of a file split is dfs.block.size, the lower bound can be set through mapred.min.split.size, and in the end the InputFormat decides.

A useful recommendation:

The right number of reduces seems to be 0.95 or 1.75 multiplied by (<no. of nodes> * mapred.tasktracker.reduce.tasks.maximum). Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures.

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
  <description>The maximum number of reduce tasks that will be run
  simultaneously by a task tracker.
  </description>
</property>
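As a quick worked example of the 0.95 / 1.75 rule, the following Python sketch computes both suggestions. The function name is hypothetical, and rounding down to a whole number of reduces is my own choice; the rule itself does not say how to round.

```python
def suggested_reduces(nodes, reduce_slots_per_node):
    """Return (single_wave, two_waves) reduce counts.

    `reduce_slots_per_node` stands for
    mapred.tasktracker.reduce.tasks.maximum.
    Integer arithmetic avoids floating point rounding surprises.
    """
    slots = nodes * reduce_slots_per_node
    single_wave = (95 * slots) // 100   # 0.95 * slots: all reduces start at once
    two_waves = (175 * slots) // 100    # 1.75 * slots: fast nodes run a second round
    return single_wave, two_waves


# Hypothetical cluster: 10 nodes with the setting of 2 shown above.
print(suggested_reduces(10, 2))
```

For 10 nodes with 2 reduce slots each this gives 19 and 35 reduces. The resulting number would then be passed to conf.setNumReduceTasks(int num).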

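Adding a disk to a single node

1. On the node that gets the new disk, edit dfs.data.dir and list the old and new directories separated by commas.

2. Restart DFS.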
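The new dfs.data.dir value is just the old list with the new directory appended. A small Python sketch, with hypothetical mount points, shows the expected form of the value:

```python
def add_data_dir(current_value, new_dir):
    """Append `new_dir` to a comma-separated dfs.data.dir value.

    Existing entries keep their order, and a directory that is already
    listed is not added twice.
    """
    dirs = [d.strip() for d in current_value.split(",") if d.strip()]
    if new_dir not in dirs:
        dirs.append(new_dir)
    return ",".join(dirs)


# Hypothetical paths: /data1 is the old disk, /data2 the new one.
print(add_data_dir("/data1/dfs/data", "/data2/dfs/data"))
```

The printed string is what would go into the value element of the dfs.data.dir property on that node.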

Synchronizing Hadoop code

hadoop-env.sh

# host:path where hadoop code should be rsync'd from. Unset by default.

# export HADOOP_MASTER=master:/home/$USER/src/hadoop

Merging small HDFS files with a command

hadoop fs -getmerge <src> <dest>

How to restart a reduce job

Introduced recovery of jobs when JobTracker restarts. This facility is off by default.

Introduced config parameters "mapred.jobtracker.restart.recover", "mapred.jobtracker.job.history.block.size", and "mapred.jobtracker.job.history.buffer.size".

I have not verified this yet.

Problems with IO write operations

0-1246359584298, infoPort=50075, ipcPort=50020):Got exception while serving blk_-5911099437886836280_1292 to /172.16.100.165:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.16.100.165:50010 remote=/172.16.100.165:50930]
    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:185)
    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:293)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:387)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:179)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:94)
    at java.lang.Thread.run(Thread.java:619)

It seems there are many reasons that it can time out; the example given in HADOOP-3831 is a slow reading client.

Workaround: try setting dfs.datanode.socket.write.timeout=0 in hadoop-site.xml.

My understanding is that this issue should be fixed in Hadoop 0.19.1 so that we should leave the standard timeout. However until then this can help resolve issues like the one you're seeing.
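For reference, the workaround corresponds to one property entry in hadoop-site.xml. The sketch below uses Python's standard xml.etree.ElementTree to render such an entry; the helper name property_xml is made up, and the output is a fragment to paste inside the configuration element, not a complete file.

```python
import xml.etree.ElementTree as ET


def property_xml(name, value, description=None):
    """Render one Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    if description:
        ET.SubElement(prop, "description").text = description
    return ET.tostring(prop, encoding="unicode")


# The setting suggested above. The 480000 ms in the log is the timeout
# that was in effect when the exception was raised.
fragment = property_xml("dfs.datanode.socket.write.timeout", 0)
print(fragment)
```

Since the quoted advice is to return to the standard timeout once the fix is available, it is worth leaving a comment next to the property so that the override is easy to find and remove later.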

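How to decommission HDFS nodes

The dfsadmin help text in the current version does not explain this clearly (a bug has already been filed). The correct procedure is:

1. Set dfs.hosts to the current slaves file, using the full path as the file name. Note that the host names in the list must be the full names, that is, what uname -n returns.

2. Put the full names of the slaves to be decommissioned in another file, for example slaves.ex, and point the dfs.hosts.exclude parameter at the full path of that file.

3. Run the command bin/hadoop dfsadmin -refreshNodes

4. The web UI or bin/hadoop dfsadmin -report shows the state of these nodes as Decommission in progress until the data that needs to be re-replicated has been copied.

5. When it has finished, remove the decommissioned nodes from slaves (that is, the file dfs.hosts points to).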

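Incidentally, the -refreshNodes command has three other uses:

2. Adding permitted nodes to the list (add the host names to dfs.hosts).

3. Removing nodes directly, without re-replicating their data (remove the host names from dfs.hosts).

4. The reverse of decommissioning: stopping the decommission of nodes that are listed in both exclude and dfs.hosts and are still being decommissioned, which turns Decommission in progress nodes back to Normal (called in service in the web UI).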
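The interplay of the two files can be summarized in a short Python sketch. It models only the rules described above, under the assumption that an empty include list means every host is permitted; it is not the namenode's actual logic, and the host names are hypothetical.

```python
def node_state(host, include, exclude):
    """State a datanode is expected to move toward after -refreshNodes.

    `include` models the file named by dfs.hosts and `exclude` the file
    named by dfs.hosts.exclude.
    """
    if include and host not in include:
        return "not allowed"    # use 3: removed directly, no re-replication
    if host in exclude:
        return "decommission"   # listed in both files: decommissioned safely
    return "in service"         # use 2, or use 4 once it leaves exclude


def plan_decommission(slaves, retiring):
    """Return (exclude_list, slaves_after_completion) for steps 2 and 5."""
    retiring_set = set(retiring)
    unknown = sorted(retiring_set - set(slaves))
    if unknown:
        raise ValueError("not in slaves: " + ", ".join(unknown))
    exclude = [h for h in slaves if h in retiring_set]
    remaining = [h for h in slaves if h not in retiring_set]
    return exclude, remaining


slaves = ["dn1.example.com", "dn2.example.com", "dn3.example.com"]
exclude, remaining = plan_decommission(slaves, ["dn3.example.com"])
for host in slaves:
    print(host, node_state(host, slaves, exclude))
print(remaining)
```

The exclude list is what goes into slaves.ex in step 2, and remaining is the content of the slaves file after step 5.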

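A customer asked to reclaim a batch of servers from a Hadoop cluster. Fortunately neither the namenode nor the resourcemanager was installed on these servers, but all three journalnodes were. So for now we add three new journalnodes, and take the original journalnodes offline when the servers are reclaimed. This is a record of the procedure.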

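1. Edit the hdfs-site.xml configuration file

Original configuration:

Changed to:

2. Distribute hdfs-site.xml to every node

3. scp the edits files from an original journalnode to the new journalnodes

Get the edits storage path from the dfs.journalnode.edits.dir setting in hdfs-site.xml and scp it to the same path on the new nodes. Permissions and ownership must be the same. scp -rp can be used for the copy (-p preserves modes and timestamps, so check the owner afterwards).

4. Start the journalnode process on the new journalnodes

Check with jps that it started. If it failed, look at the journalnode logs under $HADOOP_HOME/logs. Normally there should be no problem.

5. Restart the namenode on the standby node (nn2)

6. Switch the standby node to active

7. Restart the namenode on the standby node (nn1)

Same procedure as step 5. When it is done, NameNode Journal Status in the web UI should show that the journalnodes have been expanded.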
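The before and after values for step 1 are missing from the text above, so the following is only an assumption about what was changed. In a typical Quorum Journal Manager setup, the journalnode list lives in dfs.namenode.shared.edits.dir as a qjournal:// URI with host:port pairs separated by semicolons. The Python sketch below builds such a value; the host names and the journal id mycluster are hypothetical, and 8485 is the usual journalnode RPC port.

```python
def qjournal_uri(hosts, journal_id, port=8485):
    """Build a qjournal:// URI listing every journalnode."""
    authority = ";".join("%s:%d" % (host, port) for host in hosts)
    return "qjournal://%s/%s" % (authority, journal_id)


def write_quorum(journalnode_count):
    """Number of journalnodes that must acknowledge each edit (a majority)."""
    return journalnode_count // 2 + 1


# Hypothetical hosts: jn1..jn3 are the old nodes, jn4..jn6 the new ones.
old_nodes = ["jn1", "jn2", "jn3"]
new_nodes = ["jn4", "jn5", "jn6"]

print(qjournal_uri(old_nodes, "mycluster"))
print(qjournal_uri(old_nodes + new_nodes, "mycluster"))
print(write_quorum(len(old_nodes + new_nodes)))
```

With six journalnodes a majority is four, so while both sets are listed at least one node in each group of three has to be reachable for the namenode to write edits.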


Sharing is welcome. If you repost, please credit the source: 内存溢出

Original URL: http://outofmemory.cn/bake/11473103.html
