Introduction to HA
To eliminate the single points of failure of the NameNode and ResourceManager in a Hadoop cluster, Hadoop 2.x introduced the high-availability cluster, i.e. Hadoop cluster HA mode.
Source document: https://gitee.com/TianHanXiaoXiaoSheng/everyday_learn/blob/master/hadoop-wc/Hadoop%E4%B9%8BHA%E6%90%AD%E5%BB%BA.md
HA mode setup
Basic environment preparation
Prepare three CentOS 6.x machines: han-101, han-102, han-103
Install JDK 1.8
Assign static IP addresses
Set up passwordless SSH login between the nodes
Synchronize the clocks of the three nodes (a sketch of the last two steps follows this list)
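The following is a minimal sketch of the passwordless-SSH and time-sync steps. It assumes the grid user that appears in the scp commands later in this document and a reachable public NTP pool (pool.ntp.org, an assumption); adapt both to your environment.
# On each node, as the grid user: generate a key pair and push it to all three hosts
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
$ for host in han-101 han-102 han-103; do ssh-copy-id grid@$host; done
# Rough one-shot time sync on each node (assumes the ntpdate package is installed)
$ ntpdate pool.ntp.org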
HA cluster node plan
Role     han-101   han-102   han-103
ZK       Y         Y         Y
KAFKA    Y         Y         Y
JN       Y         Y         Y
ZKFC     Y         Y         -
NN       Y         Y         -
DN       Y         Y         Y
RM       -         Y         Y
NM       Y         Y         Y
HS       -         -         Y
Setting up the ZK and Kafka clusters is not covered here.
HA configuration
Download hadoop-2.7.2.tar.gz; it is used as the example below.
Create the installation directory and extract the tarball:
$ mkdir -p /opt/module
$ tar -zxvf hadoop-2.7.2.tar.gz -C /opt/module/
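The later commands call hadoop-daemon.sh, start-dfs.sh and similar scripts without full paths, so they assume Hadoop's bin and sbin directories are on PATH. A minimal sketch of that setup, assuming (as the paths in the configuration below suggest) the extracted directory is renamed to /opt/module/hadoop; run as root on every node:
# Rename the extracted directory to match /opt/module/hadoop used in the configs (assumption)
$ mv /opt/module/hadoop-2.7.2 /opt/module/hadoop
# Put Hadoop's bin and sbin on PATH for all users
$ cat >> /etc/profile <<'EOF'
export HADOOP_HOME=/opt/module/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
$ source /etc/profile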
Edit the env scripts
hadoop-env.sh, mapred-env.sh, yarn-env.sh
In each script, add or update the JAVA environment variable: export JAVA_HOME=/opt/module/jdk1.8.0_281
Adjust the value to the actual path of the JDK installed on your machines.
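For example, the export line can be appended to all three scripts in one pass (a sketch, assuming Hadoop lives under /opt/module/hadoop):
$ cd /opt/module/hadoop/etc/hadoop
# Append the JAVA_HOME export to each env script
$ for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do echo 'export JAVA_HOME=/opt/module/jdk1.8.0_281' >> $f; done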
Edit the XML configuration files
core-site.xml:
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
  <property><name>hadoop.tmp.dir</name><value>/opt/module/hadoop/data</value></property>
  <property><name>ha.zookeeper.quorum</name><value>han-101:2181,han-102:2181,han-103:2181</value></property>
</configuration>
hdfs-site.xml:
<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>ipc.client.connect.max.retries</name><value>100</value></property>
  <property><name>ipc.client.connect.retry.interval</name><value>1000</value></property>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>han-101:8020</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>han-102:8020</value></property>
  <property><name>dfs.namenode.http-address.mycluster.nn1</name><value>han-101:50070</value></property>
  <property><name>dfs.namenode.http-address.mycluster.nn2</name><value>han-102:50070</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://han-101:8485;han-102:8485;han-103:8485/mycluster</value></property>
  <property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/grid/.ssh/id_rsa</value></property>
  <property><name>dfs.journalnode.edits.dir</name><value>/opt/module/hadoop/data/jn</value></property>
  <property><name>dfs.permissions.enabled</name><value>false</value></property>
  <property><name>dfs.client.failover.proxy.provider.mycluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
</configuration>
yarn-site.xml:
<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
  <property><name>yarn.log.server.url</name><value>http://han-103:19888/jobhistory/logs/</value></property>
  <property><name>yarn.log-aggregation.retain-seconds</name><value>86400</value></property>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>cluster-yarn1</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>han-102</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>han-103</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>han-101:2181,han-102:2181,han-103:2181</value></property>
  <property><name>yarn.resourcemanager.recovery.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.store.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value></property>
</configuration>
mapred-site.xml:
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.jobhistory.address</name><value>han-103:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>han-103:19888</value></property>
</configuration>
Edit the slaves file and add the DN hostnames as follows:
han-101
han-102
han-103
Distribute the installation files across the cluster
scp -rp ./hadoop grid@han-102:`pwd`
scp -rp ./hadoop grid@han-103:`pwd`
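Both copies can also be written as one loop. This is a sketch and assumes it is run from /opt/module, so that `pwd` resolves to the same target path on the remote hosts:
$ cd /opt/module
# Copy the Hadoop directory to the other two nodes, preserving timestamps and modes
$ for host in han-102 han-103; do scp -rp ./hadoop grid@$host:`pwd`; done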
Start the ZK cluster
Start on a single node:
$ cd /opt/module/zookeeper/bin
$ ./zkServer.sh start
Cluster control script:
#!/bin/bash
case $1 in
"start"){
    for i in han-101 han-102 han-103
    do
        ssh $i "source /etc/profile; /opt/module/zookeeper/bin/zkServer.sh start"
    done
};;
"stop"){
    for i in han-101 han-102 han-103
    do
        ssh $i "source /etc/profile; /opt/module/zookeeper/bin/zkServer.sh stop"
    done
};;
esac
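Saved for example as zk.sh (a name chosen here for illustration), the script is used like this:
$ chmod +x zk.sh
$ ./zk.sh start
$ ./zk.sh stop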
Start all JournalNodes
Run on every JN node:
$ hadoop-daemon.sh start journalnode
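hadoop-daemon.sh only starts a daemon on the local machine, so the command must be run on each JN node. A sketch of driving all three over SSH from one node (assuming the PATH setup sketched earlier):
$ for i in han-101 han-102 han-103; do ssh $i "source /etc/profile; hadoop-daemon.sh start journalnode"; done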
Format the NN on han-101
Pick one NN node and run from the Hadoop bin directory:
$ hdfs namenode -format
Start the NameNode that was just formatted:
$ hadoop-daemon.sh start namenode
On the other NN node, synchronize the metadata from the formatted NameNode:
$ hdfs namenode -bootstrapStandby
$ sbin/hadoop-daemon.sh start namenode
Format ZKFC (initialize the HA state in ZooKeeper)
$ hdfs zkfc -formatZK
Start HDFS
$ start-dfs.sh
Set an NN to active (with automatic failover enabled, the ZKFCs normally elect an active NameNode on their own; forcing a manual transition requires the --forcemanual flag)
$ bin/hdfs haadmin -transitionToActive nn1
Check the status:
$ bin/hdfs haadmin -getServiceState nn1
Start YARN
$ sbin/start-yarn.sh
start-yarn.sh only starts the ResourceManager on the node where it is executed, so start the other RM manually:
$ sbin/yarn-daemon.sh start resourcemanager
Check the RM status:
$ bin/yarn rmadmin -getServiceState rm1
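To double-check the whole HA setup, it can help to query both NNs and both RMs and to list the running Java daemons on every node; one NN and one RM should report active and the other standby. A sketch, reusing the commands above:
$ bin/hdfs haadmin -getServiceState nn1
$ bin/hdfs haadmin -getServiceState nn2
$ bin/yarn rmadmin -getServiceState rm1
$ bin/yarn rmadmin -getServiceState rm2
# List the daemons running on each node to confirm the plan in the role table
$ for i in han-101 han-102 han-103; do ssh $i "source /etc/profile; jps"; done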