Hadoop HA Deployment



Table of Contents

Hadoop HA Installation
    Host Base Environment Configuration
    Installing the Java Environment
    Installing ZooKeeper
    Installing Hadoop
        Editing the Configuration
        Starting the Cluster
        Web UI Access
        Failover Test
Other Questions You May Have
    1. Why are there so many edits log files
    2. Problems with active/standby failover in HA mode
    3. Alibaba Cloud EMR cluster configuration

Hadoop HA Installation

Notes on this approach

In hdfs-site.xml, the dfs.ha.fencing.methods parameter is set to shell rather than sshfence, because sshfence cannot fail over when the host running the active NameNode goes down (the whole machine, not just the service). Most Hadoop HA articles you find online, however, use sshfence.

Official reference: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

From the official documentation: The sshfence option SSHes to the target node and uses fuser to kill the process listening on the service’s TCP port. In order for this fencing option to work, it must be able to SSH to the target node without providing a passphrase. Thus, one must also configure the dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files. For example:

When the active NameNode becomes unreachable, the standby SSHes to the host of the previously active NameNode and kills the NameNode process once more.

The problem is that if the host of the previously active NameNode is down, the standby cannot SSH to it and therefore cannot take over. This kind of HA only protects against the NameNode process dying, not against the whole node dying. If the active node's host goes down, the standby's ZKFC log will show java.net.NoRouteToHostException: No route to host (Host unreachable).
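A quick way to confirm this failure mode is to grep the ZKFC log on the standby master after the active host goes down. The path below is only a sketch: it follows Hadoop's usual hadoop-<user>-zkfc-<hostname>.log naming and the install path used later in this article, so adjust it to your environment.

# run on the standby master node; adjust user, hostname and install path
grep -i "NoRouteToHostException" /opt/hadoop-2.10.1/logs/hadoop-root-zkfc-emr-header-02.log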

Note 2: To prevent split brain, Hadoop HA also relies on a quorum (majority) mechanism: with N JournalNodes the system tolerates at most (N - 1) / 2 failures. You therefore need at least three JournalNode daemons, which here means three master nodes (one active and two standby).

From the official documentation:

JournalNode machines - the machines on which you run the JournalNodes. The JournalNode daemon is relatively lightweight, so these daemons may reasonably be collocated on machines with other Hadoop daemons, for example NameNodes, the JobTracker, or the YARN ResourceManager. Note: There must be at least 3 JournalNode daemons, since edit log modifications must be written to a majority of JNs. This will allow the system to tolerate the failure of a single machine. You may also run more than 3 JournalNodes, but in order to actually increase the number of failures the system can tolerate, you should run an odd number of JNs, (i.e. 3, 5, 7, etc.). Note that when running with N JournalNodes, the system can tolerate at most (N - 1) / 2 failures and continue to function normally.

The files below list every setting in each configuration file; many of them are not HA-specific (for example, the resources each node may request).
For accuracy, many explanations are pasted verbatim from the official documentation; you can translate them with https://fanyi.youdao.com/ if needed.

Host Base Environment Configuration

Disable the firewall and SELinux, and make sure the nodes can reach each other.

    Set the hostnames (since the number of machines is small, many components share a node; because of the HA requirement, three master nodes are prepared).

    IP address     Hostname         Description
    20.88.10.31    emr-header-01    master node, ZooKeeper node
    20.88.10.32    emr-header-02    standby master node, ZooKeeper node
    20.88.10.33    emr-worker-01    standby master node, ZooKeeper node, worker node
    20.88.10.34    emr-worker-02    worker node

    Edit the hosts file and set the hostname

    hostnamectl set-hostname emr-header-01;bash
    

    Append the following to the hosts file (identical on every node)

    This step is mandatory.

    The Hadoop configuration refers to master and worker nodes by hostname; without these hosts entries, hostname resolution will fail.

    20.88.10.31 emr-header-01
    20.88.10.32 emr-header-02
    20.88.10.33 emr-worker-01
    20.88.10.34 emr-worker-02
    

    Set up passwordless SSH between the nodes (run on emr-header-01)

    Hadoop's start scripts SSH to the other nodes; without passwordless keys you will be prompted for a password.

    ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
    chmod 0600 ~/.ssh/authorized_keys
    

    Distribute the key to every host (a quick connectivity check follows these commands)

    sshpass -psinobase@123  ssh-copy-id -i /root/.ssh/id_dsa.pub "-o StrictHostKeyChecking=no"  root@emr-header-01
    sshpass -psinobase@123  ssh-copy-id -i /root/.ssh/id_dsa.pub "-o StrictHostKeyChecking=no"  root@emr-header-02
    sshpass -psinobase@123  ssh-copy-id -i /root/.ssh/id_dsa.pub "-o StrictHostKeyChecking=no"  root@emr-worker-01
    sshpass -psinobase@123  ssh-copy-id -i /root/.ssh/id_dsa.pub "-o StrictHostKeyChecking=no"  root@emr-worker-02
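    As a sanity check (not part of the original steps), the loop below confirms from emr-header-01 that every hostname resolves and that SSH no longer asks for a password:

    for h in emr-header-01 emr-header-02 emr-worker-01 emr-worker-02; do
        # each iteration should print the remote hostname without a password prompt
        ssh -o StrictHostKeyChecking=no "$h" hostname
    done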
    
Installing the Java Environment

The OS is CentOS 7, operating as root (non-root also works); all packages are installed under /opt.

Install on every node.

Download the JDK that matches the required version (download steps omitted). Java downloads: https://www.oracle.com/java/technologies/downloads/

tar xf jdk-8u181-linux-x64.tar.gz -C /opt/
echo 'export JAVA_HOME=/opt/jdk1.8.0_181/
export PATH=${PATH}:${JAVA_HOME}/bin' >>/etc/profile
source /etc/profile
java -version
Installing ZooKeeper

ZooKeeper is installed on three nodes: emr-header-01, emr-header-02 and emr-worker-01.

Official ZooKeeper downloads: https://zookeeper.apache.org/releases.html

On emr-header-01:

# download and upload the tarball (omitted here)
tar xf zookeeper-3.4.13.tar.gz -C /opt/

Edit the configuration

cd /opt/zookeeper-3.4.13/conf/
vim zoo.cfg

Full zoo.cfg contents:

minSessionTimeout=16000
maxSessionTimeout=300000
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/datalog
clientPort=2181
server.1=emr-header-01:2888:3888
server.2=emr-header-02:2888:3888
server.3=emr-worker-01:2888:3888

Add the environment variables

echo 'ZK_HOME=/opt/zookeeper-3.4.13' >>/etc/profile
echo 'PATH=$PATH:${ZK_HOME}/bin/' >>/etc/profile
source /etc/profile

Distribute to the other nodes

scp -r /opt/* emr-header-02:/opt/
scp -r /opt/* emr-worker-01:/opt/
scp -r /opt/* emr-worker-02:/opt/
scp /etc/profile emr-header-02:/etc/
scp /etc/profile emr-worker-01:/etc/
scp /etc/hosts emr-header-02:/etc/
scp /etc/hosts emr-worker-01:/etc/
scp /etc/hosts emr-worker-02:/etc/

Run on all nodes

mkdir -p /data/zookeeper/data
cd /data/zookeeper/data

Note: the command differs on each node.

The number in each myid file must be globally unique and cannot be chosen arbitrarily.

For example, the 1 on emr-header-01 corresponds to the 1 in server.1=emr-header-01:2888:3888 in the configuration above.

# run on emr-header-01
echo "1" >myid
# run on emr-header-02
echo "2" >myid
# run on emr-worker-01
echo "3" >myid

Start ZooKeeper (run on the three ZooKeeper nodes)

source /etc/profile;zkServer.sh start

Check the status

zkServer.sh status

After running this on the three nodes, success means two of them report Mode: follower and one reports Mode: leader.
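If the status output looks wrong, ZooKeeper's four-letter-word commands offer another quick health check (this sketch assumes nc is installed; the commands are enabled by default in ZooKeeper 3.4):

for h in emr-header-01 emr-header-02 emr-worker-01; do
    echo -n "$h: "
    echo ruok | nc "$h" 2181    # a healthy server replies imok
    echo
done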

Installing Hadoop

Official archive: https://archive.apache.org/dist/hadoop/common/

Here the Tsinghua mirror is used: https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz

Run on emr-header-01

curl -O https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz
tar xf hadoop-2.10.1.tar.gz -C /opt/
cd /opt/hadoop-2.10.1/etc/hadoop/

Check that the fuser command exists; if not, install it (all nodes)

The sshfence method needs this command: to prevent split brain it SSHes to the NameNode host that went down and kills the NameNode process again, and it uses fuser to do the killing. Without fuser, sshfence cannot fail over even when you stop the NameNode on the master with hadoop-daemon.sh stop namenode, let alone when the whole master host goes down.

yum install -y psmisc
Editing the Configuration

If your hostnames or name resolution differ, adjust the corresponding entries in the files.

Note the yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores parameters in yarn-site.xml: they set the maximum memory and number of CPU cores that YARN may schedule on the node; here memory is set to half of the host's RAM.
Besides the HA-related settings, the configuration also contains others, such as enabling log aggregation and capping the resources available per node. You can add them individually or simply overwrite your existing configuration; tested and working with Hadoop 2.10.1.

Insert the following as the first line of hadoop-env.sh, or alternatively edit the JAVA_HOME variable in that file (a sketch of the latter follows the snippet below):

source /etc/profile
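If you prefer to hard-code JAVA_HOME instead of sourcing /etc/profile, a sed one-liner such as the following also works (the JDK path is the one installed earlier in this article; adjust it if yours differs):

# rewrite the JAVA_HOME line in hadoop-env.sh
sed -i 's#^export JAVA_HOME=.*#export JAVA_HOME=/opt/jdk1.8.0_181#' /opt/hadoop-2.10.1/etc/hadoop/hadoop-env.sh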

Contents of the slaves file (delete the existing localhost entry)

# list the hostnames of all worker nodes in this file
emr-worker-01
emr-worker-02

core-site.xml contents; make sure property names are not duplicated in the file

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
	<!-- Non-HA defaults, kept for reference only; the HA versions of
	     fs.defaultFS and hadoop.tmp.dir further below are the ones to use. -->
	<!--
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://emr-header-01:9000</value>
		<description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/data/hadoop/tmp</value>
		<description>A base for other temporary directories.</description>
	</property>
	-->
	<!-- Default file system: the HA nameservice -->
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://mycluster</value>
		<description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
	</property>
	<!-- Base directory for temporary files -->
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/data/tmp</value>
		<description>A base for other temporary directories.</description>
	</property>
	<!-- ZooKeeper quorum used by the ZKFailoverController -->
	<property>
		<name>ha.zookeeper.quorum</name>
		<value>emr-header-01:2181,emr-header-02:2181,emr-worker-01:2181</value>
		<description>A list of ZooKeeper server addresses, separated by commas, that are to be used by the ZKFailoverController in automatic failover.</description>
	</property>
</configuration>

yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<configuration>
	<!-- Auxiliary service needed for MapReduce shuffle -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
		<description>A comma separated list of services where service name should only contain a-zA-Z0-9_ and can not start with numbers</description>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname</name>
		<value>emr-header-01</value>
		<description>The hostname of the RM.</description>
	</property>
	<!-- Maximum resources that YARN may schedule on this node -->
	<property>
		<name>yarn.nodemanager.resource.cpu-vcores</name>
		<value>2</value>
		<description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of CPUs used by YARN containers. If it is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically determined from the hardware in case of Windows and Linux. In other cases, number of vcores is 8 by default.</description>
	</property>
	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>4096</value>
		<description>Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated(in case of Windows and Linux). In other cases, the default is 8192MB.</description>
	</property>
	<property>
		<name>yarn.scheduler.maximum-allocation-vcores</name>
		<value>4</value>
		<description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an InvalidResourceRequestException.</description>
	</property>
	<property>
		<name>yarn.scheduler.maximum-allocation-mb</name>
		<value>3072</value>
		<description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.</description>
	</property>
	<property>
		<name>yarn.scheduler.minimum-allocation-vcores</name>
		<value>1</value>
		<description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.</description>
	</property>
	<property>
		<name>yarn.scheduler.minimum-allocation-mb</name>
		<value>1024</value>
		<description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.</description>
	</property>
	<!-- Log aggregation -->
	<property>
		<name>yarn.log-aggregation-enable</name>
		<value>true</value>
		<description>Whether to enable log aggregation. Log aggregation collects each container's logs and moves these logs onto a file-system, for e.g. HDFS, after the application completes. Users can configure the "yarn.nodemanager.remote-app-log-dir" and "yarn.nodemanager.remote-app-log-dir-suffix" properties to determine where these logs are moved to. Users can access the logs via the Application Timeline Server.</description>
	</property>
	<property>
		<name>yarn.log-aggregation.retain-seconds</name>
		<value>604800</value>
		<description>How long to keep aggregation logs before deleting them. -1 disables. Be careful set this too small and you will spam the name node.</description>
	</property>
	<!-- ResourceManager HA -->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
		<description>Enable RM high-availability. When enabled, (1) The RM starts in the Standby mode by default, and transitions to the Active mode when prompted to. (2) The nodes in the RM ensemble are listed in yarn.resourcemanager.ha.rm-ids (3) The id of each RM either comes from yarn.resourcemanager.ha.id if yarn.resourcemanager.ha.id is explicitly specified or can be figured out by matching yarn.resourcemanager.address.{id} with local address (4) The actual physical addresses come from the configs of the pattern - {rpc-config}.{id}</description>
	</property>
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>cluster-yarn</value>
		<description>Name of the cluster. In a HA setting, this is used to ensure the RM participates in leader election for this cluster and ensures it does not affect other clusters</description>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
		<description>The list of RM nodes in the cluster when HA is enabled. See description of yarn.resourcemanager.ha.enabled for full details on how this is used.</description>
	</property>
	<!-- rm1: emr-header-01 -->
	<property>
		<name>yarn.resourcemanager.hostname.rm1</name>
		<value>emr-header-01</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm1</name>
		<value>emr-header-01:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address.rm1</name>
		<value>emr-header-01:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm1</name>
		<value>emr-header-01:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
		<value>emr-header-01:8031</value>
	</property>
	<!-- rm2: emr-header-02 -->
	<property>
		<name>yarn.resourcemanager.hostname.rm2</name>
		<value>emr-header-02</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm2</name>
		<value>emr-header-02:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address.rm2</name>
		<value>emr-header-02:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm2</name>
		<value>emr-header-02:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
		<value>emr-header-02:8031</value>
	</property>
	<!-- ZooKeeper quorum used by the RM for leader election and state storage -->
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>emr-header-01:2181,emr-header-02:2181,emr-worker-01:2181</value>
	</property>
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
		<description>Enable RM to recover state after starting. If true, then yarn.resourcemanager.store.class must be specified.</description>
	</property>
	<property>
		<name>yarn.resourcemanager.store.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
		<description>The class to use as the persistent store. If org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore is used, the store is implicitly fenced; meaning a single ResourceManager is able to use the store at any point in time. More details on this implicit fencing, along with setting up appropriate ACLs is discussed under yarn.resourcemanager.zk-state-store.root-node.acl.</description>
	</property>
	<property>
		<name>yarn.nodemanager.env-whitelist</name>
		<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
		<description>Environment variables that containers may override rather than use NodeManager's default.</description>
	</property>
</configuration>

hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>
	<property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>emr-header-01:50090</value>
	</property>
	<!-- Local storage directories -->
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>file://${hadoop.tmp.dir}/name</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>file://${hadoop.tmp.dir}/data</value>
		<description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. The directories should be tagged with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. Directories that do not exist will be created if local filesystem permission allows.</description>
	</property>
	<property>
		<name>dfs.journalnode.edits.dir</name>
		<value>${hadoop.tmp.dir}/jn</value>
		<description>The directory where the journal edit files are stored.</description>
	</property>
	<!-- HA nameservice and NameNodes -->
	<property>
		<name>dfs.nameservices</name>
		<value>mycluster</value>
		<description>Comma-separated list of nameservices.</description>
	</property>
	<property>
		<name>dfs.ha.namenodes.mycluster</name>
		<value>nn1,nn2,nn3</value>
		<description>dfs.ha.namenodes.EXAMPLENAMESERVICE The prefix for a given nameservice, contains a comma-separated list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE). Unique identifiers for each NameNode in the nameservice, delimited by commas. This will be used by DataNodes to determine all the NameNodes in the cluster. For example, if you used “mycluster” as the nameservice ID previously, and you wanted to use “nn1” and “nn2” as the individual IDs of the NameNodes, you would configure a property dfs.ha.namenodes.mycluster, and its value "nn1,nn2".</description>
	</property>
	<!-- RPC addresses of the NameNodes -->
	<property>
		<name>dfs.namenode.rpc-address.mycluster.nn1</name>
		<value>emr-header-01:8020</value>
	</property>
	<property>
		<name>dfs.namenode.rpc-address.mycluster.nn2</name>
		<value>emr-header-02:8020</value>
	</property>
	<property>
		<name>dfs.namenode.rpc-address.mycluster.nn3</name>
		<value>emr-worker-01:8020</value>
	</property>
	<!-- HTTP addresses of the NameNodes -->
	<property>
		<name>dfs.namenode.http-address.mycluster.nn1</name>
		<value>emr-header-01:50070</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.mycluster.nn2</name>
		<value>emr-header-02:50070</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.mycluster.nn3</name>
		<value>emr-worker-01:50070</value>
	</property>
	<!-- Shared edits directory (JournalNodes) -->
	<property>
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://emr-header-01:8485;emr-header-02:8485;emr-worker-01:8485/mycluster</value>
		<description>A directory on shared storage between the multiple namenodes in an HA cluster. This directory will be written by the active and read by the standby in order to keep the namespaces synchronized. This directory does not need to be listed in dfs.namenode.edits.dir above. It should be left empty in a non-HA cluster.</description>
	</property>
	<property>
		<name>dfs.client.failover.proxy.provider.mycluster</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!-- Fencing: shell instead of sshfence, see the notes at the top -->
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>shell(/bin/true)</value>
		<description>A list of scripts or Java classes which will be used to fence the Active NameNode during a failover. See the HDFS High Availability documentation for details on automatic HA configuration.</description>
	</property>
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/root/.ssh/id_dsa</value>
	</property>
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
		<description>Whether automatic failover is enabled. See the HDFS High Availability documentation for details on automatic HA configuration.</description>
	</property>
	<!-- Checkpoint and edits retention -->
	<property>
		<name>dfs.namenode.checkpoint.txns</name>
		<value>1000000</value>
		<description>The Secondary NameNode or CheckpointNode will create a checkpoint of the namespace every 'dfs.namenode.checkpoint.txns' transactions, regardless of whether 'dfs.namenode.checkpoint.period' has expired.</description>
	</property>
	<property>
		<name>dfs.namenode.checkpoint.check.period</name>
		<value>60</value>
		<description>The SecondaryNameNode and CheckpointNode will poll the NameNode every 'dfs.namenode.checkpoint.check.period' seconds to query the number of uncheckpointed transactions. Support multiple time unit suffix(case insensitive), as described in dfs.heartbeat.interval.If no time unit is specified then seconds is assumed.</description>
	</property>
	<property>
		<name>dfs.namenode.num.extra.edits.retained</name>
		<value>1000000</value>
		<description>The number of extra transactions which should be retained beyond what is minimally necessary for a NN restart. It does not translate directly to file's age, or the number of files kept, but to the number of transactions (here "edits" means transactions). One edit file may contain several transactions (edits). During checkpoint, NameNode will identify the total number of edits to retain as extra by checking the latest checkpoint transaction value, subtracted by the value of this property. Then, it scans edits files to identify the older ones that don't include the computed range of retained transactions that are to be kept around, and purges them subsequently. The retainment can be useful for audit purposes or for an HA setup where a remote Standby Node may have been offline for some time and need to have a longer backlog of retained edits in order to start again. Typically each edit is on the order of a few hundred bytes, so the default of 1 million edits should be on the order of hundreds of MBs or low GBs. NOTE: Fewer extra edits may be retained than value specified for this setting if doing so would mean that more segments would be retained than the number configured by dfs.namenode.max.extra.edits.segments.retained.</description>
	</property>
	<property>
		<name>dfs.namenode.num.checkpoints.retained</name>
		<value>2</value>
		<description>The number of image checkpoint files (fsimage_*) that will be retained by the NameNode and Secondary NameNode in their storage directories. All edit logs (stored on edits_* files) necessary to recover an up-to-date namespace from the oldest retained checkpoint will also be retained.</description>
	</property>
	<property>
		<name>dfs.namenode.max.extra.edits.segments.retained</name>
		<value>10000</value>
		<description>The maximum number of extra edit log segments which should be retained beyond what is minimally necessary for a NN restart. When used in conjunction with dfs.namenode.num.extra.edits.retained, this configuration property serves to cap the number of extra edits files to a reasonable value.</description>
	</property>
</configuration>

mapred-site.xml; first create it from the template: cp mapred-site.xml.template mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>
	<property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>emr-header-01:50090</value>
	</property>
</configuration>

Edit /opt/hadoop-2.10.1/sbin/start-yarn.sh

# start resourceManager
"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  start resourcemanager
# insert the following line so that the ResourceManager on emr-header-02 is started as well
ssh emr-header-02 "$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  start resourcemanager
# start nodeManager
"$bin"/yarn-daemons.sh --config $YARN_CONF_DIR  start nodemanager
# start proxyserver
#"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  start proxyserver

Edit /opt/hadoop-2.10.1/sbin/stop-yarn.sh

# stop resourceManager
"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  stop resourcemanager
# insert the following line so that the ResourceManager on emr-header-02 is stopped as well
ssh emr-header-02 "$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  stop resourcemanager
# stop nodeManager
"$bin"/yarn-daemons.sh --config $YARN_CONF_DIR  stop nodemanager
# stop proxy server
"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR  stop proxyserver

Add the environment variables

echo 'export HADOOP_HOME=/opt/hadoop-2.10.1/' >>/etc/profile
echo 'export PATH=${PATH}:${HADOOP_HOME}/sbin/:${HADOOP_HOME}/bin/' >>/etc/profile
source /etc/profile

Copy everything to the other nodes (run on emr-header-01; a verification loop follows these commands)

# copy the SSH keys to the other nodes, because dfs.ha.fencing.ssh.private-key-files above points at the key file
scp -r /root/.ssh/* emr-header-02:/root/.ssh/
scp /etc/profile emr-header-02:/etc/
scp /etc/profile emr-worker-01:/etc/
scp /etc/profile emr-worker-02:/etc/
scp -r /opt/hadoop-2.10.1/ emr-header-02:/opt/
scp -r /opt/hadoop-2.10.1/ emr-worker-01:/opt/
scp -r /opt/hadoop-2.10.1/ emr-worker-02:/opt/
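As mentioned above, a small loop can verify the distribution, i.e. that every node sees the same Hadoop build and the HA settings (hdfs getconf reads the local configuration files):

for h in emr-header-01 emr-header-02 emr-worker-01 emr-worker-02; do
    echo "== $h =="
    ssh "$h" 'source /etc/profile; hadoop version | head -1; hdfs getconf -confKey dfs.ha.namenodes.mycluster'
done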
Starting the Cluster

Reload the environment variables on all nodes

source /etc/profile

Run on the three nodes emr-header-01, emr-header-02 and emr-worker-01

hadoop-daemon.sh start journalnode

To check whether it started, see the log, e.g. /opt/hadoop-2.10.1/logs/hadoop-root-journalnode-emr-header-01.log (the file name differs per node).
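Alternatively, jps (shipped with the JDK) shows whether the process is up on each of the three nodes:

jps | grep JournalNode    # should print a line like "12345 JournalNode"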

Run on emr-header-01

# running this again will ask you to confirm with y
hdfs zkfc -formatZK
hdfs namenode -format
hadoop-daemon.sh start namenode

Run on emr-header-02 and emr-worker-01, the two standby NameNodes (this synchronizes the NameNode metadata)

hdfs namenode -bootstrapStandby

Run on emr-header-01

stop-all.sh

Now start the whole cluster

start-all.sh
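After start-all.sh finishes, jps on each node should show roughly the following daemons for the roles assigned in this article (the exact lists depend on your layout):

# emr-header-01 / emr-header-02: NameNode, DFSZKFailoverController, JournalNode,
#                                ResourceManager, QuorumPeerMain (ZooKeeper)
# emr-worker-01: NameNode, DFSZKFailoverController, JournalNode, QuorumPeerMain,
#                DataNode, NodeManager
# emr-worker-02: DataNode, NodeManager
jps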
Web UI Access

It is recommended to add the hosts entries on your local machine first. It still works without them, but when you access the standby ResourceManager (i.e. YARN) it redirects to the active node's hostname.

hosts file contents (same as on each host above)

20.88.10.31 emr-header-01
20.88.10.32 emr-header-02
20.88.10.33 emr-worker-01
20.88.10.34 emr-worker-02

Pages to visit:

# HDFS: reachable on every node that runs a JournalNode (which here are also the NameNode hosts)
http://20.88.10.31:50070/
http://20.88.10.32:50070/
http://20.88.10.33:50070/

On the HDFS page, the first line Overview 'emr-header-01:8020' (active) indicates the active node, while Overview 'emr-header-02:8020' (standby) indicates a standby node.

# YARN / MapReduce
http://20.88.10.31:8088/

Check the ResourceManager states

rm1 and rm2 are defined in the configuration above: rm1 is emr-header-01 and rm2 is emr-header-02.

[root@emr-header-01 ~]#  yarn rmadmin -getServiceState rm1
standby
[root@emr-header-01 ~]#  yarn rmadmin -getServiceState rm2
active
[root@emr-header-01 ~]# 
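The NameNode states can be checked the same way with hdfs haadmin, using the nn1/nn2/nn3 IDs defined in hdfs-site.xml:

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hdfs haadmin -getServiceState nn3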
Failover Test

Based on the output above, the active HDFS NameNode is on emr-header-01, while the active YARN ResourceManager is on emr-header-02.

# run on emr-header-01 to stop the NameNode; note it is hadoop-daemon.sh, not hadoop-daemons.sh
hadoop-daemon.sh stop namenode
# run on emr-header-02 to stop the ResourceManager
yarn-daemon.sh stop resourcemanager

The YARN UI on emr-header-02 is no longer reachable, while http://emr-header-01:8088/cluster now works.

The HDFS UI on emr-header-01 is unreachable; visiting emr-header-02 at http://emr-header-02:50070/dfshealth.html#tab-overview shows it is now active.

Restart the stopped services and they automatically become standby again. Start commands (run the start on the same node where the corresponding stop was executed):

yarn-daemon.sh start resourcemanager
hadoop-daemon.sh start namenode

Once all the tests above pass, you can power off any one of the master machines to simulate a real failure and check whether the cluster fails over. With dfs.ha.fencing.methods set to shell this works; with sshfence it does not, for the reason explained at the beginning. By comparison, failover with the shell method takes longer and you have to wait a while (under about a minute in testing), whereas sshfence takes over very quickly once the active NameNode process dies; but only the shell method survives the failure of the whole host.
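A crude way to watch (and roughly time) the switchover is to poll the NameNode states from a surviving node while you power off the active machine; this is only an observation helper, not part of the original procedure:

while true; do
    date +%T
    for nn in nn1 nn2 nn3; do
        echo "  $nn: $(hdfs haadmin -getServiceState "$nn" 2>/dev/null || echo unreachable)"
    done
    sleep 5
done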

Other Questions You May Have

1. Why are there so many edits log files

If you keep an eye on the NameNode's metadata directory, you will notice that under HA the number of edits log files keeps growing.
This is governed by the dfs.namenode.num.extra.edits.retained parameter in the HDFS configuration; by default 1,000,000 extra edit transactions are retained.
Hadoop's official explanation is:

The number of extra transactions which should be retained beyond what is minimally necessary for a NN restart. It does not translate directly to file’s age, or the number of files kept, but to the number of transactions (here “edits” means transactions). One edit file may contain several transactions (edits). During checkpoint, NameNode will identify the total number of edits to retain as extra by checking the latest checkpoint transaction value, subtracted by the value of this property. Then, it scans edits files to identify the older ones that don’t include the computed range of retained transactions that are to be kept around, and purges them subsequently. The retainment can be useful for audit purposes or for an HA setup where a remote Standby Node may have been offline for some time and need to have a longer backlog of retained edits in order to start again. Typically each edit is on the order of a few hundred bytes, so the default of 1 million edits should be on the order of hundreds of MBs or low GBs. NOTE: Fewer extra edits may be retained than value specified for this setting if doing so would mean that more segments would be retained than the number configured by dfs.namenode.max.extra.edits.segments.retained.

It also mentions the dfs.namenode.max.extra.edits.segments.retained parameter, with a default value of 10000, described as:

The maximum number of extra edit log segments which should be retained beyond what is minimally necessary for a NN restart. When used in conjunction with dfs.namenode.num.extra.edits.retained, this configuration property serves to cap the number of extra edits files to a reasonable value.
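To see the effect on disk, count the edits files in the NameNode metadata directory. With the configuration used in this article (hadoop.tmp.dir = /data/tmp, dfs.namenode.name.dir = file://${hadoop.tmp.dir}/name) that directory is the one below; adjust the path if yours differs:

ls /data/tmp/name/current/edits_* | wc -l    # number of retained edits segment files
du -sh /data/tmp/name/current                # total size of the NameNode metadata directory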

2. Problems with active/standby failover in HA mode

See: https://blog.csdn.net/weixin_44455125/article/details/122524280

3. Alibaba Cloud EMR cluster configuration

Alibaba Cloud EMR clusters also use the shell fencing method, yet they support running with only two master nodes.

I do not know why yet and am still looking into it; if anyone knows, please share.
