Before configuring anything, let's lay out a simple plan of what each virtual machine needs to run.
My arrangement is to use 139, 138 and 137 as the main machines to configure, with 136 only running the jobhistory service.
1. First, configure ZooKeeper. Extract the archive into /opt/soft/ and rename the directory to zookeeper345.
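A minimal sketch of the extract-and-rename step; the exact archive name is my assumption (any ZooKeeper 3.4.5 tarball works the same way):

# assumed tarball name; adjust to whatever archive you downloaded
tar -zxf zookeeper-3.4.5.tar.gz -C /opt/soft/
mv /opt/soft/zookeeper-3.4.5 /opt/soft/zookeeper345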
2. Inside the zookeeper345 directory, go into the conf/ subdirectory, where you will find a zoo_sample.cfg file.
Copy this file into the same directory under the name zoo.cfg, then open it, modify line 12, and add lines 29, 30 and 31.
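For reference, the copy itself is a single command (paths as described above):

cp /opt/soft/zookeeper345/conf/zoo_sample.cfg /opt/soft/zookeeper345/conf/zoo.cfg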
Here is the file, with line numbers kept for reference:
1 # The number of milliseconds of each tick
2 tickTime=2000
3 # The number of ticks that the initial
4 # synchronization phase can take
5 initLimit=10
6 # The number of ticks that can pass between
7 # sending a request and getting an acknowledgement
8 syncLimit=5
9 # the directory where the snapshot is stored.
10 # do not use /tmp for storage, /tmp here is just
11 # example sakes.
12 dataDir=/opt/soft/zookeeper345/datatmp
13 # the port at which the clients will connect
14 clientPort=2181
15 # the maximum number of client connections.
16 # increase this if you need to handle more clients
17 #maxClientCnxns=60
18 #
19 # Be sure to read the maintenance section of the
20 # administrator guide before turning on autopurge.
21 #
22 # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
23 #
24 # The number of snapshots to retain in dataDir
25 #autopurge.snapRetainCount=3
26 # Purge task interval in hours
27 # Set to "0" to disable auto purge feature
28 #autopurge.purgeInterval=1
29 server.1=192.168.153.139:2888:3888
30 server.2=192.168.153.138:2888:3888
31 server.3=192.168.153.137:2888:3888
3. Back in the zookeeper345 directory, create a datatmp folder; this matches line 12 of the config above.
In the datatmp folder, create a file named myid (this name will be used later) and write a single digit in it: 1. Do not put anything else in this file; it corresponds to server.1 on line 29.
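A minimal sketch of this step on the 139 machine:

# create the data directory referenced by dataDir on line 12
mkdir /opt/soft/zookeeper345/datatmp
# write the server id (1 on this machine, matching server.1 on line 29)
echo 1 > /opt/soft/zookeeper345/datatmp/myid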
4. Configure the environment.
Command: vi /etc/profile
Append the following lines:
export ZOOKEEPER_HOME=/opt/soft/zookeeper345
export PATH=$PATH:$ZOOKEEPER_HOME/bin
5. Distribute the installation. Since I did the work on the 139 machine, I distribute zookeeper345 to 138, 137 and 136. For this (and for convenience later on) I wrote a distribution script at /usr/bin/xsync; its code is attached:
#!/bin/bash
# Get the arguments; exit immediately if there are none
argCount=$#
if [ $argCount -eq 0 ]; then
  echo 'no args'
  exit 0
fi
# Get the file name
f=$1
fname=`basename $f`
echo $fname
# Get the absolute path of the file
pdir=`cd -P $(dirname $f); pwd`
echo $pdir
# Get the current user
user=`whoami`
echo "$user"
# Copy to each host in a loop
for host in gree136 gree137 gree138
do
  echo "**********$host*********"
  rsync -av $pdir/$fname $user@$host:$pdir
done
Then run the following commands:
source /etc/profile
zkServer.sh start
zkServer.sh status
6. Check whether it starts. Checking the status at this point may expose some problems; one cause is that 138 and 137 have not been configured yet. To fix that, change the myid file under /zookeeper345/datatmp/ on each machine so that it matches the number after server. that we added at the end of zoo.cfg in conf/, and set up the /etc/profile file on each machine as well. The profile can of course be distributed with the script (a sketch of fixing the myid values follows the command below):
xsync /etc/profile
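A minimal sketch of aligning the myid values from the 139 machine, assuming the hostnames gree138 and gree137 map to 192.168.153.138 and 192.168.153.137 (i.e. server.2 and server.3 on lines 30 and 31 of zoo.cfg) and that passwordless SSH is already in place:

# write the server id that matches each host's server.N entry
ssh gree138 "echo 2 > /opt/soft/zookeeper345/datatmp/myid"
ssh gree137 "echo 3 > /opt/soft/zookeeper345/datatmp/myid"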
7. Go to each machine, start the service and check its status. I also wrote a script named zkop under the /bin/ directory so that the status of the other machines can be checked from a single machine (passwordless login was set up beforehand):
#!/bin/bash
for host in gree139 gree138 gree137
do
  case $* in
  "start"){
    echo "********** $host zookeeper start*****************"
    ssh $host "source /etc/profile; zkServer.sh start"
  };;
  "stop"){
    echo "********** $host zookeeper stop*******************"
    ssh $host "source /etc/profile; zkServer.sh stop"
  };;
  "status"){
    echo "********** $host zookeeper status*******************"
    ssh $host "source /etc/profile; zkServer.sh status"
  };;
  esac
done
8. Check the status:
[root@gree139 conf]# zkop status
********** gree139 zookeeper status*******************
JMX enabled by default
Using config: /opt/soft/zookeeper345/bin/../conf/zoo.cfg
Mode: follower
********** gree138 zookeeper status*******************
JMX enabled by default
Using config: /opt/soft/zookeeper345/bin/../conf/zoo.cfg
Mode: leader
********** gree137 zookeeper status*******************
JMX enabled by default
Using config: /opt/soft/zookeeper345/bin/../conf/zoo.cfg
Mode: follower
At this point, ZooKeeper is installed.
2. Now for the rest of the configuration. Extract the Hadoop archive and, as usual, place it under /opt/soft/.
1. The usual rename: the extracted directory name is too long, so rename it to hadoop260.
2. We need to modify hadoop-env.sh, yarn-env.sh, mapred-env.sh, core-site.xml, hdfs-site.xml and yarn-site.xml, and copy mapred-site.xml.template in the same directory to a file named mapred-site.xml. These are the main files to configure.
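For reference, the template copy is a single command, run inside the etc/hadoop configuration directory:

cp mapred-site.xml.template mapred-site.xml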
(1) Configure the core-site.xml file
[root@gree135 hadoop]# vi ./core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/soft/hadoop260/hadooptmp/</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>gree135:2181,gree136:2181,gree137:2181</value>
  </property>
  <property>
    <name>hadoop.proxyuser.bigdata.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.bigdata.groups</name>
    <value>*</value>
  </property>
</configuration>
(2) Configure the hdfs-site.xml file
[root@gree135 hadoop]# vi ./hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>gree135:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>gree135:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>gree136:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>gree136:50070</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/soft/hadoop260/journaldata</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://gree135:8485;gree136:8485;gree137:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
(3) Configure mapred-site.xml
[root@gree135 hadoop]# vi ./mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>gree138:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>gree138:19888</value>
  </property>
</configuration>
(4) Configure yarn-site.xml
[root@gree136 hadoop]# vi ./yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>gree135</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>gree136</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>gree135:2181,gree136:2181,gree137:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
(5) Modify hadoop-env.sh, yarn-env.sh and mapred-env.sh: at line 25, line 23 and line 16 respectively, change the path after JAVA_HOME to the path of your jdk directory:
export JAVA_HOME=/opt/soft/jdk180
3. In the /hadoop260/etc/hadoop/ directory, create a slaves file and add the hostnames (or addresses) of the virtual machines in the cluster, one per line.
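A sketch of what the slaves file would contain, assuming all four hosts in this cluster run DataNode/NodeManager (as the jps output at the end suggests):

gree136
gree137
gree138
gree139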
4. Distribute the /hadoop260/ directory to the other virtual machines (using the script):
xsync ./hadoop260/
5. Configure the environment on every machine:
vi /etc/profile

#hadoop
export HADOOP_HOME=/opt/soft/hadoop260
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
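After editing, apply the new variables in the current shell (or log in again):

source /etc/profile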
6. Start the ZooKeeper cluster and check its status:
[root@gree139 conf]# zkop status
********** gree139 zookeeper status*******************
JMX enabled by default
Using config: /opt/soft/zookeeper345/bin/../conf/zoo.cfg
Mode: follower
********** gree138 zookeeper status*******************
JMX enabled by default
Using config: /opt/soft/zookeeper345/bin/../conf/zoo.cfg
Mode: leader
********** gree137 zookeeper status*******************
JMX enabled by default
Using config: /opt/soft/zookeeper345/bin/../conf/zoo.cfg
Mode: follower
7. Start the journalnode (on every machine in the cluster):
hadoop-daemon.sh start journalnode
Check with jps whether both QuorumPeerMain and JournalNode are present. If one is missing, it is worth checking whether hdfs-site.xml under the hadoop configuration directory is set up correctly; beyond that, analyze the specific problem case by case. A quick way to check every node is sketched below.
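A minimal sketch of checking the two processes on every node, assuming the same gree136-gree139 hosts, passwordless SSH, and that jps is on the PATH after sourcing /etc/profile:

# look for QuorumPeerMain (zookeeper) and JournalNode on each host
for host in gree139 gree138 gree137 gree136
do
  echo "**********$host**********"
  ssh $host "source /etc/profile; jps | grep -E 'QuorumPeerMain|JournalNode'"
done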
8. Format the namenode. Since our 139 and 138 machines are in an active/standby relationship, format only one of them and then sync hadooptmp to the other.
[root@gree135 soft]# hadoop namenode -format

Sync the hadooptmp directory produced by formatting on 135 over to gree136:

[root@gree135 hadoop260]# scp -r ./hadooptmp/ root@gree136:/opt/soft/hadoop260/
9. Initialize ZooKeeper (format the HA state znode):
hdfs zkfc -formatZK
10. Start HDFS:
start-dfs.sh
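Optionally, you can confirm which NameNode became active. A sketch using the standard HA admin command with the nn1/nn2 ids defined in hdfs-site.xml:

# one should report "active", the other "standby"
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2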
11. Start yarn and check the final state:
[root@gree139 hadoop]# start-yarn.sh
[root@gree139 hadoop]# jqop jps
*********gree139 command output**********
jps
3344 NameNode
3920 ResourceManager
4016 NodeManager
3458 DataNode
3013 JournalNode
2908 QuorumPeerMain
3759 DFSZKFailoverController
5279 Jps
*********gree138 command output**********
jps
2928 DFSZKFailoverController
2552 QuorumPeerMain
3129 NodeManager
2667 JournalNode
2845 DataNode
2782 NameNode
11950 Jps
*********gree137 command output**********
jps
2448 QuorumPeerMain
2834 NodeManager
2662 DataNode
2557 JournalNode
3197 Jps
*********gree136 command output**********
jps
3361 Jps
2835 NodeManager
3014 JobHistoryServer
2684 DataNode
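The jqop helper used above is not shown in this post; a minimal sketch of such a script (same idea as zkop: run a given command on every host over passwordless SSH) might look like this:

#!/bin/bash
# run the given command (e.g. jps) on every host in the cluster
for host in gree139 gree138 gree137 gree136
do
  echo "*********$host command output**********"
  echo "$*"
  ssh $host "source /etc/profile; $*"
done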
12. The configuration is now complete. If anything here is wrong, corrections are welcome.