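- hostname: hadoop01, 8 GB of memory
- hostname: hadoop02, 4 GB of memory
- hostname: hadoop03, 4 GB of memory
- Set the hostname on each machine: hostnamectl set-hostname hadoopx, replacing hadoopx with that machine's name
- Alternatively, edit the file directly: vim /etc/hostname
- Reboot afterwards: reboot
- Edit the hosts file with vim /etc/hosts and append one "ip hostname" line per machine:

    192.168.10.141 hadoop01
    192.168.10.142 hadoop02
    192.168.10.143 hadoop03

4. Create the working directory
- Command: mkdir /home/atlas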
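- Look for a preinstalled OpenJDK: rpm -qa | grep java
- Remove every JDK package except those whose name ends in noarch: rpm -e --nodeps xx (where xx is the package name)
- Upload the JDK archive and extract it
- Add the environment variables: vim /etc/profile.d/my_env.sh

    export JAVA_HOME=/home/atlas/jdk1.8
    export PATH=$PATH:$JAVA_HOME/bin

- Reload the profile: source /etc/profile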
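The same my_env.sh file is edited again for Hadoop, Hive, Kafka and HBase, so it is easy to add a line twice. A minimal sketch of an idempotent append follows; the helper name append_once is hypothetical and not part of the original steps.

```shell
# Append a line to a file only if that exact line is not already present.
# Usage: append_once <file> <line>
append_once() (
    file=$1
    line=$2
    touch "$file"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
)

# Example with the values from the JDK step (run as root):
# append_once /etc/profile.d/my_env.sh 'export JAVA_HOME=/home/atlas/jdk1.8'
# append_once /etc/profile.d/my_env.sh 'export PATH=$PATH:$JAVA_HOME/bin'
```

Running the helper repeatedly leaves the file unchanged after the first call, so it is safe to reuse on every machine.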
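Goal: sync files from hadoop01 to hadoop02 and hadoop03.
1. Write the script
- vim xsync

    #!/bin/bash
    # 1. Check the number of arguments
    if [ $# -lt 1 ]
    then
        echo "Not enough arguments!"
        exit 1
    fi
    # 2. Loop over every machine in the cluster
    for host in hadoop01 hadoop02 hadoop03
    do
        echo ==================== $host ====================
        # 3. Loop over every file or directory and send each one
        for file in "$@"
        do
            # 4. Check whether the file exists
            if [ -e "$file" ]
            then
                # 5. Resolve the physical parent directory
                pdir=$(cd -P "$(dirname "$file")"; pwd)
                # 6. Get the name of the file itself
                fname=$(basename "$file")
                ssh $host "mkdir -p $pdir"
                rsync -av "$pdir/$fname" $host:"$pdir"
            else
                echo "$file does not exist!"
            fi
        done
    done

2. Grant execute permission
- Command: chmod +x xsync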
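The key step in xsync is resolving each argument to a physical parent directory plus a base name, so that the file lands at the same absolute path on the remote hosts even when it is passed as a relative path or through a symlink. The sketch below isolates that step; resolve_target is a hypothetical helper name used only for illustration.

```shell
# Print the absolute, symlink-free location that xsync would send a path to.
# Usage: resolve_target <path>
resolve_target() (
    file=$1
    # cd -P follows symlinks so pwd reports the physical directory
    pdir=$(cd -P "$(dirname "$file")" && pwd)
    fname=$(basename "$file")
    printf '%s/%s\n' "$pdir" "$fname"
)

# Example: from /home/atlas, "resolve_target hadoop-3.1.3" prints
# the absolute path of that folder.
```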
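- Run: ssh localhost. If it fails with "Host key verification failed.", run ssh -o StrictHostKeyChecking=no localhost, enter the password, leave the session with exit, and then use ssh local
- Generate a key pair: ssh-keygen -t rsa, then press Enter three times
- Copy the public key to every machine:

    ssh-copy-id hadoop01
    ssh-copy-id hadoop02
    ssh-copy-id hadoop03

V. Install Hadoop on the three virtual machines
1. Install Hadoop on hadoop01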
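- Upload the Hadoop archive and extract it
- Configure the environment variables: vim /etc/profile.d/my_env.sh

    export HADOOP_HOME=/home/atlas/hadoop-3.1.3
    export PATH=$PATH:$HADOOP_HOME/bin
    export PATH=$PATH:$HADOOP_HOME/sbin

- Go to /home/atlas/hadoop-3.1.3/etc/hadoop to configure Hadoop
- vim core-site.xml, and put the following properties inside the configuration tag

    <property><name>fs.defaultFS</name><value>hdfs://hadoop01:8020</value></property>
    <property><name>hadoop.tmp.dir</name><value>/home/atlas/hadoop-3.1.3/data</value></property>
    <property><name>hadoop.http.staticuser.user</name><value>atguigu</value></property>
    <property><name>hadoop.proxyuser.atguigu.hosts</name><value>*</value></property>
    <property><name>hadoop.proxyuser.atguigu.groups</name><value>*</value></property>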
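- vim hdfs-site.xml

    <property><name>dfs.namenode.http-address</name><value>hadoop01:9870</value></property>
    <property><name>dfs.namenode.secondary.http-address</name><value>hadoop03:9868</value></property>

- vim yarn-site.xml

    <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
    <property><name>yarn.resourcemanager.hostname</name><value>hadoop02</value></property>
    <property><name>yarn.nodemanager.env-whitelist</name><value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value></property>
    <property><name>yarn.scheduler.minimum-allocation-mb</name><value>512</value></property>
    <property><name>yarn.scheduler.maximum-allocation-mb</name><value>4096</value></property>
    <property><name>yarn.nodemanager.resource.memory-mb</name><value>4096</value></property>
    <property><name>yarn.nodemanager.pmem-check-enabled</name><value>false</value></property>
    <property><name>yarn.nodemanager.vmem-check-enabled</name><value>false</value></property>
    <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
    <property><name>yarn.log.server.url</name><value>http://hadoop01:19888/jobhistory/logs</value></property>
    <property><name>yarn.log-aggregation.retain-seconds</name><value>604800</value></property>

- vim mapred-site.xml

    <property><name>mapreduce.framework.name</name><value>yarn</value></property>
    <property><name>mapreduce.jobhistory.address</name><value>hadoop01:10020</value></property>
    <property><name>mapreduce.jobhistory.webapp.address</name><value>hadoop01:19888</value></property>

- vim workers, delete the existing localhost entry, and add

    hadoop01
    hadoop02
    hadoop03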
- Go to the sbin directory with cd /home/atlas/hadoop-3.1.3/sbin, then edit start-dfs.sh and stop-dfs.sh to set

    HDFS_DATANODE_USER=root
    HDFS_DATANODE_SECURE_USER=hdfs
    HDFS_NAMENODE_USER=root
    HDFS_SECONDARYNAMENODE_USER=root

- Edit start-yarn.sh and stop-yarn.sh to set

    YARN_RESOURCEMANAGER_USER=root
    HDFS_DATANODE_SECURE_USER=yarn
    YARN_NODEMANAGER_USER=root

2. Use the xsync script to sync the configuration to hadoop02 and hadoop03
- Go to the working directory: cd /home/atlas/
- Sync the Hadoop folder: ./xsync hadoop-3.1.3
- Configure the Hadoop environment variables on hadoop02 and hadoop03
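Several steps in this guide have to be repeated by hand on hadoop02 and hadoop03. One possible way to reduce that, sketched below, is a small wrapper that runs the same command on every host over ssh. The function name on_all_hosts and the DRY_RUN switch are assumptions made for this example; passwordless ssh as set up earlier is required for real runs.

```shell
# The three machines named in this guide.
CLUSTER_HOSTS="hadoop01 hadoop02 hadoop03"

# Run a command on every host over ssh.
# With DRY_RUN=1 the ssh commands are printed instead of executed.
# Usage: on_all_hosts <command> [args...]
on_all_hosts() (
    for host in $CLUSTER_HOSTS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'ssh %s %s\n' "$host" "$*"
        else
            ssh "$host" "$*"
        fi
    done
)

# Example: preview what would run before touching the cluster.
# DRY_RUN=1; on_all_hosts cat /etc/profile.d/my_env.sh
```

Starting with DRY_RUN=1 makes it possible to check the generated commands before anything is executed remotely.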
- Go to the working directory: cd /home/atlas/
- Write the startup script with vim myhadoop.sh, then grant execute permission with chmod +x myhadoop.sh

    #!/bin/bash
    if [ $# -lt 1 ]
    then
        echo "No Args Input..."
        exit
    fi
    case $1 in
    "start")
        echo " =================== Starting the hadoop cluster ==================="
        echo " --------------- Starting hdfs ---------------"
        ssh hadoop01 "/home/atlas/hadoop-3.1.3/sbin/start-dfs.sh"
        echo " --------------- Starting yarn ---------------"
        ssh hadoop02 "/home/atlas/hadoop-3.1.3/sbin/start-yarn.sh"
        echo " --------------- Starting historyserver ---------------"
        ssh hadoop01 "/home/atlas/hadoop-3.1.3/bin/mapred --daemon start historyserver"
    ;;
    "stop")
        echo " =================== Stopping the hadoop cluster ==================="
        echo " --------------- Stopping historyserver ---------------"
        ssh hadoop01 "/home/atlas/hadoop-3.1.3/bin/mapred --daemon stop historyserver"
        echo " --------------- Stopping yarn ---------------"
        ssh hadoop02 "/home/atlas/hadoop-3.1.3/sbin/stop-yarn.sh"
        echo " --------------- Stopping hdfs ---------------"
        ssh hadoop01 "/home/atlas/hadoop-3.1.3/sbin/stop-dfs.sh"
    ;;
    *)
        echo "Input Args Error..."
    ;;
    esac

- Before the first start, format the NameNode on hadoop01: hdfs namenode -format
- Start Hadoop with the script: ./myhadoop.sh start. To shut it down, run ./myhadoop.sh stop
- Run jps on hadoop01
- Run jps on hadoop02
- Run jps on hadoop03
- Open the HDFS web UI on hadoop01: http://192.168.10.141:9870/
- Open the YARN web UI on hadoop02: http://192.168.10.142:8088/
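Reading jps output by eye on three machines is error-prone. Given the configuration above, hadoop01 should run NameNode, DataNode, NodeManager and JobHistoryServer; hadoop02 should run ResourceManager, DataNode and NodeManager; hadoop03 should run SecondaryNameNode, DataNode and NodeManager. The sketch below reports which expected daemons are absent; missing_daemons is a hypothetical helper name.

```shell
# Print every expected daemon name that is missing from jps output.
# Usage: missing_daemons "<jps output>" Name1 Name2 ...
missing_daemons() (
    jps_output=$1
    shift
    for name in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the second column exactly.
        printf '%s\n' "$jps_output" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || printf '%s\n' "$name"
    done
)

# Example on hadoop01 (prints nothing when all daemons are up):
# missing_daemons "$(jps)" NameNode DataNode NodeManager JobHistoryServer
```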
- Check whether MariaDB is installed: rpm -qa|grep mariadb
- Remove it: rpm -e --nodeps mariadb-libs
- Upload the MySQL bundle and the MySQL Java connector (JDBC driver), then extract the bundle into the mysql folder: tar -xvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar
- Go to the mysql directory and run the following commands in this order

    rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm
    rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm
    rpm -ivh mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
    rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm
    rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm

- Clear any old data: cd /var/lib/mysql, then rm -rf ./*
- Initialize the database: mysqld --initialize --user=mysql
- Look up the temporary password generated for the root user: cat /var/log/mysqld.log
- Start the MySQL service: systemctl start mysqld
- Log in to MySQL: mysql -uroot -p, then enter the temporary password
- Change the password: set password = password("new password");
- Allow root to connect from any IP by updating the user table in the mysql database. Command 1: update mysql.user set host='%' where user='root'; Command 2: flush privileges;
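Instead of scanning the whole mysqld.log for the temporary password, the relevant line can be filtered out. This sketch assumes the MySQL 5.7 log wording "A temporary password is generated for root@localhost: ..." and that the password is the last whitespace-separated field on that line; temp_root_password is a hypothetical helper name.

```shell
# Print the most recent temporary root password recorded in a mysqld log.
# Usage: temp_root_password [log_file]   (defaults to /var/log/mysqld.log)
temp_root_password() (
    log_file=${1:-/var/log/mysqld.log}
    # Keep only the last matching line, then print its final field
    grep 'temporary password' "$log_file" | tail -n 1 | awk '{ print $NF }'
)

# Example:
# temp_root_password /var/log/mysqld.log
```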
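- Upload the Hive archive and extract it
- Rename the folder to hive: mv apache-hive-3.1.2-bin/ hive/
- vim /etc/profile.d/my_env.sh

    export HIVE_HOME=/home/atlas/hive
    export PATH=$PATH:$HIVE_HOME/bin

3. Store the Hive metadata in MySQL
- Install the driver: cp /home/atlas/mysql/mysql-connector-java-5.1.37.jar $HIVE_HOME/lib
- Create hive-site.xml in the $HIVE_HOME/conf directory with vim $HIVE_HOME/conf/hive-site.xml, and put the following properties inside its configuration tag

    <property><name>javax.jdo.option.ConnectionURL</name><value>jdbc:mysql://hadoop01:3306/metastore?useSSL=false</value></property>
    <property><name>javax.jdo.option.ConnectionDriverName</name><value>com.mysql.jdbc.Driver</value></property>
    <property><name>javax.jdo.option.ConnectionUserName</name><value>root</value></property>
    <property><name>javax.jdo.option.ConnectionPassword</name><value>970725</value></property>
    <property><name>hive.metastore.schema.verification</name><value>false</value></property>
    <property><name>hive.metastore.event.db.notification.api.auth</name><value>false</value></property>
    <property><name>hive.metastore.warehouse.dir</name><value>/user/hive/warehouse</value></property>

- Log in to MySQL: mysql -uroot -p
- Create the Hive metastore database, then exit: create database metastore;
- Initialize the Hive metastore schema: schematool -initSchema -dbType mysql -verbose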
- Start Hive: hive
- Create a table:

    create table test_user(
        `id` string comment '编号',
        `name` string comment '姓名',
        `province_id` string comment '省份ID',
        `province_name` string comment '省份名称'
    ) comment '用户表'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

- Insert a row:

    insert into table test_user values('1','zhangshan','001','北京');

- Query the data: select * from test_user;
- Open the YARN web UI to see the job that ran the insert: http://<hadoop02 IP>:8088
- The database Hive uses must keep the latin1 character set, so do not switch the whole MySQL server to utf-8
- Instead, change the encoding of the individual columns in the metastore database. Log in to MySQL, select the database with use metastore; and run the following SQL statements

    # Column comments and table comments
    alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
    alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
    # Partition column comments
    alter table PARTITION_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
    alter table PARTITION_KEYS modify column PKEY_COMMENT varchar(4000) character set utf8;
    # Index comments
    alter table INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
    # Views
    alter table TBLS modify column view_expanded_text mediumtext character set utf8;
    alter table TBLS modify column view_original_text mediumtext character set utf8;

- Edit hive-site.xml so that the connection URL also sets the encoding

    <property><name>javax.jdo.option.ConnectionURL</name><value>jdbc:mysql://hadoop01:3306/metastore?useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8</value></property>

VIII. Configure ZooKeeper on the three machines
1. Install ZooKeeper on hadoop01
- Upload the archive and extract it as the zookeeper folder
- Inside the zookeeper folder, create a zkData folder: mkdir zkData
- Inside zkData, create the id file with vim myid. The file must contain only the server id: 1 on hadoop01, 2 on hadoop02, 3 on hadoop03

    1

- Go to the conf directory: cd /home/atlas/zookeeper/conf
- Rename zoo_sample.cfg to zoo.cfg: mv zoo_sample.cfg zoo.cfg
- Edit zoo.cfg

    # Change
    dataDir=/home/atlas/zookeeper/zkData
    # Append at the end of the file
    #######################cluster##########################
    server.1=hadoop01:2888:3888
    server.2=hadoop02:2888:3888
    server.3=hadoop03:2888:3888

2. Use the sync script to sync the zookeeper folder
- Go to the working directory: cd /home/atlas/
- Run the sync script: ./xsync zookeeper
- Update the myid file on hadoop02 and hadoop03
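Because the synced myid file still contains the id of hadoop01, each machine needs its own value. One option, sketched here, is to derive the id from the server.N lines already present in zoo.cfg rather than typing it by hand; myid_for is a hypothetical helper name.

```shell
# Print the ZooKeeper id assigned to a host in zoo.cfg.
# Usage: myid_for <hostname> <path-to-zoo.cfg>
myid_for() (
    host=$1
    cfg=$2
    # Lines look like "server.2=hadoop02:2888:3888"; print the N for our host.
    sed -n "s/^server\.\([0-9][0-9]*\)=$host:.*/\1/p" "$cfg"
)

# Example, run on each machine after the sync:
# myid_for "$(hostname)" /home/atlas/zookeeper/conf/zoo.cfg > /home/atlas/zookeeper/zkData/myid
```

The helper prints nothing for a host that is not listed, so an empty myid file signals a hostname mismatch.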
- Write the control script: vim zk.sh

    #!/bin/bash
    if [ $# -lt 1 ]
    then
        echo "No Args Input..."
        exit
    fi
    case $1 in
    "start"){
        for i in hadoop01 hadoop02 hadoop03
        do
            echo ---------- zookeeper $i start ------------
            ssh $i "/home/atlas/zookeeper/bin/zkServer.sh start"
        done
    };;
    "stop"){
        for i in hadoop01 hadoop02 hadoop03
        do
            echo ---------- zookeeper $i stop ------------
            ssh $i "/home/atlas/zookeeper/bin/zkServer.sh stop"
        done
    };;
    "status"){
        for i in hadoop01 hadoop02 hadoop03
        do
            echo ---------- zookeeper $i status ------------
            ssh $i "/home/atlas/zookeeper/bin/zkServer.sh status"
        done
    };;
    esac

- Grant execute permission: chmod +x zk.sh
- Start ZooKeeper with the script: ./zk.sh start
- Extract Kafka and rename the folder to kafka
- Create a logs folder inside the kafka directory: mkdir logs
- Edit server.properties: vim /home/atlas/kafka/config/server.properties

    # Set broker.id: 0 on hadoop01, 1 on hadoop02, 2 on hadoop03
    broker.id=0
    # Enable topic deletion; add this right after broker.id=0
    delete.topic.enable=true
    # Directory where Kafka stores its log data
    log.dirs=/home/atlas/kafka/data
    # Connection string for the ZooKeeper cluster
    zookeeper.connect=hadoop01:2181,hadoop02:2181,hadoop03:2181/kafka

- Configure the Kafka environment variables: vim /etc/profile.d/my_env.sh

    export KAFKA_HOME=/home/atlas/kafka
    export PATH=$PATH:$KAFKA_HOME/bin

2. Use the sync script to sync the kafka folder
- Command: ./xsync kafka
- Update server.properties on hadoop02 and hadoop03
- Configure the Kafka environment variables on hadoop02 and hadoop03
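The only value in server.properties that differs between the machines is broker.id. A minimal sketch of changing it without opening an editor follows; set_broker_id is a hypothetical helper name, and it writes through a temporary file instead of relying on an in-place sed option.

```shell
# Rewrite the broker.id line of a Kafka server.properties file.
# Usage: set_broker_id <properties_file> <id>
set_broker_id() (
    props=$1
    id=$2
    tmp=$(mktemp)
    # Replace the whole broker.id line, then copy the result back
    sed "s/^broker\.id=.*/broker.id=$id/" "$props" > "$tmp" && cat "$tmp" > "$props"
    rm -f "$tmp"
)

# Example on hadoop02 (use 2 on hadoop03):
# set_broker_id /home/atlas/kafka/config/server.properties 1
```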
- Write the control script: vim kf.sh

    #!/bin/bash
    if [ $# -lt 1 ]
    then
        echo "No Args Input..."
        exit
    fi
    case $1 in
    "start"){
        for i in hadoop01 hadoop02 hadoop03
        do
            echo ---------- kafka $i start ------------
            ssh $i "/home/atlas/kafka/bin/kafka-server-start.sh -daemon /home/atlas/kafka/config/server.properties"
        done
    };;
    "stop"){
        for i in hadoop01 hadoop02 hadoop03
        do
            echo ---------- kafka $i stop ------------
            ssh $i "/home/atlas/kafka/bin/kafka-server-stop.sh stop"
        done
    };;
    esac

- Grant execute permission: chmod +x kf.sh
- Start Kafka with the script: ./kf.sh start
- Open the ZooKeeper client on hadoop01: /home/atlas/zookeeper/bin/zkCli.sh
- In the client, enter ls /kafka to list the nodes Kafka registered
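- Upload the archive and extract it as the hbase folder
- Configure the HBase environment variables: vim /etc/profile.d/my_env.sh

    export HBASE_HOME=/home/atlas/hbase
    export PATH=$PATH:$HBASE_HOME/bin

- Edit hbase-env.sh: vim /home/atlas/hbase/conf/hbase-env.sh

    # Change (the original value is true)
    export HBASE_MANAGES_ZK=false

- Edit hbase-site.xml: vim /home/atlas/hbase/conf/hbase-site.xml

    <property><name>hbase.rootdir</name><value>hdfs://hadoop01:8020/Hbase</value></property>
    <property><name>hbase.cluster.distributed</name><value>true</value></property>
    <property><name>hbase.zookeeper.quorum</name><value>hadoop01,hadoop02,hadoop03</value></property>

- Edit the regionservers file: vim /home/atlas/hbase/conf/regionservers

    # Remove localhost and add
    hadoop01
    hadoop02
    hadoop03

2. Use the sync script to sync the hbase folder
- Command: ./xsync hbase
- Configure the HBase environment variables on hadoop02 and hadoop03
- Start HBase on all three machines with the script shipped with HBase: /home/atlas/hbase/bin/start-hbase.sh. To stop: /home/atlas/hbase/bin/stop-hbase.sh
- Run jps on every machine and check for the HMaster service (on hadoop01) and the HRegionServer service (on all three)
- Open port 16010 on hadoop01 (the HBase Master web UI)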
- Create the user: useradd solr
- Set its password with echo <new password> | passwd --stdin <account>; here we use: echo solr | passwd --stdin solr
- Upload the archive and extract it as the solr folder
- Make the solr user the owner of the solr directory: chown -R solr:solr /home/atlas/solr
- Edit solr.in.sh: vim /home/atlas/solr/bin/solr.in.sh

    ZK_HOST="hadoop01:2181,hadoop02:2181,hadoop03:2181"

3. Use the sync script to sync the solr folder
- Command: ./xsync solr
- On each of the three machines, start Solr as the solr user: sudo -i -u solr /home/atlas/solr/bin/solr start
- Open port 8983 on each of the three machines
- Extract apache-atlas-2.1.0-server.tar.gz and rename the folder to atlas
- Edit atlas/conf/atlas-application.properties: vim /home/atlas/atlas/conf/atlas-application.properties

    atlas.graph.storage.hostname=hadoop01:2181,hadoop02:2181,hadoop03:2181

- Edit atlas/conf/atlas-env.sh: vim /home/atlas/atlas/conf/atlas-env.sh

    # Append at the end of the file
    export HBASE_CONF_DIR=/home/atlas/hbase/conf

3. Integrate Atlas with Solr
- Edit atlas/conf/atlas-application.properties: vim /home/atlas/atlas/conf/atlas-application.properties

    atlas.graph.index.search.solr.zookeeper-url=hadoop01:2181,hadoop02:2181,hadoop03:2181

- Run the following commands to create the three collections

    sudo -i -u solr /home/atlas/solr/bin/solr create -c vertex_index -d /home/atlas/atlas/conf/solr -shards 3 -replicationFactor 2
    sudo -i -u solr /home/atlas/solr/bin/solr create -c edge_index -d /home/atlas/atlas/conf/solr -shards 3 -replicationFactor 2
    sudo -i -u solr /home/atlas/solr/bin/solr create -c fulltext_index -d /home/atlas/atlas/conf/solr -shards 3 -replicationFactor 2

- Verify: open port 8983 on hadoop02 and click Cloud to see the collections
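The three solr create commands differ only in the collection name, so they can be expressed as a loop. The sketch below reuses the exact paths and flags from the commands above; the function name create_atlas_collections and the DRY_RUN switch are assumptions added for illustration.

```shell
# Paths taken from the commands in this guide.
SOLR_BIN=/home/atlas/solr/bin/solr
ATLAS_SOLR_CONF=/home/atlas/atlas/conf/solr

# Create the three Solr collections used by Atlas.
# With DRY_RUN=1 the commands are printed instead of executed.
create_atlas_collections() (
    for collection in vertex_index edge_index fulltext_index; do
        set -- sudo -i -u solr "$SOLR_BIN" create -c "$collection" \
            -d "$ATLAS_SOLR_CONF" -shards 3 -replicationFactor 2
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "$*"
        else
            "$@"
        fi
    done
)

# Example: preview the commands first.
# DRY_RUN=1; create_atlas_collections
```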
- Edit atlas/conf/atlas-application.properties: vim /home/atlas/atlas/conf/atlas-application.properties

    atlas.notification.embedded=false
    atlas.kafka.data=/home/atlas/kafka/data
    atlas.kafka.zookeeper.connect=hadoop01:2181,hadoop02:2181,hadoop03:2181/kafka
    atlas.kafka.bootstrap.servers=hadoop01:9092,hadoop02:9092,hadoop03:9092

5. Atlas Server configuration
- Edit atlas/conf/atlas-application.properties: vim /home/atlas/atlas/conf/atlas-application.properties

    atlas.rest.address=http://hadoop01:21000
    atlas.server.run.setup.on.start=false
    atlas.audit.hbase.zookeeper.quorum=hadoop01:2181,hadoop02:2181,hadoop03:2181

- Edit atlas-log4j.xml: vim /home/atlas/atlas/conf/atlas-log4j.xml

    # Uncomment the following code

6. Integrate Atlas with Hive
- Edit atlas/conf/atlas-application.properties: vim /home/atlas/atlas/conf/atlas-application.properties

    # Append at the end of the file
    ######### Hive Hook Configs #######
    atlas.hook.hive.synchronous=false
    atlas.hook.hive.numRetries=3
    atlas.hook.hive.queueSize=10000
    atlas.cluster.name=primary

- Edit hive-site.xml with vim /home/atlas/hive/conf/hive-site.xml and append the following inside the configuration tag

    <property><name>hive.exec.post.hooks</name><value>org.apache.atlas.hive.hook.HiveHook</value></property>

7. Install the Hive Hook
- Extract the Hive Hook: tar -zxvf apache-atlas-2.1.0-hive-hook.tar.gz
- Copy the dependencies from the Hive Hook directory into the Atlas installation path: cp -r atlas-hive-hook/* /home/atlas/atlas/
- Rename hive-env.sh.template: mv /home/atlas/hive/conf/hive-env.sh.template /home/atlas/hive/conf/hive-env.sh
- Edit hive/conf/hive-env.sh: vim /home/atlas/hive/conf/hive-env.sh

    export HIVE_AUX_JARS_PATH=/home/atlas/atlas/hook/hive

- Copy the Atlas configuration file /home/atlas/atlas/conf/atlas-application.properties into the /home/atlas/hive/conf directory: cp /home/atlas/atlas/conf/atlas-application.properties /home/atlas/hive/conf/
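- Start the Hadoop cluster
- Start the ZooKeeper cluster
- Start the Kafka cluster
- Start the HBase cluster
- Start the Solr cluster
- Run jps on hadoop01; it should list 10 services
- Run jps on hadoop02; it should list 8 services
- Run jps on hadoop03; it should list 8 services
- Go to the Atlas bin directory: cd /home/atlas/atlas/bin
- Run the startup script: ./atlas_start.py, then wait about 2 minutes
- Open port 21000 on hadoop01
- Log in with the default account. Username: admin, password: admin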
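Rather than waiting a fixed two minutes after atlas_start.py, the web port can be polled until it answers. The sketch below is a generic retry helper; wait_until is a hypothetical name, the interval must be at least 1 second, and treating a successful HTTP response from port 21000 as "Atlas is ready" is an assumption.

```shell
# Retry a command until it succeeds or the timeout (in seconds) expires.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() (
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        elapsed=$((elapsed + interval))
        # Give up once the accumulated wait exceeds the timeout
        [ "$elapsed" -gt "$timeout" ] && return 1
        sleep "$interval"
    done
    return 0
)

# Example: poll the Atlas address every 5 seconds for up to 5 minutes.
# wait_until 300 5 curl -sf -o /dev/null http://hadoop01:21000/ && echo "Atlas is up"
```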