A Simple DolphinScheduler Deployment

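DolphinScheduler 1.3.2 has been out for some time. Compared with the 1.2.0 release covered in this article, it adds many features, such as Sqoop support and an improved UI.
For the full DolphinScheduler 1.3.2 documentation, see: https://www.yuque.com/docs/share/454e9a42-b6c7-44b2-9d29-1d5795199456?# 《DolphinScheduler - 1.3.2 document》
To deploy 1.3.2, it is recommended to follow the build-from-source approach described in that documentation.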

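1. Document Overview

1.1 About DolphinScheduler

Apache DolphinScheduler is a distributed, decentralized, easily extensible visual DAG workflow task scheduling system.
It aims to untangle the complex dependencies in data processing pipelines so that the scheduling system works out of the box.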

1.2 Related Links

Official site:

https://dolphinscheduler.apache.org/zh-cn/index.html

Git repository:

https://github.com/apache/incubator-dolphinscheduler

System architecture design:

https://dolphinscheduler.apache.org/zh-cn/blog/architecture-design.html

FAQ:

https://dolphinscheduler.apache.org/zh-cn/docs/faq.html

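1.3 Glossary

DAG: Directed Acyclic Graph. The tasks in a workflow are assembled as a directed acyclic graph and traversed topologically, starting from the nodes with in-degree zero and continuing until no successor nodes remain.
Process definition: the visual DAG formed by dragging task nodes onto the canvas and linking them.
Process instance: an instantiation of a process definition, created by a manual start or by scheduled triggering. Each run of a process definition produces one process instance.
Task instance: an instantiation of a task node in a process definition; it records the execution state of that specific task.
Task types: SHELL, SQL, SUB_PROCESS, PROCEDURE, MR, SPARK, PYTHON, and DEPENDENT are currently supported, and dynamic plugin extension is planned. Note: a SUB_PROCESS is itself a standalone process definition and can be started on its own.
Scheduling modes: the system supports timed scheduling based on cron expressions as well as manual scheduling.
Command types: start workflow, start from the current node, recover a fault-tolerated workflow, resume a paused process, start from the failed node, complement (backfill), schedule, rerun, pause, stop, and recover a waiting thread. Of these, "recover a fault-tolerated workflow" and "recover a waiting thread" are used internally by the scheduler and cannot be called from outside.
Timed scheduling: the system uses the Quartz distributed scheduler and also supports generating cron expressions visually.
Dependencies: beyond the simple predecessor/successor dependencies inside a DAG, the system provides a dependent task node that supports custom task dependencies across processes.
Priority: both process instances and task instances support priorities. If no priority is set, execution defaults to first in, first out.
Email alerts: supports emailing SQL task query results, alerting on process instance results, and fault-tolerance alert notifications.
Failure strategy: for tasks running in parallel, two strategies are available when a task fails. "Continue" lets the other parallel tasks keep running regardless of their state, until the process ends as failed. "End" kills the running parallel tasks as soon as a failed task is detected, and the process ends as failed.
Complement: backfills historical data, with both parallel and serial execution over a date range.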
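
The topological traversal mentioned in the DAG entry can be illustrated with a small sketch. This is not DolphinScheduler's own code; it is a generic in-degree based traversal (Kahn's algorithm), and the function name `topological_order` and the task names are made up for the example.

```python
from collections import deque


def topological_order(dag):
    """Return tasks in an order where every task follows its predecessors.

    `dag` maps each task name to the list of its successor tasks.
    Raises ValueError if the graph contains a cycle.
    """
    # Collect every node, including ones that only appear as successors.
    nodes = set(dag)
    for successors in dag.values():
        nodes.update(successors)

    # Count incoming edges for each node.
    in_degree = {node: 0 for node in nodes}
    for successors in dag.values():
        for node in successors:
            in_degree[node] += 1

    # Start from the nodes with in-degree zero; sort for a deterministic order.
    ready = deque(sorted(n for n in nodes if in_degree[n] == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for successor in sorted(dag.get(current, [])):
            in_degree[successor] -= 1
            if in_degree[successor] == 0:
                ready.append(successor)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


# Hypothetical workflow: two tasks fan out from "extract" and join at "load".
workflow = {
    "extract": ["clean", "validate"],
    "clean": ["load"],
    "validate": ["load"],
    "load": [],
}
print(topological_order(workflow))
```

A task only becomes ready once all of its predecessors have been emitted, which is why "load" appears after both "clean" and "validate".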

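1.4 DolphinScheduler Architecture

1.4.1 MasterServer

MasterServer follows a distributed, decentralized design. It is mainly responsible for splitting DAGs into tasks and for submitting and monitoring tasks, and it also watches the health of the other MasterServer and WorkerServer nodes. On startup, MasterServer registers an ephemeral node in ZooKeeper and handles fault tolerance by watching for changes to the ZooKeeper ephemeral nodes.
The service mainly contains:

Distributed Quartz: the distributed scheduling component, mainly responsible for starting and stopping scheduled tasks. Once Quartz triggers a task, a thread pool inside the Master handles the follow-up operations.
MasterSchedulerThread: a scanning thread that periodically scans the command table in the database and performs different business operations depending on the command type.
MasterExecThread: mainly responsible for DAG task splitting, task submission and monitoring, and the logic for the various command types.
MasterTaskExecThread: mainly responsible for persisting tasks.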
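
The fault-tolerance idea described above (react when an ephemeral node disappears) can be sketched without a running ZooKeeper. The helper `diff_live_nodes` and the `host:port` node names below are assumptions for illustration only, not DolphinScheduler's actual node layout; in a real setup the two snapshots would come from a ZooKeeper children watch.

```python
def diff_live_nodes(previous, current):
    """Compare two snapshots of the ephemeral child nodes under one path.

    Returns (joined, lost): nodes that appeared and nodes that vanished.
    """
    previous, current = set(previous), set(current)
    return sorted(current - previous), sorted(previous - current)


# Hypothetical snapshots of the children under a masters path,
# taken before and after one master's session expired.
before = ["10.30.64.241:5566", "10.30.64.242:5566"]
after = ["10.30.64.242:5566"]

joined, lost = diff_live_nodes(before, after)
for node in lost:
    # A surviving master would take over the lost node's unfinished work here.
    print(f"node lost, failover needed: {node}")
```

Because the nodes are ephemeral, ZooKeeper removes them when the owning session ends, so a vanished child is a reasonable signal that the corresponding server is down.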

1.4.2 WorkerServer

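WorkerServer also follows a distributed, decentralized design. It is mainly responsible for executing tasks and providing the log service. On startup, WorkerServer registers an ephemeral node in ZooKeeper and maintains a heartbeat.
The service contains:

FetchTaskThread: continuously fetches tasks from the Task Queue and, through TaskScheduleThread, invokes the executor that matches each task type.
LoggerServer: an RPC service that provides paged log viewing, refreshing, and downloading.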

1.4.3 ZooKeeper

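The MasterServer and WorkerServer nodes in the system all use ZooKeeper for cluster management and fault tolerance. The system also relies on ZooKeeper for event listening and distributed locks. A Redis-based queue was implemented at one point, but to keep the number of components DolphinScheduler depends on as small as possible, the Redis implementation was eventually removed.

1.4.4 Task Queue

Provides the task queue operations. The queue is currently also implemented on ZooKeeper. Each queue entry holds little information, so there is no need to worry about the queue holding too much data; stress tests with millions of queued entries showed no impact on system stability or performance.

1.4.5 Alert

Provides the alert-related interfaces, mainly covering storage, querying, and notification for two types of alert data. Notification is available by email and by SNMP (not yet implemented).

1.4.6 API

The API layer mainly handles requests from the front-end UI layer. The service exposes a unified RESTful API to external callers. The interfaces cover creating, defining, querying, modifying, publishing, and taking workflows offline, as well as manual start, stop, pause, resume, start from a given node, and so on.

1.4.7 UI

The front-end pages of the system, providing its visual operation interfaces.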

2. Cluster Planning

2.1 Cluster Configuration

Omitted.

2.2 Software Versions

Software            Version
CDH                 Cloudera 6.2.0
dolphinscheduler    1.2.0

2.3 Cluster Plan

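Note:

All of the nodes below already have the CDH big data components deployed.
With the Apache distribution, either make the big data components' environment variables global, or add the environment variables and the production configuration parameters under each tenant, so that sudo -u $tenant can still invoke the big data components.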

hostname        MasterServer    WorkerServer/LoggerServer    AlertServer    ApiServer    UI
10.30.64.240                    √
10.30.64.241    √               √
10.30.64.242    √               √                            √              √            √

3. Environment Preparation

3.1 Base Software (install the required items yourself)

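MySQL (5.5+): required
JDK (1.8+): required
ZooKeeper (3.4.6+): required
Hadoop (2.6+): optional. Needed for the resource upload feature and for submitting MapReduce tasks (uploaded resource files are currently stored on HDFS).
Hive (1.2.1): optional. Needed for submitting Hive tasks.
Spark (1.x, 2.x): optional. Needed for submitting Spark tasks.
PostgreSQL (8.2.15+): optional. Needed for PostgreSQL stored procedures.
Note: DolphinScheduler itself does not depend on Hadoop, Hive, Spark, or PostgreSQL. It only calls their clients to run the corresponding tasks.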
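
Before installing, it can help to compare the installed versions against the minimums listed above. The sketch below is one way to do that comparison; the helper names and the sample "installed" versions are hypothetical, and obtaining the real version strings (for example from `java -version`) is left out.

```python
def parse_version(text):
    """Turn '3.4.6' into (3, 4, 6); a non-numeric tail in a part is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so '1.8' and '1.8.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions of the required components, taken from the list above.
REQUIRED = {"MySQL": "5.5", "JDK": "1.8", "ZooKeeper": "3.4.6"}

# Hypothetical versions found on a host.
installed = {"MySQL": "5.7.28", "JDK": "1.8.0_181", "ZooKeeper": "3.4.5"}

for name, minimum in REQUIRED.items():
    status = "ok" if meets_minimum(installed[name], minimum) else "too old"
    print(f"{name}: {installed[name]} (needs {minimum}+) {status}")
```

Comparing numeric tuples avoids the classic string-comparison mistake where "3.4.10" sorts before "3.4.6".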

3.2 Installing pip and kazoo

Run the following on the main server (the machine that distributes DolphinScheduler):

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
sudo python get-pip.py
pip --version
pip install kazoo

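3.3 Creating the Deployment User

Create the deployment user on every deployment server and grant it sudo privileges (the worker service runs jobs with sudo -u {linux-user}).

# Create the deployment user (remove any earlier copy of it first)
userdel -r dscheduler 
useradd dscheduler && echo dscheduler | passwd --stdin dscheduler
# Grant sudo privileges
chmod 640 /etc/sudoers
vim /etc/sudoers
# Around line 100, below the root entry, add:
dscheduler  ALL=(ALL)       NOPASSWD: ALL
# Also comment out the "Defaults requiretty" line if it is present; skip this step if it is not
#Defaults requiretty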

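3.4 Passwordless SSH for the Deployment User

How dolphinscheduler's one-click deployment works: the configuration files are edited on the main machine (the one where the packages are downloaded), the backend package is sent to each machine with scp, and the services are started on the deployment machines over ssh. The deployment user (dscheduler) on the main machine therefore needs passwordless SSH access to the deployment user (dscheduler) on every server.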

su - dscheduler 
ssh-keygen -t rsa
cd ~/.ssh && cp id_rsa.pub authorized_keys
chmod 700 authorized_keys
#ssh-copy-id hostname
ssh-copy-id localhost

3.5 Downloading the dolphinscheduler Packages

Run the following on the main server:

# Create the installation directory
#sudo mkdir /u01/dolphinscheduler && sudo chown -R dscheduler:dscheduler /u01/dolphinscheduler && sudo ln -s /u01/dolphinscheduler /opt/dolphinscheduler
sudo mkdir /opt/dolphinscheduler && sudo chown -R dscheduler:dscheduler /opt/dolphinscheduler

# Download the backend package (dolphinscheduler-backend)
wget http://mirror.bit.edu.cn/apache/incubator/dolphinscheduler/1.2.0/apache-dolphinscheduler-incubating-1.2.0-dolphinscheduler-backend-bin.tar.gz -P /opt/dolphinscheduler
# Download the front-end package (dolphinscheduler-ui)
wget http://mirror.bit.edu.cn/apache/incubator/dolphinscheduler/1.2.0/apache-dolphinscheduler-incubating-1.2.0-dolphinscheduler-front-bin.tar.gz -P /opt/dolphinscheduler

4. Software Deployment

4.1 Creating the MySQL Database for dolphinscheduler

CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dscheduler'@'10.10.7.%' IDENTIFIED BY 'Ds@12345';
#GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dscheduler'@'10.158.1.%' IDENTIFIED BY 'Ds@12345';
#drop user dscheduler@'%';
flush privileges;

4.2 Extracting the dolphinscheduler Packages

4.2.1 dolphinscheduler-backend

cd /opt/dolphinscheduler && tar -zxf apache-dolphinscheduler-incubating-1.2.0-dolphinscheduler-backend-bin.tar.gz
ln -s apache-dolphinscheduler-incubating-1.2.0-dolphinscheduler-backend-bin dolphinscheduler-backend

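# Directory layout
cd dolphinscheduler-backend && tree -L 1
.
├── bin            # basic service startup scripts
├── conf           # project configuration files
├── DISCLAIMER-WIP # DISCLAIMER file
├── install.sh     # one-click deployment script
├── lib            # project dependency jars, including module jars and third-party jars
├── LICENSE        # LICENSE file
├── licenses       # runtime licenses
├── NOTICE         # NOTICE file
├── script         # scripts for cluster start/stop and for starting/stopping service monitoring
└── sql            # SQL files the project depends on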

4.2.2 dolphinscheduler-ui

cd /opt/dolphinscheduler && tar -zxf apache-dolphinscheduler-incubating-1.2.0-dolphinscheduler-front-bin.tar.gz
ln -s apache-dolphinscheduler-incubating-1.2.0-dolphinscheduler-front-bin dolphinscheduler-front

4.3 Deploying dolphinscheduler-backend

4.3.1 Database Configuration

1. Edit the configuration file

vim /opt/dolphinscheduler/dolphinscheduler-backend/conf/application-dao.properties

 # postgre
 #spring.datasource.driver-class-name=org.postgresql.Driver
 #spring.datasource.url=jdbc:postgresql://192.168.xx.xx:5432/dolphinscheduler
 # mysql
 spring.datasource.driver-class-name=com.mysql.jdbc.Driver
 spring.datasource.url=jdbc:mysql://10.10.7.209:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
 spring.datasource.username=dscheduler
 spring.datasource.password=Ds@12345

2. Add the MySQL driver

 cp /usr/share/java/mysql-connector-java.jar /opt/dolphinscheduler/dolphinscheduler-backend/lib
 or
 cd /opt/dolphinscheduler && wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
 tar zxvf mysql-connector-java-5.1.46.tar.gz
 cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /opt/dolphinscheduler/dolphinscheduler-backend/lib

4.3.2 Initializing the Database

sh /opt/dolphinscheduler/dolphinscheduler-backend/script/create-dolphinscheduler.sh
# "create dolphinscheduler success" means the database was initialized successfully

4.3.3 Editing the Environment Variables

vim /opt/dolphinscheduler/dolphinscheduler-backend/conf/env/.dolphinscheduler_env.sh

# ==========
# CDH distribution
# ==========
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop
export SPARK_HOME1=/opt/cloudera/parcels/CDH/lib/spark
export SPARK_HOME2=/opt/cloudera/parcels/CDH/lib/spark
export PYTHON_HOME=/usr/bin/python
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export FLINK_HOME=/opt/soft/flink
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$PATH

4.3.4 Editing the Cluster Deployment Configuration

cp /opt/dolphinscheduler/dolphinscheduler-backend/install.sh /opt/dolphinscheduler/dolphinscheduler-backend/install.sh_b
vim /opt/dolphinscheduler/dolphinscheduler-backend/install.sh

# Note: only the core parameters are shown below, not the full contents of install.sh
......................................................
source ${workDir}/conf/config/run_config.conf
source ${workDir}/conf/config/install_config.conf

# 1. Database configuration
# ${installPath}/conf/quartz.properties
#dbtype="postgresql"
dbtype="mysql"
dbhost="10.10.7.209"
dbname="dolphinscheduler"
username="dscheduler"
# Note: if the password contains special characters, escape them with a backslash (\)
passowrd="Ds@12345"

# 2. Cluster deployment environment
# ${installPath}/conf/config/install_config.conf
installPath="/opt/dolphinscheduler/dolphinscheduler-agent"
# deployment user
# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
deployUser="dscheduler"
# zk cluster
zkQuorum="test01:2181,test02:2181,test03:2181"
# install hosts
ips="test01,test02,test03"

# 3. Services per node
# ${installPath}/conf/config/run_config.conf
# run master machine
masters="test02,test03"
# run worker machine
workers="test01,test02,test03"
# run alert machine
alertServer="test03"
# run api machine
apiServers="test03"


# 4. Alert configuration
# ${installPath}/conf/alert.properties
# If your company has not enabled SSL for mail, you can set: mailServerPort="25" ; starttlsEnable="false" ; sslEnable="false"
# mail protocol
mailProtocol="SMTP"
# mail server host
mailServerHost="smtp.sohh.cn"
# mail server port
mailServerPort="465"
# sender
mailSender="dashuju@sohh.cn"
# user
mailUser="dashuju@sohh.cn"
# sender password
mailPassword="dashuju@123"
# TLS mail protocol support
starttlsEnable="false"
sslTrust="*"
# SSL mail protocol support
# note: The SSL protocol is enabled by default. 
# only one of TLS and SSL can be in the true state.
sslEnable="true"
# download excel path
xlsFilePath="/tmp/xls"
# Enterprise WeChat Enterprise ID Configuration
enterpriseWechatCorpId="xxxxxxxxxx"
# Enterprise WeChat application Secret configuration
enterpriseWechatSecret="xxxxxxxxxx"
# Enterprise WeChat Application AgentId Configuration
enterpriseWechatAgentId="xxxxxxxxxx"
# Enterprise WeChat user configuration, multiple users to , split
enterpriseWechatUsers="xxxxx,xxxxx"
# alert port
alertPort=7789

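# 5. Enable the monitoring self-start script
# Controls whether the self-start script runs (it monitors master and worker state and restarts them automatically if they go down)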
# whether to start monitoring self-starting scripts
monitorServerState="true"

# 6. Resource center configuration
# in ${installPath}/conf/common/
# resource center upload storage type: HDFS, S3, NONE
resUploadStartupType="HDFS"
# if resUploadStartupType is HDFS, set defaultFS to the namenode address; for HA, also put core-site.xml and hdfs-site.xml in the conf directory.
# if S3, set the S3 address, for example: s3a://dolphinscheduler
# Note: for S3, be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://stcluster:8020"

# if S3 is configured, the following configuration is required.
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"

# resourcemanager HA configuration, if it is a single resourcemanager, here is yarnHaIps=""
yarnHaIps="test03,test02"
# if it is a single resourcemanager, you only need to configure one host name. If it is resourcemanager HA, the default configuration is fine.
singleYarnIp="ark1"

# hdfs root path, the owner of the root path must be the deployment user. 
# versions prior to 1.1.0 do not automatically create the hdfs root directory, you need to create it yourself.
hdfsPath="/dolphinscheduler"
# have users who create directory permissions under hdfs root path /
# Note: if kerberos is enabled, hdfsRootUser="" can be used directly.
hdfsRootUser="hdfs"

# 7. Common configuration
# in ${installPath}/conf/common/common.properties
# common config
# Program root path
programPath="/tmp/dolphinscheduler"
# download path
downloadPath="/tmp/dolphinscheduler/download"
# task execute path
execPath="/tmp/dolphinscheduler/exec"
# SHELL environmental variable path
shellEnvPath="$installPath/conf/env/.dolphinscheduler_env.sh"
# suffix of the resource file
resSuffixs="txt,log,sh,conf,cfg,py,java,sql,hql,xml"
# development status, if true, for the SHELL script, you can view the encapsulated SHELL script in the execPath directory. 
# If it is false, execute the direct delete
devState="true"
# kerberos config
# kerberos whether to start
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"

# 8. ZooKeeper configuration
# ${installPath}/conf/zookeeper.properties
# zk config
# zk root directory
zkRoot="/dolphinscheduler"
# used to record the zk directory of the hanging machine
zkDeadServers="$zkRoot/dead-servers"
# masters directory
zkMasters="$zkRoot/masters"
# workers directory
zkWorkers="$zkRoot/workers"
# zk master distributed lock
mastersLock="$zkRoot/lock/masters"
# zk worker distributed lock
workersLock="$zkRoot/lock/workers"
# zk master fault-tolerant distributed lock
mastersFailover="$zkRoot/lock/failover/masters"
# zk worker fault-tolerant distributed lock
workersFailover="$zkRoot/lock/failover/workers"
# zk master start fault tolerant distributed lock
mastersStartupFailover="$zkRoot/lock/failover/startup-masters"
# zk session timeout
zkSessionTimeout="300"
# zk connection timeout
zkConnectionTimeout="300"
# zk retry interval
zkRetrySleep="100"
# zk retry maximum number of times
zkRetryMaxtime="5"

# 9. master config
# ${installPath}/conf/master.properties
# master execution thread maximum number, maximum parallelism of process instance
masterExecThreads="100"
# the maximum number of master task execution threads, the maximum degree of parallelism for each process instance
masterExecTaskNum="20"
# master heartbeat interval
masterHeartbeatInterval="10"
# master task submission retries
masterTaskCommitRetryTimes="5"
# master task submission retry interval
masterTaskCommitInterval="100"
# master maximum cpu average load, used to determine whether the master has execution capability
masterMaxCpuLoadAvg="10"
# master reserve memory to determine if the master has execution capability
masterReservedMemory="1"
# master port
masterPort=5566

# 10. worker config
# ${installPath}/conf/worker.properties
# worker execution thread
workerExecThreads="100"
# worker heartbeat interval
workerHeartbeatInterval="10"
# worker number of fetch tasks
workerFetchTaskNum="3"
# worker reserved memory, used to determine whether the worker has execution capability
workerReservedMemory="1"
# worker port
workerPort=7788

# 11. api config
# ${installPath}/conf/application.properties
# api server port
apiServerPort="12345"
# api session timeout
apiServerSessionTimeout="7200"
# api server context path
apiServerContextPath="/dolphinscheduler/"
# spring max file size
springMaxFileSize="1024MB"
# spring max request size
springMaxRequestSize="1024MB"
# api max http post size
apiMaxHttpPostSize="5000000"

# 1,replace file
echo "1,replace file"
......................................................
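
The host lists in this configuration need to agree with one another: a host named in masters, workers, alertServer, or apiServers should presumably also appear in ips, since ips is the list of install hosts. The sketch below checks that. It is not part of DolphinScheduler; the helper names are invented, and the parser only understands simple `key="value"` lines like the ones shown above.

```python
import re

# Matches simple shell assignments such as: masters="test02,test03"
ASSIGNMENT = re.compile(r'^\s*(\w+)="([^"]*)"\s*$')


def parse_assignments(text):
    """Collect key="value" lines; comments and other lines are ignored."""
    values = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            values[match.group(1)] = match.group(2)
    return values


def split_hosts(value):
    """Split a comma-separated host list, dropping empty entries."""
    return [host.strip() for host in value.split(",") if host.strip()]


def unknown_hosts(config):
    """Return {role: [hosts]} for role hosts that are missing from ips."""
    known = set(split_hosts(config.get("ips", "")))
    problems = {}
    for role in ("masters", "workers", "alertServer", "apiServers"):
        extra = [h for h in split_hosts(config.get(role, "")) if h not in known]
        if extra:
            problems[role] = extra
    return problems


# The role settings from the excerpt above.
sample = '''
ips="test01,test02,test03"
masters="test02,test03"
workers="test01,test02,test03"
alertServer="test03"
apiServers="test03"
'''
print(unknown_hosts(parse_assignments(sample)))
```

An empty result means every role host is covered by ips. To check a real file, read install.sh into a string and pass it to `parse_assignments`.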

4.3.5 Adding the Hadoop Configuration Files

# If resUploadStartupType in install.sh is HDFS and HDFS is configured for HA, copy the Hadoop configuration files into the conf directory
cp /etc/hadoop/conf.cloudera.yarn/hdfs-site.xml /opt/dolphinscheduler/dolphinscheduler-backend/conf/
cp /etc/hadoop/conf.cloudera.yarn/core-site.xml /opt/dolphinscheduler/dolphinscheduler-backend/conf/

# If the Hadoop configuration changes later, copy the files into $installPath/conf and restart the api-server service
#cp /etc/hadoop/conf.cloudera.yarn/hdfs-site.xml /opt/dolphinscheduler/dolphinscheduler-agent/conf/
#cp /etc/hadoop/conf.cloudera.yarn/core-site.xml /opt/dolphinscheduler/dolphinscheduler-agent/conf/
#sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh stop api-server
#sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh start api-server

4.3.6 One-Click Deployment

Run the script to deploy and start the services

sh /opt/dolphinscheduler/dolphinscheduler-backend/install.sh

Check the logs

tree /opt/dolphinscheduler/dolphinscheduler-agent/logs
-------------------------------------------------
/opt/dolphinscheduler/dolphinscheduler-agent/logs
├── dolphinscheduler-alert.log
├── dolphinscheduler-alert-server-node-b.test.com.out
├── dolphinscheduler-alert-server.pid
├── dolphinscheduler-api-server-node-b.test.com.out
├── dolphinscheduler-api-server.log
├── dolphinscheduler-api-server.pid
├── dolphinscheduler-logger-server-node-b.test.com.out
├── dolphinscheduler-logger-server.pid
├── dolphinscheduler-master.log
├── dolphinscheduler-master-server-node-b.test.com.out
├── dolphinscheduler-master-server.pid
├── dolphinscheduler-worker.log
├── dolphinscheduler-worker-server-node-b.test.com.out
├── dolphinscheduler-worker-server.pid
└── {processDefinitionId}
    └── {processInstanceId}
        └── {taskInstanceId}.log

Check the Java processes

jps
8138 MasterServer              # master service
8165 WorkerServer              # worker service
8206 LoggerServer              # logger service
8240 AlertServer               # alert service
8274 ApiApplicationServer      # API service
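
Checking the jps listing by eye gets tedious across several nodes. The sketch below parses jps-style output and reports which expected services are absent. The function name is hypothetical, the expected set simply mirrors the listing above, and a node that runs only some of the services would need a smaller set.

```python
# Service names as they appear in the jps listing above.
EXPECTED = (
    "MasterServer",
    "WorkerServer",
    "LoggerServer",
    "AlertServer",
    "ApiApplicationServer",
)


def missing_services(jps_output, expected=EXPECTED):
    """Return the expected service names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        fields = line.split()
        # jps prints "<pid> <main class name>"; skip blank or malformed lines.
        if len(fields) >= 2 and fields[0].isdigit():
            # Compare case-insensitively to tolerate differences in capitalization.
            running.add(fields[1].lower())
    return [name for name in expected if name.lower() not in running]


# Hypothetical output from a node whose worker failed to start.
sample_output = """\
8138 MasterServer
8206 LoggerServer
8240 AlertServer
8274 ApiApplicationServer
9001 Jps
"""
print(missing_services(sample_output))
```

On a real node the input could come from `subprocess.run(["jps"], capture_output=True, text=True).stdout`.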

Worker fails to start

less /opt/dolphinscheduler/dolphinscheduler-agent/logs/dolphinscheduler-worker-server-test01.out
#nohup: failed to run command "/bin/java": No such file or directory

# Fix: create a symlink for java
cd /usr/bin/ && sudo ln -s /usr/java/jdk1.8.0_181-cloudera/bin/java /usr/bin/java

4.3.7 Commands

# One-click deployment (stops the services, redistributes the package, and starts the services)
sh /opt/dolphinscheduler/dolphinscheduler-backend/install.sh

# Start or stop all services in the cluster
sh /opt/dolphinscheduler/dolphinscheduler-backend/bin/start-all.sh
sh /opt/dolphinscheduler/dolphinscheduler-backend/bin/stop-all.sh
or
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/start-all.sh
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/stop-all.sh

# Start or stop Master
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh start master-server
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh stop master-server

# Start or stop Worker
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh start worker-server
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh stop worker-server

# Start or stop Api
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh start api-server
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh stop api-server

# Start or stop Logger
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh start logger-server
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh stop logger-server

# Start or stop Alert
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh start alert-server
sh /opt/dolphinscheduler/dolphinscheduler-agent/bin/dolphinscheduler-daemon.sh stop alert-server

4.3.8 Database Upgrade (omitted)

# Database upgrade was added in version 1.0.2; running the command below upgrades the database automatically
sh /opt/dolphinscheduler/dolphinscheduler-agent/script/upgrade_dolphinscheduler.sh

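4.4 Deploying dolphinscheduler-ui

4.4.1 About the dolphinscheduler-ui Deployment

Deploy the UI service on the server that runs ApiApplicationServer.
The front end can be deployed automatically or manually:

The automatic deployment script installs Nginx with yum and, after the guided setup, writes the Nginx configuration file to
/etc/nginx/conf.d/dolphinscheduler.conf
If Nginx is already installed locally, deploy manually and create the Nginx configuration file
/etc/nginx/conf.d/dolphinscheduler.conf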

4.4.2 Automatic Deployment

sudo sh /opt/dolphinscheduler/dolphinscheduler-front/install-dolphinscheduler-ui.sh

············
Enter the nginx proxy port (default 8888 if left blank): 8886
Enter the api server proxy IP (required, e.g. 192.168.xx.xx): 10.10.7.209
Enter the api server proxy port (default 12345 if left blank): 12345
=================================================
1. Install on CentOS6
2. Install on CentOS7
3. Install on Ubuntu
4. Exit
=================================================
Enter the installation number (1|2|3|4): 2
············
Complete!
port option is needed for add
FirewallD is not running
setenforce: SELinux is disabled
Open in a browser: http://10.10.7.209:8886

4.4.3 Manual Deployment

vim /etc/nginx/conf.d/dolphinscheduler.conf

    server {
        listen       8886;# access port
        server_name  localhost;
        #charset koi8-r;
        #access_log  /var/log/nginx/host.access.log  main;
        location / {
        root   /opt/dolphinscheduler/dolphinscheduler-front/dist; # static file directory
        index  index.html index.htm;
        }
        location /dolphinscheduler {
        proxy_pass http://10.10.7.209:12345; # interface address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header x_real_ipP $remote_addr;
        proxy_set_header remote_addr $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_http_version 1.1;
        proxy_connect_timeout 300s;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection upgrade;
        }
        #error_page  404              /404.html;
        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
        root   /usr/share/nginx/html;
        }
    }

4.4.4 Raising the Upload File Size Limit

sudo vim /etc/nginx/nginx.conf

# Add inside the http block
client_max_body_size 1024m;

Restart the nginx service

systemctl restart nginx

4.4.5 First Login to dolphinscheduler

Open http://10.10.7.209:8886
Initial user: admin
Initial password: dolphinscheduler123
Note: if the URL returns 404, delete the file /etc/nginx/conf.d/default.conf

4.4.6 Nginx Notes

4.4.6.1 Installing Nginx on CentOS7

rpm -Uvh http://nginx.org/packages/centos/7/noarch/RPMS/nginx-release-centos-7-0.el7.ngx.noarch.rpm
yum install nginx
systemctl start nginx.service

4.4.6.2 Nginx Commands

# Start
systemctl start nginx
# Restart
systemctl restart nginx
# Status
systemctl status nginx
# Stop
systemctl stop nginx

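5. Usage and Testing

5.1 Security Center (Security)

5.1.1 Queue Management (Queue manage)

Description: a queue is used when running programs such as Spark or MapReduce that need a "queue" parameter (a queue cannot be deleted once created).
See: Appendix, Queue Management

Example:

Security Center -> Queue Management -> Create Queue
------------------------------------------------------
Name: queue_test
Queue value: queue_test
------------------------------------------------------
Submit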

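5.1.2 Tenant Management (Tenant Manage)

Description:
A tenant corresponds to a Linux user and is the user the worker submits jobs as.
If the user does not exist on Linux, the worker creates it when executing the script.
Tenant code:
The tenant code is the Linux user name; it must be unique.
A new tenant gets a tenant directory on HDFS under $hdfsPath ("/dolphinscheduler"), which holds the files and UDF functions uploaded by that tenant.
Tenant name:
An alias for the tenant code.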

Sharing is welcome; when reposting, please credit the source: 内存溢出 (outofmemory.cn)

Original article: http://outofmemory.cn/zaji/5619082.html
