- 将源系统mysql表数据全量抽取到hive中作为ODS层,按天分区保留每天的数据
create table T_YYBZB_TGH_BANKINFO ( id int(8), bank_id int(8), bank_name varchar(200), source_date varchar(200) ); insert into T_YYBZB_TGH_BANKINFO (ID, BANK_ID, BANK_NAME)values (11, 11, '工商银行(广州)','20210101');1、创建hive目标表分区 1.1、编写hive分区表
create table ods.ods_t_yybzb_tgh_bankinfo_di ( id int, bank_id int, bank_name string ) partitioned by (`pt` string) row format delimited fields terminated by ',';1.2、配置SQL组件创建表
- 配置hive数据源,配置相关建表脚本
- 第一步:上线部署
- 第二步:运行调试
alter table ods.ods_t_yybzb_tgh_bankinfo_di drop if exists partition(p_dt=${pt}) alter table ods.ods_t_yybzb_tgh_bankinfo_di add if not exists partition (pt=${pt})2.2、配置SQL组件
(1)数据源:ODS
(2)sql类型:非查询
- 删除分区
- 创建分区
(3)参数传递:使用全局时间参数$[yyyyMMdd-1]
3、DataX导入数据 3.1、编写DataX相关Json脚本{ "job": { "content": [ { "reader": { "name": "mysqlreader", "parameter": { "connection": [ { "jdbcUrl": [ "jdbc:mysql://${ip}:${port}/${sid}?useSSL=false"], "querySql": ["select id,bank_id,bank_name from T_YYBZB_TGH_BANKINFO where pt=${p_dt}"], } ], "password": "${password}", "username": "${username}" } }, "writer": { "name": "hdfswriter", "parameter": { "defaultFS": "hdfs://${hdfs_ip}:${hdfs_port}", "fileType": "text", "path": "/user/hive/warehouse/ods.db/ods_t_yybzb_tgh_bankinfo_di/pt=${p_dt}", "fileName": "ods_t_yybzb_tgh_bankinfo_di", "column": [ {"name":"id","type":"int"}, {"name":"bank_id","type":"int"}, {"name":"bank_name","type":"string"} ], "writeMode": "append", "fieldDelimiter": "t", "encoding": "utf-8" } } }], "setting": { "speed": { "channel": "1" } } } }3.2、DataX组件配置
(1)自定义模板
(2)参数配置:使用全局参数$[yyyyMMdd-1]
- 工作流中增加定时,设置为每天6:00整执行
- 先上线部署,之后设置定时任务
- 运行结果通过工作流实例–>工作流名称查看
(1)问题现状
ERROR 1366 (HY000): Incorrect string value: 'xE5xB7xA5xE5x95x86...' for column 'bank_name' at row 1
(2)问题原因
- 字段的字符集lain
(3)解决方案
ALTER TABLE T_YYBZB_TGH_BANKINFO CHANGE bank_name bank_name TEXT CHARACTER SET utf8;5.2、dolphinscheduler时间参数相关
前N天:$[yyyyMMdd+N] 后N天:$[yyyyMMdd-N] 前N月:$[yyyyMM-1] 后N月:$[yyyyMM+1] 前N周:$[yyyyMMdd+7N] 后N周:$[yyyyMMdd-7N]
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)