1. Installing Sqoop

Sqoop (SQL-to-Hadoop) is a data migration tool between relational databases and Hadoop. Each job is compiled into a MapReduce program that runs map tasks only (there is no reduce phase), with InputFormat and OutputFormat classes handling the two ends of the transfer.
- Download and extract the installation package
- Edit the configuration file
sqoop-env.sh
```sh
# Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/opt/module/hadoop-2.7.1

# Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/opt/module/hadoop-2.7.1

# Set the path to where bin/hbase is available
export HBASE_HOME=/opt/module/hbase-1.3.1

# Set the path to where bin/hive is available
export HIVE_HOME=/opt/module/hive-3.1.2

# Set the path for where the zookeeper config dir is
export ZOOCFGDIR=/opt/module/zookeeper-3.5.7
```

Note: the variable must be spelled `HBASE_HOME` (all caps), or Sqoop will not pick it up.
- Copy the JDBC driver jar into Sqoop's lib directory
- Start and test
```sh
bin/sqoop help

# connect to MySQL
bin/sqoop list-databases \
  --connect jdbc:mysql://hadoop102:3306 \
  --username root \
  --password 000000
```

2. Using Sqoop

2.1 Import

2.1.1 MySQL -> HDFS
- Full import
```sh
bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/datax \
  --username root \
  --password 000000 \
  --table test1 \
  --target-dir /sqoop/import/test1 \
  --delete-target-dir \
  --as-parquetfile \
  -m 1
```
- Query (free-form SQL) import
```sh
bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/datax \
  --username root \
  --password 000000 \
  --target-dir /sqoop/import/test2 \
  --delete-target-dir \
  -m 1 \
  -e 'select id,name from test1 where id > 1001 and $CONDITIONS' \
  --fields-terminated-by ','
```

The literal `$CONDITIONS` token is mandatory in a free-form query: Sqoop substitutes each map task's split predicate for it at run time. Single-quote the query so the shell does not expand `$CONDITIONS` itself.
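With more than one map task, a free-form query additionally requires `--split-by`, so Sqoop knows which column to carve into per-task ranges behind `$CONDITIONS`. A sketch, assuming the same `test1` table with a numeric `id` column:

```sh
bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/datax \
  --username root \
  --password 000000 \
  --target-dir /sqoop/import/test2 \
  --delete-target-dir \
  -m 4 \
  --split-by id \
  -e 'select id,name from test1 where id > 1001 and $CONDITIONS' \
  --fields-terminated-by ','
# each of the 4 map tasks runs the query with $CONDITIONS bound to its own id range
```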
- Incremental import

Point `--where` at a column that records each day's updates (e.g. a last-modified timestamp) and only the new rows are pulled. Sqoop also has a built-in incremental mode (`--incremental append`/`lastmodified` with `--check-column` and `--last-value`) that tracks the boundary for you.
```sh
bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/datax \
  --username root \
  --password 000000 \
  --table test1 \
  --target-dir /sqoop/import/test3 \
  --delete-target-dir \
  --as-parquetfile \
  -m 1 \
  --where 'id=1001'
```

2.1.2 MySQL -> Hive
```sh
# the Hive target table is created automatically
bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/datax \
  --username root \
  --password 000000 \
  --table test1 \
  -m 1 \
  --hive-import \
  --hive-overwrite \
  --create-hive-table \
  --hive-table test.hive_test1
```

2.1.3 MySQL -> HBase
```sh
# Sqoop 1.4.6 is incompatible with HBase 1.3.1 here and cannot create the
# table automatically, so create hbase_test1 (column family info) in HBase first
bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/datax \
  --username root \
  --password 000000 \
  --table test1 \
  -m 1 \
  --hbase-table hbase_test1 \
  --column-family info \
  --hbase-create-table \
  --hbase-row-key id
```

2.2 Export

2.2.1 HDFS -> MySQL
```sh
bin/sqoop export \
  --connect jdbc:mysql://hadoop102:3306/datax \
  --username root \
  --password 000000 \
  --table test1 \
  -m 1 \
  --export-dir /sqoop/import/test2 \
  --input-fields-terminated-by ','
```

3. Sqoop Shell
mysql2hdfs.opt
```
import
--connect
jdbc:mysql://hadoop102:3306/datax
--username
root
--password
000000
--table
test1
--target-dir
/sqoop/import/test4
--delete-target-dir
--as-parquetfile
-m
1
```

Note: in an options file each option and each value must sit on its own line; Sqoop will not parse a single space-separated command line from the file.
```sh
bin/sqoop --options-file opt/mysql2hdfs.opt
```
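Options files also compose with extra command-line arguments, so shared connection settings can live in one reusable file. A sketch, where `opt/conn.opt` is a hypothetical file holding only `import`, `--connect`, `--username`, and `--password` (one token per line):

```sh
# arguments after --options-file are appended to those read from the file
bin/sqoop --options-file opt/conn.opt \
  --table test1 \
  --target-dir /sqoop/import/test5 \
  --delete-target-dir
```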