How to add a new database connection type in Kettle


Create a file repository: click the menu Tools -> Repository -> Connect to Repository.

A file repository does not require a user name or password. If no repository exists yet, click the "+" in the upper-right corner to create one.

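The first option creates a database repository; this example uses a file repository instead. After you confirm the choice, you are asked to select the path of the file repository and to give the repository an ID and a name.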

Download pdi-ce-4.4.0-stable.zip, extract it to a folder, and open Spoon.bat in the data-integration directory.


After the welcome screen, the Repository Connection window appears. Choose to create a new repository; the "Repository Information" window then opens.

In the "Repository Information" window, choose to create a new database connection; the "Database Connection" window pops up.

Enter the Connection Name, Host Name, Database Name, Port Number, User Name, and Password to establish the connection. When that is done, log in as the admin user in the Repository Connection window.

Create a Transformation named cscgTransTest. From "Core Objects", drag two "Table input" steps and one "Insert / Update" step into cscgTransTest and connect them.

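In cscgTransTest, create a new database connection named ttt. The Table input step "max_createtime" reads, from the target database ttt, the creation time of the newest data in a given table: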

SELECT max(trunc(createtime)) FROM umdata.toeventmedia

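In cscgTransTest, create another database connection named testdb. The query result of the Table input step "max_createtime" replaces the variable (the ? placeholder) in the Table input step "umdata.toeventmedia", whose SQL statement fetches from testdb the data that must be inserted into or updated in the ttt database: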

SELECT * FROM umdata.toeventmedia where trunc(createtime) >= trunc(?)
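The two queries above can be tried outside Kettle with a small sketch. It is only an illustration under assumptions: two in-memory SQLite databases stand in for ttt and testdb, the table is a hypothetical toeventmedia with made-up columns id and createtime, and SQLite's date() replaces the trunc() function used in the original SQL.

```python
import sqlite3

def make_db(rows):
    """Create an in-memory database holding a hypothetical toeventmedia table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE toeventmedia (id INTEGER PRIMARY KEY, createtime TEXT)"
    )
    conn.executemany("INSERT INTO toeventmedia VALUES (?, ?)", rows)
    return conn

# Stand-in for the target database ttt (already holds older rows).
target = make_db([
    (1, "2014-05-01 08:00:00"),
    (2, "2014-05-02 09:30:00"),
])

# Stand-in for the source database testdb (holds newer rows as well).
source = make_db([
    (1, "2014-05-01 08:00:00"),
    (2, "2014-05-02 09:30:00"),
    (3, "2014-05-02 17:45:00"),
    (4, "2014-05-03 10:15:00"),
])

# Step 1: what the "max_createtime" Table input does against the target.
(max_day,) = target.execute(
    "SELECT max(date(createtime)) FROM toeventmedia"
).fetchone()

# Step 2: the value from step 1 fills the ? placeholder of the second query.
changed = source.execute(
    "SELECT id, createtime FROM toeventmedia "
    "WHERE date(createtime) >= date(?) ORDER BY id",
    (max_day,),
).fetchall()
```

Because the comparison uses >= on the truncated day, rows from the newest day already present in the target are fetched again, which is why the downstream step has to be an insert-or-update rather than a plain insert.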

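In the "Insert / Update" step, select the database connection, target schema, target table, and so on. The fields under "The key(s) to look up the value(s)" are used to check whether a record already exists in the target table. If it does not exist, the record is inserted. If it does exist, the remaining fields are compared with the field values in the stream: if they are the same, nothing is done; if they differ, the fields listed under "Update fields" are updated.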

The fields listed as lookup keys are the table's primary key, so they uniquely identify a record.
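The decision logic described above can be written down as a short sketch. It is not Kettle's implementation: the target table is modelled as a plain dict, and the function name insert_update and the field names id and title are made up for illustration.

```python
def insert_update(target, row, key_fields, update_fields):
    """Apply one stream row to `target`, a dict keyed by the tuple of key values.

    Returns "insert", "update", or "skip" to show which branch was taken.
    """
    key = tuple(row[f] for f in key_fields)
    existing = target.get(key)
    if existing is None:
        # No record with this key yet: insert the whole row.
        target[key] = dict(row)
        return "insert"
    if all(existing.get(f) == row[f] for f in update_fields):
        # Record exists and the compared fields are identical: do nothing.
        return "skip"
    # Record exists but differs: overwrite only the listed update fields.
    for f in update_fields:
        existing[f] = row[f]
    return "update"

table = {}
actions = [
    insert_update(table, {"id": 1, "title": "a"}, ["id"], ["title"]),
    insert_update(table, {"id": 1, "title": "a"}, ["id"], ["title"]),
    insert_update(table, {"id": 1, "title": "b"}, ["id"], ["title"]),
]
```

The three calls exercise the three branches in order: the first row is new, the second is identical to the stored one, and the third differs in an update field.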

Build a set of transformation steps following the pattern above for each table.

Create a Job named "cscgJobTest". From Core Objects, drag "START" and "Transformation" into cscgJobTest and connect the two.

In START, tick "Repeat" and set the type to "No scheduling"; in Transformation, set the transformation name to the "cscgTransTest" created earlier.

Click "Run this Job" to run it and view the execution results of the Job and the Transformation.

Connecting to Hive:

On the server where Hive is installed, run: hive --service hiveserver (this starts the Thrift service).

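Open the connection configuration dialog in Kettle and enter the IP address of the Hive server, the Hive database you need, and the port number (the default Thrift port is 10000).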

Test the connection, and you are done.
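For orientation, the host, database, and port entered in the dialog correspond to the pieces of a Hive JDBC URL. The sketch below assumes the usual Apache Hive JDBC URL layout (jdbc:hive:// for the original HiveServer, jdbc:hive2:// for HiveServer2); the helper name hive_jdbc_url and the IP address are made up, and Kettle builds the actual URL itself.

```python
def hive_jdbc_url(host, database="default", port=10000, server2=False):
    """Compose a Hive JDBC URL from the values typed into the connection dialog."""
    scheme = "jdbc:hive2" if server2 else "jdbc:hive"
    return f"{scheme}://{host}:{port}/{database}"

# Hypothetical server address; 10000 is the default Thrift port mentioned above.
url_v1 = hive_jdbc_url("192.168.1.10")
url_v2 = hive_jdbc_url("192.168.1.10", database="umdata", server2=True)
```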

Connecting to Hive2:

The connection test may fail with the following error:


Error connecting to database [Hive] : org.pentaho.di.core.exception.KettleDatabaseException:

Error occured while trying to connect to the database

Error connecting to database: (using class org.apache.hadoop.hive.jdbc.HiveDriver)

Unable to load Hive Server 2 JDBC driver for the currently active Hadoop configuration

org.pentaho.di.core.exception.KettleDatabaseException:

Error occured while trying to connect to the database

Error connecting to database: (using class org.apache.hadoop.hive.jdbc.HiveDriver)

Unable to load Hive Server 2 JDBC driver for the currently active Hadoop configuration

at org.pentaho.di.core.database.Database.normalConnect(Database.java:428)

at org.pentaho.di.core.database.Database.connect(Database.java:361)

at org.pentaho.di.core.database.Database.connect(Database.java:314)

at org.pentaho.di.core.database.Database.connect(Database.java:302)

at org.pentaho.di.core.database.DatabaseFactory.getConnectionTestReport(DatabaseFactory.java:80)

at org.pentaho.di.core.database.DatabaseMeta.testConnection(DatabaseMeta.java:2685)

at org.pentaho.di.ui.core.database.dialog.DatabaseDialog.test(DatabaseDialog.java:109)

at org.pentaho.di.ui.core.database.wizard.CreateDatabaseWizardPage2.test(CreateDatabaseWizardPage2.java:157)

at org.pentaho.di.ui.core.database.wizard.CreateDatabaseWizardPage2$3.widgetSelected(CreateDatabaseWizardPage2.java:147)

at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)

at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)

at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)

at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)

at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)

at org.eclipse.jface.window.Window.runEventLoop(Window.java:820)

at org.eclipse.jface.window.Window.open(Window.java:796)

at org.pentaho.di.ui.core.database.wizard.CreateDatabaseWizard.createAndRunDatabaseWizard(CreateDatabaseWizard.java:111)

at org.pentaho.di.ui.spoon.Spoon.createDatabaseWizard(Spoon.java:7457)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

at java.lang.reflect.Method.invoke(Unknown Source)

at org.pentaho.ui.xul.impl.AbstractXulDomContainer.invoke(AbstractXulDomContainer.java:313)

at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:157)

at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke(AbstractXulComponent.java:141)

at org.pentaho.ui.xul.jface.tags.JfaceMenuitem.access$100(JfaceMenuitem.java:43)

at org.pentaho.ui.xul.jface.tags.JfaceMenuitem$1.run(JfaceMenuitem.java:106)

at org.eclipse.jface.action.Action.runWithEvent(Action.java:498)

at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection(ActionContributionItem.java:545)

at org.eclipse.jface.action.ActionContributionItem.access$2(ActionContributionItem.java:490)

at org.eclipse.jface.action.ActionContributionItem$5.handleEvent(ActionContributionItem.java:402)

at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)

at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)

at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)

at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)

at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1297)

at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7801)

at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9130)

at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:638)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

at java.lang.reflect.Method.invoke(Unknown Source)

at org.pentaho.commons.launcher.Launcher.main(Launcher.java:151)

Caused by: org.pentaho.di.core.exception.KettleDatabaseException:

Error connecting to database: (using class org.apache.hadoop.hive.jdbc.HiveDriver)

Unable to load Hive Server 2 JDBC driver for the currently active Hadoop configuration

at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:573)

at org.pentaho.di.core.database.Database.normalConnect(Database.java:410)

... 43 more

Caused by: java.sql.SQLException: Unable to load Hive Server 2 JDBC driver for the currently active Hadoop configuration

at org.apache.hive.jdbc.HiveDriver.getActiveDriver(HiveDriver.java:107)

at org.apache.hive.jdbc.HiveDriver.callWithActiveDriver(HiveDriver.java:121)

at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:132)

at java.sql.DriverManager.getConnection(Unknown Source)

at java.sql.DriverManager.getConnection(Unknown Source)

at org.pentaho.di.core.database.Database.connectUsingClass(Database.java:555)

... 44 more

Caused by: java.lang.reflect.InvocationTargetException

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)

at java.lang.reflect.Method.invoke(Unknown Source)

at org.apache.hive.jdbc.HiveDriver.getActiveDriver(HiveDriver.java:105)

... 49 more

Caused by: java.lang.RuntimeException: Unable to load JDBC driver of type: hive2

at org.pentaho.hadoop.shim.common.CommonHadoopShim.getJdbcDriver(CommonHadoopShim.java:108)

... 54 more

Caused by: java.lang.Exception: JDBC driver of type 'hive2' not supported

at org.pentaho.hadoop.shim.common.CommonHadoopShim.getJdbcDriver(CommonHadoopShim.java:104)

... 54 more

The error above can be resolved as follows:

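1. Locate the file %KETTLE_HOME%/plugins/pentaho-big-data-plugin/plugin.properties.

2. In plugin.properties, change the value: active.hadoop.configuration=hdp13

3. Restart Kettle after the change.

4. Once this is configured, you can connect to the corresponding database.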
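The edit in step 2 can also be scripted. The following is a minimal sketch that works on the text of a properties file; the helper name set_property and the sample file content are made up for illustration, and it handles only simple key=value lines.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    An existing, uncommented assignment is replaced in place; otherwise the
    assignment is appended at the end.
    """
    out, found = [], False
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("#")
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"

# Hypothetical file content, not copied from a real plugin.properties.
sample = (
    "# Hadoop configuration to use\n"
    "active.hadoop.configuration=hadoop-20\n"
    "other.setting=1\n"
)
updated = set_property(sample, "active.hadoop.configuration", "hdp13")
```

To apply it to the real file, read plugin.properties into a string, pass it through set_property, write the result back, and then restart Kettle as in step 3.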

If you want to use hadoop-20 instead, add the following jar files:

hadoop-core-1.2.1.jar

hive-common-0.13.0.jar

hive-jdbc-0.13.0.jar

hive-service-0.13.0.jar

libthrift-0.9.1.jar

slf4j-api-1.7.5.jar

httpclient-4.2.5.jar

httpcore-4.2.5.jar
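A quick way to confirm that every jar in the list is present is to compare the list with the contents of the directory you copied the files into. The sketch below assumes nothing about where that directory is: lib_dir is a parameter you supply, the function name missing_jars is made up, and the demonstration uses a temporary directory with empty placeholder files.

```python
import os
import tempfile

REQUIRED_JARS = [
    "hadoop-core-1.2.1.jar",
    "hive-common-0.13.0.jar",
    "hive-jdbc-0.13.0.jar",
    "hive-service-0.13.0.jar",
    "libthrift-0.9.1.jar",
    "slf4j-api-1.7.5.jar",
    "httpclient-4.2.5.jar",
    "httpcore-4.2.5.jar",
]

def missing_jars(lib_dir, required=REQUIRED_JARS):
    """Return the required jar names that are not found in lib_dir."""
    present = set(os.listdir(lib_dir))
    return [name for name in required if name not in present]

# Demonstration: create only the first six jars as empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in REQUIRED_JARS[:6]:
        open(os.path.join(demo_dir, name), "w").close()
    demo_missing = missing_jars(demo_dir)
```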


Sharing is welcome; when reposting, please credit the source: 内存溢出

Original article: https://outofmemory.cn/bake/8011341.html
