Flink on Yarn实战

Flink on Yarn实战,第1张

Flink on Yarn实战

  使用方式:flink的安装包解压之后,即可直接使用,而不需要额外的配置。 参考:flink部署说明文档 

 01 几点结论
  • 1.yarn-session的方式,只能在运行了 yarn-session.sh -d 的机器上,才能通过命令行提交flink作业,因为flink run的时候需要根据 /tmp/.yarn-properties-appuser 这个文件的内容找到session
    • 使用 echo "stop" | ./bin/yarn-session.sh -id application_1609324396857_95667 可以优雅的停掉session,并且删除/tmp/.yarn-properties-appuser
    • 如果是yarn application -kill application_1609324396857_95667 的话,那么/tmp/.yarn-properties-appuser 会保留
  • 2.如果要在其他机器也能提交作业,那么可以把/tmp/.yarn-properties-appuser这个文件拷贝一份该机器上
  • 3.当然,也可以在flink的界面上submit的方式提交。

02 使用yarn-session 
#  新版本的yarn会按需动态分配TaskManager和slot,其实-n -s参数已经失效
yarn-session.sh -d -jm 1024 -tm 1024 -nm flinktest 

使用 yarn-session.sh 命令在102的机器上启动之后的日志情况

查看102机器上  /tmp/.yarn-properties-appuser 文件的内容

 在101机器上,不存在 /tmp/.yarn-properties-appuser 文件,则提交任务的时候报错

[appuser@dxbigdata101 flink-1.12.0]$ cat  /tmp/.yarn-properties-appuser
cat: /tmp/.yarn-properties-appuser: No such file or directory
[appuser@dxbigdata101 flink-1.12.0]$ flink run ./examples/batch/WordCount.jar           
Setting Hbase_CONF_DIR=/etc/hbase/conf because no Hbase_CONF_DIR was set.
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:330)
        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
        at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
        at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:743)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:242)
        at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:971)
        at org.apache.flink.client.cli.CliFrontend.lambda$main(CliFrontend.java:1047)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
        at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1047)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
        at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:309)
        at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:988)
        at org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:124)
        at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:72)
        at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:875)
        at org.apache.flink.api.java.DataSet.collect(DataSet.java:413)
        at org.apache.flink.api.java.DataSet.print(DataSet.java:1652)
        at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:96)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:316)
        ... 11 more
Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:983)
        ... 22 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
        at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob(RestClusterClient.java:366)
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
        at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay(FutureUtils.java:375)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest(RestClient.java:342)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEventExecutor.java:989)
        at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap.run(ThreadExecutorMap.java:74)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
        at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay(FutureUtils.java:372)
        ... 21 more
Caused by: java.util.concurrent.CompletionException: org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8081
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:943)
        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
        ... 19 more
Caused by: org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8081
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEventExecutor.java:989)
        at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap.run(ThreadExecutorMap.java:74)
        at java.lang.Thread.run(Thread.java:748)

在机器103上,存在 /tmp/.yarn-properties-appuser 但是内容是已经被kill掉的application,提交任务也报错

03 使用yarn-cluster

直接使用命令提交即可,比如

flink run -m yarn-cluster -c cn.fancychuan.scala.quickstart.DataStreamWcApp /home/appuser/forlearn/flink/flink-1.0-SNAPSHOT.jar --host hadoop101 --port 7777

注意:相比于yarn-session模式,提价命令多了 -m yarn-cluster

 但是这种方式提交是不能通过web界面submit新的任务的

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/zaji/4966150.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-11-14
下一篇 2022-11-13

发表评论

登录后才能评论

评论列表(0条)

保存