Is the Spark driver a process? (Standalone mode vs. Yarn mode)
The short answer first:
1. Standalone mode: in client deploy mode, spark-submit starts a thread inside its own process and invokes the driver code's main method via reflection. In cluster deploy mode, a separate DriverWrapper process is launched to run the driver.
2. Yarn mode: in client deploy mode, likewise, spark-submit starts a thread in its own process and invokes the driver's main method via reflection. In cluster deploy mode, the driver's main method is invoked via reflection inside the ApplicationMaster process.
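The client-mode claim above can be sketched with plain Java reflection. This is a minimal, hypothetical illustration, not Spark's actual code: the class names here are made up, standing in for whatever you pass to --class (e.g. org.apache.spark.examples.SparkPi).

```java
import java.lang.reflect.Method;

public class Main {
    // Hypothetical stand-in for a user's driver class.
    public static class MyDriver {
        static boolean ran = false;
        public static void main(String[] args) {
            ran = true;
            System.out.println("driver main running, arg = " + args[0]);
        }
    }

    public static void main(String[] args) throws Exception {
        // Roughly what spark-submit does in client deploy mode: resolve the
        // user's class by name and call its static main(String[]) via
        // reflection, all inside the SparkSubmit JVM itself.
        Class<?> driverClass = Class.forName("Main$MyDriver");
        Method mainMethod = driverClass.getMethod("main", String[].class);
        mainMethod.invoke(null, (Object) new String[]{"1000"});
    }
}
```

Because the driver's main runs in the same JVM, jps shows no separate driver process in client mode, as the outputs below confirm.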
Standalone mode
Client mode:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://hadoop000:7077 --deploy-mode client ./examples/jars/spark-examples_2.11-2.4.2.jar 1000
Running processes (note there is no separate driver process; the driver runs inside SparkSubmit):
[hadoop@hadoop000 ~]$ jps
16610 CoarseGrainedExecutorBackend
15156 Worker
15062 Master
16551 SparkSubmit
16713 Jps
Cluster mode:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://hadoop000:7077 --deploy-mode cluster ./examples/jars/spark-examples_2.11-2.4.2.jar 1000
Processes started, at first:
[hadoop@hadoop000 ~]$ jps
16416 CoarseGrainedExecutorBackend
15156 Worker
16309 SparkSubmit
15062 Master
16348 DriverWrapper
16476 Jps
A few seconds later, SparkSubmit exits, and the shell shows no run logs:
[hadoop@hadoop000 ~]$ jps
16209 CoarseGrainedExecutorBackend
15156 Worker
16276 Jps
15062 Master
16141 DriverWrapper

Yarn mode
Client mode:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client ./examples/jars/spark-examples_2.11-2.4.3.jar 1000
Running processes (the ExecutorLauncher here is the Yarn ApplicationMaster used in client mode; the driver itself still runs inside SparkSubmit):
[hadoop@hadoop000 ~]$ jps
18740 ExecutorLauncher
16949 ResourceManager
17061 NodeManager
17813 SecondaryNameNode
18021 SparkSubmit
18917 CoarseGrainedExecutorBackend
17640 DataNode
17500 NameNode
18940 Jps
18846 CoarseGrainedExecutorBackend
Cluster mode:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster ./examples/jars/spark-examples_2.11-2.4.3.jar 1000
Running processes; the shell only shows final-status logs, not the job's output:
[hadoop@hadoop000 ~]$ jps
21041 Jps
16949 ResourceManager
17061 NodeManager
17813 SecondaryNameNode
17640 DataNode
20777 ApplicationMaster
20026 SparkSubmit
20923 CoarseGrainedExecutorBackend
17500 NameNode
21006 CoarseGrainedExecutorBackend
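Since the driver runs inside the ApplicationMaster in Yarn cluster mode, its stdout never reaches the submitting shell. One common way to retrieve it afterwards is the yarn logs CLI (this assumes log aggregation is enabled; the application id below is a placeholder):

```shell
# Fetch aggregated container logs for a finished application;
# the driver's output is in the ApplicationMaster's container log.
yarn logs -applicationId application_1234567890123_0001
```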
For source-level details, the following two blog posts explain it well:
【Spark】部署流程的深度了解
Spark源码 —— 从 SparkSubmit 到 Driver启动