[2019-01-03] Spark 2.1 using yarn-clust

2019-01-28 学师大术

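Background

Information

Version	Component	One-line description
2.1.0	spark	A Spark program that initializes its SparkSession inside a thread runs normally under yarn-client but fails under yarn-cluster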

Core code

import org.apache.spark.sql.SparkSession

/** Start a SparkSession inside a thread */
new Thread(new myJob).start()
/** myJob is a Runnable that creates the SparkSession */
class myJob extends Runnable {
  override def run(): Unit = {
    val sparkSession = SparkSession.builder().enableHiveSupport().getOrCreate()
    val txtrdd = sparkSession.sql("show tables")
    txtrdd.show()
    sparkSession.stop()
  }
}

Exception details
1. The AM log shows no exceptions

2019-01-28 10:32:41,500 | INFO  | [dispatcher-event-loop-4] | OutputCommitCoordinator stopped! | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2019-01-28 10:32:41,510 | INFO  | [Driver] | Successfully stopped SparkContext | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2019-01-28 10:32:41,512 | INFO  | [Driver] | Final app status: SUCCEEDED, exitCode: 0 | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2019-01-28 10:32:41,526 | INFO  | [pool-1-thread-1] | Unregistering ApplicationMaster with SUCCEEDED | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)

2. The application status is FAILED, and the error says __spark_conf__.zip cannot be found

Application application_1548509021440_0009 failed 2 times due to AM Container for appattempt_1548509021440_0009_000002 exited with exitCode: -1000
For more detailed output, check the application tracking page:https://172-16-56-32:26001/cluster/app/application_1548509021440_0009 Then click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://hacluster/user/zzltest/.sparkStaging/application_1548509021440_0009/__spark_conf__.zip
java.io.FileNotFoundException: File does not exist: hdfs://hacluster/user/zzltest/.sparkStaging/application_1548509021440_0009/__spark_conf__.zip
at org.apache.hadoop.hdfs.DistributedFileSystem$28.doCall(DistributedFileSystem.java:1529)
at org.apache.hadoop.hdfs.DistributedFileSystem$28.doCall(DistributedFileSystem.java:1521)

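Analysis

A version constraint? Under yarn-cluster, a SparkSession cannot be submitted from inside a thread.
Strictly speaking, initializing a SparkContext in a thread is not recommended anyway, since having multiple SparkContexts in one JVM is clearly unreasonable, even though we could try enabling the multi-context setting (spark.driver.allowMultipleContexts). Still, it is puzzling that yarn-client and yarn-cluster behave differently.
Anyway, let's analyze what causes the difference between the two.
The HDFS audit log shows that the __spark_conf__.zip package does exist.
Looking at the failed task, Spark had not even been initialized. We probably still need to look into the differences between yarn-client and yarn-cluster.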

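Root cause

The user initialized the SparkContext inside a thread and ran multiple threads, so several SparkContexts existed in one JVM, which caused the error above.
So it seems a SparkContext cannot be initialized in a thread, at least in the FI version. I'll find a CDH version later and check again.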
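
A minimal sketch of one way to act on the conclusion above, assuming the threaded jobs can share a single session: the SparkSession is built once on the main thread and handed to each worker, and the main thread joins the workers before stopping the session. SharedSessionJob, SharedSessionApp and runAll are hypothetical names that are not part of the original program, and this has not been verified on the FI cluster described here.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical job: receives an existing session instead of building its own.
class SharedSessionJob(spark: SparkSession, query: String) extends Runnable {
  override def run(): Unit = spark.sql(query).show()
}

object SharedSessionApp {
  // Runs each query on its own thread against the same session and
  // blocks until every thread has finished.
  def runAll(spark: SparkSession, queries: Seq[String]): Unit = {
    val threads = queries.map(q => new Thread(new SharedSessionJob(spark, q)))
    threads.foreach(_.start())
    threads.foreach(_.join())
  }

  def main(args: Array[String]): Unit = {
    // One SparkSession (and one SparkContext) for the JVM, created on the main thread.
    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
    try runAll(spark, Seq("show tables"))
    finally spark.stop() // stop only after all worker threads are done
  }
}
```

The join is there because, in yarn-cluster mode, the driver runs inside the ApplicationMaster and the application is treated as finished once main returns, so work still running on a background thread may not get to complete.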
