2021-03-27 Submitting Spark Jobs with SparkLauncher
JIUJIANGZ
In a recent interview I was asked how we submit jobs to the cluster, and at first I couldn't recall that our project uses SparkLauncher for submission. That piece was lifted directly from another project and I had never studied it myself (laziness doesn't pay o(╥﹏╥)o), so today I'm catching up on it.
The official documentation describes SparkLauncher like this:
"Use this class to start Spark applications programmatically. The class uses a builder pattern to allow clients to configure the Spark application and launch it as a child process."
In other words, the SparkLauncher class lets a client configure a Spark application programmatically and launch it as a child process.
Our project uses it in exactly this form:
SparkAppHandle handler = new SparkLauncher()
        .setAppName("CHECK_RUN_" + taskRunId)
        .setAppResource("aml-shade-1.0.0.jar")
        .setMainClass("com.check.CheckExecute")
        .setMaster("yarn")
        .setDeployMode("cluster")
        .addFile(jdbcProperty)
        .startApplication(new SparkAppHandle.Listener() {
            @Override
            public void stateChanged(SparkAppHandle handle) {
                // TODO: handle state transitions (e.g. react to final states)
                System.out.println("********** state changed **********");
            }

            @Override
            public void infoChanged(SparkAppHandle handle) {
                // TODO: handle info changes (e.g. the application ID becoming available)
                System.out.println("********** info changed **********");
            }
        });
As with spark-submit, various parameters can be set here. Some are required, such as setAppResource, setMainClass, setMaster, and setDeployMode. See the SparkLauncher page in the official documentation for the full list of parameters.
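For intuition, each SparkLauncher setter corresponds to a spark-submit flag. The correspondence can be sketched as plain string assembly (the jar, class, and app name below are the ones from the snippet above; the properties file name is a hypothetical stand-in, and this is only an illustration of the mapping, not how SparkLauncher builds its command internally):

```java
import java.util.ArrayList;
import java.util.List;

public class SubmitCommandSketch {
    // Build the spark-submit command line equivalent to the SparkLauncher calls above.
    static List<String> buildCommand() {
        List<String> cmd = new ArrayList<>();
        cmd.add("spark-submit");
        cmd.add("--master");      cmd.add("yarn");                    // setMaster("yarn")
        cmd.add("--deploy-mode"); cmd.add("cluster");                 // setDeployMode("cluster")
        cmd.add("--name");        cmd.add("CHECK_RUN_1");             // setAppName(...) (example id)
        cmd.add("--class");       cmd.add("com.check.CheckExecute");  // setMainClass(...)
        cmd.add("--files");       cmd.add("jdbc.properties");         // addFile(...) (hypothetical name)
        cmd.add("aml-shade-1.0.0.jar");                               // setAppResource(...)
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", buildCommand()));
    }
}
```

The practical difference is that SparkLauncher spawns this child process for you and hands back a SparkAppHandle, so you can observe and control the application from Java instead of parsing shell output.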
After launching, the handle monitors the state of the Spark process; the possible states are defined in the SparkAppHandle.State enum:
enum State {
    /** The application has not reported back yet. */
    UNKNOWN(false),
    /** The application has connected to the handle. */
    CONNECTED(false),
    /** The application has been submitted to the cluster. */
    SUBMITTED(false),
    /** The application is running. */
    RUNNING(false),
    /** The application finished with a successful status. */
    FINISHED(true),
    /** The application finished with a failed status. */
    FAILED(true),
    /** The application was killed. */
    KILLED(true),
    /** The Spark Submit JVM exited with an unknown status. */
    LOST(true);

    private final boolean isFinal;

    State(boolean isFinal) {
        this.isFinal = isFinal;
    }

    /** Whether this state is final, i.e. the application will not change state again. */
    public boolean isFinal() {
        return isFinal;
    }
}
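The boolean passed to each constant marks whether that state is final, and a common pattern is to wait (or count down a latch in stateChanged) until handle.getState().isFinal() returns true. A minimal stdlib-only mirror of that pattern, where AppState is a stand-in for the real SparkAppHandle.State rather than the Spark class itself:

```java
import java.util.EnumSet;

public class FinalStateSketch {
    // Stand-in for SparkAppHandle.State: the constructor flag marks final states.
    enum AppState {
        UNKNOWN(false), CONNECTED(false), SUBMITTED(false), RUNNING(false),
        FINISHED(true), FAILED(true), KILLED(true), LOST(true);

        private final boolean isFinal;
        AppState(boolean isFinal) { this.isFinal = isFinal; }
        boolean isFinal() { return isFinal; }
    }

    // Collect every state an application can end in.
    static EnumSet<AppState> finalStates() {
        EnumSet<AppState> set = EnumSet.noneOf(AppState.class);
        for (AppState s : AppState.values()) {
            if (s.isFinal()) set.add(s);
        }
        return set;
    }

    public static void main(String[] args) {
        // Prints the four terminal states in declaration order.
        System.out.println(finalStates());
    }
}
```

With the real API, the stateChanged callback shown earlier is the natural place to check isFinal() and release whatever is waiting on the job, e.g. by counting down a java.util.concurrent.CountDownLatch.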