Spark UDF throws "Task not serializable"
2020-06-15 · 南修子
20/06/08 16:41:06 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 327.2 KB, free 912.0 MB)
20/06/08 16:41:06 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 30.1 KB, free 912.0 MB)
20/06/08 16:41:06 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.42.76:35893 (size: 30.1 KB, free: 912.3 MB)
20/06/08 16:41:06 INFO spark.SparkContext: Created broadcast 0 from checkpoint at DataProcessingNew.java:323
20/06/08 16:41:07 INFO codegen.CodeGenerator: Code generated in 351.641059 ms
20/06/08 16:41:07 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Task not serializable
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:840)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:839)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.mapPartitionsWithIndex(RDD.scala:839)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:371)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
at org.apache.spark.sql.Dataset.checkpoint(Dataset.scala:512)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:646)
Caused by: java.io.NotSerializableException: javax.script.ScriptEngineManager
Serialization stack:
- object not serializable (class: javax.script.ScriptEngineManager, value: javax.script.ScriptEngineManager@78aa31f2)
- field (class: org.apache.spark.sql.UDFRegistration$$anonfun$register$26, name: f$21, type: interface org.apache.spark.sql.api.java.UDF2)
- object (class org.apache.spark.sql.UDFRegistration$$anonfun$register$26, <function1>)
- field (class: org.apache.spark.sql.UDFRegistration$$anonfun$register$26$$anonfun$apply$2, name: $outer, type: class org.apache.spark.sql.UDFRegistration$$anonfun$register$26)
- object (class org.apache.spark.sql.UDFRegistration$$anonfun$register$26$$anonfun$apply$2, <function2>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$3, name: func$3, type: interface scala.Function2)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$3, <function1>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF, name: f, type: interface scala.Function1)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF, UDF(input[2, double, true], 3*x+2))
- element of array (index: 0)
- array (class [Ljava.lang.Object;, size 2)
- field (class: org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, name: references$1, type: class [Ljava.lang.Object;)
- object (class org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, <function2>)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
... 26 more
20/06/08 16:41:07 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Task not serializable)
20/06/08 16:41:07 INFO spark.SparkContext: Invoking stop() from shutdown hook
The serialization stack above pinpoints the cause: the registered UDF2 captures a javax.script.ScriptEngineManager, which does not implement java.io.Serializable, so Spark's ClosureCleaner fails while serializing the task. The fix is to make the UDF class implement java.io.Serializable — and, since implementing Serializable on the class alone does not make its fields serializable, keep the ScriptEngineManager out of the serialized state, e.g. mark the field transient and create the engine lazily inside call() on the executor.
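A minimal sketch of that pattern, with only JDK classes. JsEvalUdf and its method names are hypothetical (the original UDF is not shown); in the real job the class would additionally implement org.apache.spark.sql.api.java.UDF2. The key points are implementing Serializable and holding the engine in a transient, lazily initialized field:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical UDF class illustrating the fix: implement Serializable and
// keep the non-serializable ScriptEngineManager behind a transient field
// that is re-created lazily after deserialization on the executor.
public class JsEvalUdf implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: skipped by Java serialization, rebuilt on first use
    private transient ScriptEngine engine;

    private ScriptEngine engine() {
        if (engine == null) {
            // Requires a JavaScript engine on the classpath
            // (Nashorn ships with JDK 8-14; newer JDKs need e.g. GraalJS).
            engine = new ScriptEngineManager().getEngineByName("JavaScript");
        }
        return engine;
    }

    // Mirrors UDF2<Double, String, Double>#call; the real class would also
    // declare "implements org.apache.spark.sql.api.java.UDF2<...>".
    public Double call(Double x, String expr) throws ScriptException {
        engine().put("x", x);
        return ((Number) engine().eval(expr)).doubleValue();
    }

    public static void main(String[] args) throws Exception {
        // Round-trip through Java serialization, mimicking what Spark's
        // ClosureCleaner.ensureSerializable does before shipping the task.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(new JsEvalUdf());
        Object copy = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();
        System.out.println("serializable: " + (copy instanceof JsEvalUdf)); // prints "serializable: true"
    }
}
```

On the driver, the instance would then be registered the usual way, e.g. spark.udf().register("jsEval", udf, DataTypes.DoubleType); because the engine field is transient, nothing non-serializable travels with the task anymore.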