SparkSQL job fails when reading a Parquet table

2020-06-19  liuzx32

Cluster memory: 1024 GB (data volume: 400 GB)

Error message:

Job aborted due to stage failure: Serialized task 2231:2304 was 637417604 bytes, which exceeds max allowed: spark.rpc.message.maxSize (134217728 bytes). Consider increasing spark.rpc.message.maxSize or using broadcast variables for large values.

Cause:

The driver sent a serialized task larger than Spark's default RPC transfer limit: the task was 637417604 bytes (about 608 MB), while spark.rpc.message.maxSize defaults to 128 MB (134217728 bytes).
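
In this post the oversized task came from a Parquet scan, but the mechanism is easiest to see with sc.parallelize: each task carries its own slice of a driver-side collection inside the serialized task. The error message itself suggests broadcast variables for large values. A minimal Scala sketch, with hypothetical sizes and names:

import org.apache.spark.sql.SparkSession

object OversizedTaskSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("oversized-task-sketch").getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical: a large collection built on the driver.
    val rows: Seq[String] = (0 until 10000000).map(i => s"row-$i")

    // parallelize embeds each partition's slice of `rows` into the serialized
    // task itself, so with few partitions a single task can exceed
    // spark.rpc.message.maxSize and abort the stage with the error above.
    sc.parallelize(rows, 4).count()

    // A broadcast variable ships the data once per executor instead of
    // inside every serialized task, keeping the tasks themselves small.
    val bcRows = sc.broadcast(rows)
    sc.parallelize(0 until rows.length, 4)
      .map(i => bcRows.value(i))
      .count()

    spark.stop()
  }
}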

Solution:

Raise the limit by adding the configuration spark.rpc.message.maxSize=1024 (the value is in MB, so this allows RPC messages up to 1 GB):

spark2-submit \
--class com.lhx.test \
--master yarn \
--deploy-mode cluster \
--conf spark.rpc.message.maxSize=1024 \
--driver-memory 30g \
--executor-memory 12g \
--num-executors 12 \
--executor-cores 3 \
--conf spark.yarn.driver.memoryOverhead=4096m \
--conf spark.yarn.executor.memoryOverhead=4096m \
./test.jar
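
If you cannot change the submit command, the same setting can also be applied when the SparkSession is built; note that it must be set before the SparkContext starts. A sketch, where the app name and table path are hypothetical:

import org.apache.spark.sql.SparkSession

object ReadParquet {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-parquet")                     // hypothetical app name
      .config("spark.rpc.message.maxSize", "1024") // MB; must be set before the context starts
      .getOrCreate()

    // Read the Parquet table as in the original job (path is hypothetical).
    val df = spark.read.parquet("/user/hive/warehouse/some_table")
    println(df.count())

    spark.stop()
  }
}

Setting it in code only takes effect if the session is created inside the application; a --conf flag on spark2-submit, as above, works in both client and cluster mode.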