【Hive问题 1】Hive 类型异常问题排查

2022-08-25  本文已影响0人  liuhensiyuhua

最近在执行Hive SQL的时候遇到一个类型转换异常的错误,无法将Text类型转换为Long。由于异常信息栈比较多,只贴出一些核心异常信息,具体如下:

Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableLongObjectInspector.get(WritableLongObjectInspector.java:36)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPLessThan.evaluate(GenericUDFOPLessThan.java:130)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:65)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFWhen.evaluate(GenericUDFWhen.java:93)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:65)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorHead._evaluate(ExprNodeEvaluatorHead.java:44)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:112)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)  

其中查询语句如下:

   A1 long,
   ...
},
B AS {
  B1,        //string类型
  B2,       //long类型
  ...
},
C AS {
  C1,       //string类型
  C2,      //long类型
  ...
}
SELECT
A1,
IF(duration/1000 > xxx,1,0) as valid,
...
FROM
(SELECT 
    B1 AS id,    //long类型
    B2 AS duration,  //long类型
    ...
  FROM
    B
 UNION 
 SELECT 
    C1 AS id,    //long类型
    C2 AS duration,  //long类型
    ...
  FROM
    C
) temp
LEFT JOIN
A   ON A.A1 = temp.id

实际上,其中有两个问题:
1.join的关键key类型不匹配,一个是long,一个是string
2.在进行数据计算的时候,会有long类型参与计算转换

优化结果:
1.数据类型一致之后进行join,任务的执行时间由1h减少到0.5h
2.long类型的转换报错,因为有null类型的数据,需要任务的执行从null类型转换变为long类型,就导致报异常错误。

结论:
一般需要检查字段的类型以及字段中的value是否有异常值,做比较充分的数据探查工作。

上一篇下一篇

猜你喜欢

热点阅读