2018-04-03

2018-04-03  瑞士神塔卡佩拉

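While writing HQL I ran into a problem with COUNT(DISTINCT), and later solved it by rewriting the query with a GROUP BY subquery and MAX(1).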

The code is as follows:

'''

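-- Count distinct devices per day without COUNT(DISTINCT):
-- the inner query deduplicates (device_id, dt) pairs via GROUP BY,
-- and the outer query counts the remaining rows per day.
SELECT
  day,
  COUNT(1) AS devices_count
FROM
  (
    SELECT
      UPPER(device_id) AS device_id,
      dt AS day,
      MAX(1)  -- placeholder aggregate; its value is never read
    FROM
      ks_device.device_new_extend_active_base_std_dt
    WHERE
      (dt BETWEEN '{start_day:%Y-%m-%d}' AND '{end_day:%Y-%m-%d}')
    GROUP BY
      UPPER(device_id),
      dt
  ) AS meow
GROUP BY
  day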

'''
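
The post does not say which COUNT(DISTINCT) problem it hit, so the sketch below only checks that the rewrite returns the same counts as the direct form. It is a minimal sketch under stated assumptions: SQLite (via Python's standard `sqlite3` module) stands in for Hive, and the table name `device_active` and the sample rows are hypothetical, not taken from the original data.

```python
import sqlite3

# Hypothetical sample rows (device_id, dt); SQLite stands in for Hive here.
rows = [
    ("a1", "2018-04-01"),
    ("A1", "2018-04-01"),  # same device as "a1" once upper-cased
    ("b2", "2018-04-01"),
    ("a1", "2018-04-02"),
    ("a1", "2018-04-02"),  # exact duplicate row
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE device_active (device_id TEXT, dt TEXT)")
conn.executemany("INSERT INTO device_active VALUES (?, ?)", rows)

# Direct form: COUNT(DISTINCT ...) per day.
count_distinct = conn.execute("""
    SELECT dt AS day, COUNT(DISTINCT UPPER(device_id)) AS devices_count
    FROM device_active
    GROUP BY dt
    ORDER BY dt
""").fetchall()

# Rewritten form: deduplicate with GROUP BY first, then count rows per day.
group_then_count = conn.execute("""
    SELECT day, COUNT(1) AS devices_count
    FROM (
        SELECT UPPER(device_id) AS device_id, dt AS day, MAX(1) AS dummy
        FROM device_active
        GROUP BY UPPER(device_id), dt
    ) AS meow
    GROUP BY day
    ORDER BY day
""").fetchall()

print(count_distinct)
print(group_then_count)
```

One caveat on equivalence: COUNT(DISTINCT x) skips NULL values, while the GROUP BY form keeps a NULL group and counts it as one row, so the two results can differ by one per day when `device_id` contains NULLs.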
