Hive/Sql

SQL项目实战专题二:提取会员主要的数据指标字段

2019-12-27  本文已影响0人  对三zzzzzzz

一、业务背景介绍:

1、数据源说明:

项目的实战背景是二家超市的真实数据,数据时间段是从 20160101 到 20171231。

2、商业场景:

二、思考以下问题:

1、 怎么基于会员的购买次数来确定会员的活跃度?
2、 怎么基于会员最后一单距离当前时间确定会员的活跃度?

三、需求:

四、逻辑分析:

1、 会员数据从哪些表取:

用到的SQL 知识点:

五、代码实现过程:

分开跑(性能更优):
1.从订单表Orderlist中取会员ID、首单时间、末单时间、末单距20171201的时间间隔、累计购买次数、累计购买金额.

SELECT MemberID
      ,MIN(SDate) AS first_order
      ,MAX(SDate) AS last_order
      ,TIMESTAMPDIFF(DAY,MAX(SDate),"20171201") AS day_interval
      ,COUNT(DISTINCT SheetID) AS count_sheet
      ,SUM(CashValue) AS sum_value
FROM OrderList
WHERE MemberID IS NOT NULL
GROUP BY MemberID;
1

2.从订单编辑表OrderItem中取累计购买商品数。因为OrderItem表没有会员ID,所以需要跟OrderList做内连接,关联字段为SheetID

SELECT ol.MemberID
      ,SUM(DISTINCT oi.GoodsID) AS SUM_goods
FROM OrderItem AS oi
INNER JOIN OrderList AS ol ON oi.SheetID = ol.SheetID
WHERE ol.MemberID IS NOT NULL
GROUP BY ol.MemberID;
2

3.从会员表MemberInfo取会员注册时间

SELECT MemberID
      ,res_time
FROM MemberInfo;
3

一起跑:

SELECT mi.MemberID AS 会员ID
      ,mi.res_time AS 注册时间
      ,MIN(ol.SDate) AS 首单时间
      ,MAX(ol.SDate) AS 最后购买时间
      ,DATEDIFF('20171201',MAX(ol.SDate)) AS 最后一单距离时间
      ,COUNT(DISTINCT ol.SheetID) AS 累计购买次数
      ,COUNT(oi.Qty) AS 累计购买商品数
      ,SUM(oi.SaleValue) AS 累计购买金额
FROM OrderList AS ol
INNER JOIN OrderItem AS oi ON ol.SheetID = oi.SheetID
INNER JOIN MemberInfo AS mi ON mi.MemberID = ol.MemberID
WHERE mi.MemberID IS NOT NULL
GROUP BY mi.MemberID
        ,mi.res_time;
查询结果
上一篇下一篇

猜你喜欢

热点阅读