大数据

构建并生成日期维度数据表

2020-07-02  本文已影响0人  lei_charles
  1. 基于 MySQL 生成日期维度数据
    1. 创建日期维度数据表
      DROP TABLE IF EXISTS `date_dim`;
      CREATE TABLE IF NOT EXISTS `date_dim` (
          `date_sk` BIGINT ( 20 ) NOT NULL AUTO_INCREMENT,
          `date` VARCHAR ( 255 ) DEFAULT NULL,
          `month` VARCHAR ( 255 ) DEFAULT NULL,
          `month_name` VARCHAR ( 255 ) DEFAULT NULL,
          `quarter` VARCHAR ( 255 ) DEFAULT NULL,
          `year` VARCHAR ( 255 ) DEFAULT NULL,
          PRIMARY KEY ( `date_sk` ) USING BTREE 
      ) ENGINE = INNODB DEFAULT CHARSET = utf8;
      
    2. 建立日期维度数据生成的存储过程
      -- 建立日期维度数据生成的存储过程
      delimiter //
      DROP PROCEDURE IF EXISTS pre_populate_date // 
      CREATE PROCEDURE pre_populate_date(IN start_dt date, IN end_dt date) 
      BEGIN
        WHILE start_dt <= end_dt DO
            INSERT INTO date_dim (date_sk, date, month, month_name, quarter, year)
            VALUES
            ( NULL, start_dt, MONTH(start_dt), monthname(start_dt), QUARTER(start_dt), YEAR (start_dt));
            SET start_dt = adddate( start_dt, 1 );
        END WHILE;
      COMMIT;
      END 
      // 
      delimiter;
      
    3. 生成日期维度数据
      -- 生成日期维度数据
      SET FOREIGN_KEY_CHECKS = 0;
      TRUNCATE TABLE date_dim;
      CALL pre_populate_date ( '2000-01-01', '2020-12-31' );
      SET FOREIGN_KEY_CHECKS = 1;
      
    4. 查看生成数据
      mariadb> select * from date_dim order by date_sk desc limit 10;
      +---------+------------+-------+------------+---------+------+
      | date_sk | date       | month | month_name | quarter | year |
      +---------+------------+-------+------------+---------+------+
      |    7671 | 2020-12-31 | 12    | December   | 4       | 2020 |
      |    7670 | 2020-12-30 | 12    | December   | 4       | 2020 |
      |    7669 | 2020-12-29 | 12    | December   | 4       | 2020 |
      |    7668 | 2020-12-28 | 12    | December   | 4       | 2020 |
      |    7667 | 2020-12-27 | 12    | December   | 4       | 2020 |
      |    7666 | 2020-12-26 | 12    | December   | 4       | 2020 |
      |    7665 | 2020-12-25 | 12    | December   | 4       | 2020 |
      |    7664 | 2020-12-24 | 12    | December   | 4       | 2020 |
      |    7663 | 2020-12-23 | 12    | December   | 4       | 2020 |
      |    7662 | 2020-12-22 | 12    | December   | 4       | 2020 |
      +---------+------------+-------+------------+---------+------+
      10 rows in set (0.05 sec)
      mariadb> 
      
  2. 基于 shell 脚本生成日期维度数据
    1. 创建 date_dim_generate.sh 脚本文件,内容如下:

      #!/bin/bash
      date1="$1"
      date2="$2"
      
      tempdate=`date -d "$date1" +%F`
      tempdateSec=`date -d "$date1" +%s`
      enddateSec=`date -d "$date2" +%s`
      min=1
      max=`expr \( $enddateSec - $tempdateSec \) / \( 24 \* 60 \* 60 \) + 1`
      cat /dev/null > ./date_dim.csv
      
      
      while [ $min -le $max ]
      do
        month=`date -d "$tempdate" +%m`
        month_name=`date -d "$tempdate" +%B`
        quarter=`echo $month | awk '{print int(($0-1)/3)+1}'`
        year=`date -d "$tempdate" +%Y`
        echo ${tempdate}","${month}","${month_name}","${quarter}","${year}>> ./date_dim.csv
        tempdate=`date -d "$min day $date1" +%F`
        tempdateSec=`date -d "$min day $date1" +%s`
        min=`expr $min + 1`
      done
      
    2. 执行脚本,生成数据

      [root@node01 ~]# bash date_dim_generate.sh 2000-01-01 2020-12-31
      [root@node01 ~]# cat date_dim.csv
      2000-01-01,01,January,1,2000
      2000-01-02,01,January,1,2000
      2000-01-03,01,January,1,2000
      2000-01-04,01,January,1,2000
      2000-01-05,01,January,1,2000
      ......
      2020-12-27,12,December,4,2020
      2020-12-28,12,December,4,2020
      2020-12-29,12,December,4,2020
      2020-12-30,12,December,4,2020
      2020-12-31,12,December,4,2020
      
上一篇下一篇

猜你喜欢

热点阅读