clickhouse-物化视图

2022-06-09  本文已影响0人  越狱的灵感

参考文档

https://clickhouse.tech/docs/en/sql-reference/statements/create/view/#

介绍

物化视图可以理解为一个预聚合触发器,数据在控制好触发的汇聚条件,几乎是实时的
物化视图会存储一份计算好的聚合数据,是一种空间换时间的绝妙方法,对集群的稳定性和很重要。


1.png

使用

物化视图的建立有两种方法
1,使用TO关键字(推荐使用),可以控制TTL,不能使用POPULATE
例:

--创建本地存储表,物化视图本地表和普通的建表没啥区别,inner_物化视图本地表前缀
create table IF NOT EXISTS demo_db.inner_demo_table on CLUSTER demo0201oddp_v1 (
    `day` date default toDate(now())
    ,`its` datetime default now()
    ,`channel` String
    ,`appName` String
    ,`stream` String
    ,`requestUrl` String
    ) ENGINE = ReplicatedReplacingMergeTree (
    '/clickhouse/tables/{cluster_demo0201oddp_v1}/{shard_demo0201oddp_v1}/inner_demo_table'
    ,'{replica_demo0201oddp_v1}'
    ) partition by toYYYYMMDD (its)
order by (channel,appName,stream)  TTL its + toIntervalDay(20) SETTINGS index_granularity = 8192,storage_policy = 'demo_default_policy'

--创建物化视图,mview_ 物化视图前缀,to 后面接本地表,as 后面添加计算逻辑
create MATERIALIZED view demo_db.mview_demo_table on CLUSTER demo0201oddp_v1 to demo_db.inner_demo_table as
select channel
    ,appName
    ,stream
    ,requestUrl
from demo_db.inner_demo_table
group by channel
    ,appName
    ,stream
    ,requestUrl

2,使用默认表
此方案建议是数据量小的表,因为无法控制TTL,后期数据运维不方便。默认存储表在clickhouse中是.inner_id.uuid值作为表名
例:

--物化视图原始数据表
create table IF NOT EXISTS demo_db.inner_demo_table ON CLUSTER demo0201oddp_v1 (
  `day` date default toDate(now()),
  `its` datetime default now(),
  `_cnt` UInt8 default 1,
  `channel` String,
  `eventTime` UInt32,
  `clientId` UInt32,
  `scheduleStr` String,
  `ispCode` UInt32,
  `hitFlag` Int16,
  `reqCnt` UInt32,
  `fileSize` UInt64,
  `clientIp` String,
  `guid` String,
  `status` String
) ENGINE = ReplicatedMergeTree (
  '/clickhouse/tables/{cluster1}/{shard}/inner_demo_table',
  '{replica}'
) partition by toYYYYMMDD (toDate(eventTime))
order by
  its TTL its + INTERVAL 1 hour SETTINGS index_granularity = 8192

--使用默认存储表
CREATE MATERIALIZED VIEW IF NOT EXISTS demo_db.mview_inner_demo_table_uv ON CLUSTER demo0201oddp_v1 ENGINE = ReplicatedAggregatingMergeTree(
  '/clickhouse/tables/{cluster1}/{shard}/mview_inner_demo_table_uv',
  '{replica}'
) PARTITION BY t_day
ORDER BY
  (channel) SETTINGS index_granularity = 8192,
  use_minimalistic_part_header_in_zookeeper = 1 AS
SELECT
  sumState(1) as uv,
  day as t_day,
  toStartOfHour(toDateTime(eventTime)) as __time,
  channel,
  clientIp
FROM
  demo_db.inner_demo_table
GROUP BY
  channel,
  clientIp,
  day,
  __time;

总结

1,物化视图是一种空间换时间的预聚合方式,聚合后的数据将存储在新表中,一般于SummingMergeTree,AggregatingMergeTree等聚合引擎一起使用。

2,物化视图因为是写入触发器,所以as select只对每批次的insert data有效果,所以即使是where条件也是对这批写入数据起效果(https://clickhouse.tech/docs/en/sql-reference/statements/create/view/#materialized

2.png
3,物化视图只有在原表insert的时候才会触发

4,POPULATE关键字,不建议使用,会把原始表中的已存在数据全部物化一遍,老数据的同步,建议直接insert到mv中

5,多表join生成物化视图,左表插入数据时才更新

6,源表数据的改变不会影响物化视图,如update, delete, drop partition

上一篇下一篇

猜你喜欢

热点阅读