clickhouse 21.x生产实践优化

2022-01-23  本文已影响0人  架构师老狼
clickhouse
1 时间字段类型
2 Nullable类型
3 分区与索引
4 数据TTL
5 写入与删除优化
6 谓词下推:(各个版本性能损失有差异)
7 Prewhere替代where
8 类关系型数据库要求
9 避免构建虚拟列
SELECT Income,Age,Income/Age as IncRate FROM datasets.hits_v1;
10 uniqCombined 替代distinct
反例:select count(distinct rand()) from hits_v1;

正例:SELECT uniqCombined(rand()) from datasets.hits_v1
11 用IN代替JOIN
12 大小表JOIN(mysql、oracle通用要求)
13 分布式表使用GLOBAL
14 数据一致性
explain pipeline select * from visits_v1 final WHERE StartDate = '2014-03-17' limit 100 settings max_final_threads = 2;

(Expression)

ExpressionTransform × 2

(SettingQuotaAndLimits)

(Limit)

Limit 2 → 2

(ReadFromMergeTree)

ExpressionTransform × 2

CollapsingSortedTransform × 2

Copy 1 → 2

AddingSelector

ExpressionTransform

MergeTree 0 → 1

从 CollapsingSortedTransform 这一步开始已经是多线程执行,但是读取 part 部分的动作还是串行
15 物化视图:
1)定义:
2)优缺点
3)物化视图实战
#建表语句

CREATE TABLE hits_test(

EventDate Date,

CounterID UInt32,

UserID UInt64,

URL String,

Income UInt8

)ENGINE = MergeTree()

PARTITION BY toYYYYMM(EventDate)

ORDER BY (CounterID, EventDate, intHash32(UserID))

SAMPLE BY intHash32(UserID)

SETTINGS index_granularity = 8192

#导入数据

INSERT INTO hits_test

SELECT EventDate,CounterID,UserID,URL,Income FROM hits_v1 limit 10000;

#创建物化视图
CREATE MATERIALIZED VIEW hits_mv

ENGINE=SummingMergeTree

PARTITION BY toYYYYMM(EventDate) ORDER BY (EventDate, intHash32(UserID))

AS SELECT UserID,EventDate,count(URL) as ClickCount,sum(Income) AS IncomeSum FROM hits_test

WHERE EventDate >= '2014-03-20' #设置更新点,该时间点之前的数据可以另外通过insert into select …… 的方式进行插入 

GROUP BY UserID,EventDate;

#或者可以用下列语法,表 A 可以是一张 mergetree 表

CREATE MATERIALIZED VIEW 物化视图名 TO 表 A AS SELECT FROM 表 B;

#不建议添加 populate 关键字进行全量更新

#导入增量数据

INSERT INTO hits_test

SELECT EventDate,CounterID,UserID,URL,Income FROM hits_v1 WHERE EventDate >= '2014-03-23' limit 10;

#查询物化视图

SELECT * FROM hits_mv;

#导入历史数据

INSERT INTO hits_mv

SELECT UserID,EventDate,count(URL) as ClickCount, sum(Income) AS IncomeSum FROM hits_test

WHERE EventDate = '2014-03-19' GROUP BY UserID,EventDate
上一篇 下一篇

猜你喜欢

热点阅读