Solving slow ClickHouse startup with a pluggable restart: notes from production

2021-03-05  霍光炫
Since 2019 our team has been exploring the ClickHouse ecosystem to build a data warehouse solution centered on the company's acquiring-bank settlement data.
Actual production brought plenty of problems, and the past two years of exploration were hard going, but we accumulated a number of ClickHouse operations techniques along the way.
Early in production, various issues caused quite a few OOM incidents on the 19.x series, and restarting ClickHouse took a very long time. After repeated discussions with the ClickHouse team, we worked out a pluggable restart strategy that improves business responsiveness and gives us a playbook for emergency fault handling.
The rest of this post shares it.

Let's take the 7 core tables of one of our production data warehouse test instances as the example:

Core table information
SELECT
    table,
    formatReadableSize(sum(bytes_on_disk)) AS table_size,
    sum(bytes_on_disk) AS table_size_bytes
FROM system.parts
WHERE database = 'explame'
GROUP BY
    database,
    table
ORDER BY table_size_bytes DESC
┌─table─┬─table_size─┬─table_size_bytes─┐
│ test2 │ 105.96 GiB │     113778726932 │
│ test3 │ 78.60 GiB  │      84391901327 │
│ test5 │ 67.66 GiB  │      72650507060 │
│ test6 │ 64.45 GiB  │      69200180213 │
│ test7 │ 53.96 GiB  │      57937061973 │
│ test4 │ 51.78 GiB  │      55601596221 │
│ test1 │ 314.14 MiB │        329401214 │
└───────┴────────────┴──────────────────┘
7 rows in set. Elapsed: 82.423 sec. Processed 87.03 thousand rows, 40.52 MB (1.06 thousand rows/s., 491.58 KB/s.)


# "items" corresponds to the counts in the ClickHouse restart log, e.g.
# explame.test1: Loaded data parts (142 items)
SELECT
    table,
    count(1) AS items
FROM system.parts
WHERE database = 'explame'
GROUP BY table
ORDER BY items DESC
┌─table─┬─items─┐
│ test5 │ 61645 │
│ test3 │ 15602 │
│ test4 │  8778 │
│ test2 │   335 │
│ test7 │   262 │
│ test6 │   262 │
│ test1 │   142 │
└───────┴───────┘
7 rows in set. Elapsed: 58.512 sec. Processed 87.03 thousand rows, 40.52 MB (1.49 thousand rows/s., 692.47 KB/s.)
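The two queries above can be combined into one pass to spot the tables most likely to slow a restart down. A minimal sketch, assuming clickhouse-client can reach the server with default credentials:

#!/usr/bin/env bash
# Sketch: size and part count per table in one query, biggest part counts first.
DB="${1:-explame}"

clickhouse-client --query "
SELECT
    table,
    formatReadableSize(sum(bytes_on_disk)) AS table_size,
    count() AS items
FROM system.parts
WHERE database = '${DB}'
GROUP BY table
ORDER BY items DESC
FORMAT PrettyCompact"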

Log of a normal instance restart

2021.03.05 05:58:36.527990 [ 58 ] {} <Information> Application: Received termination signal (Terminated)
2021.03.05 05:58:36.528735 [ 1 ] {} <Debug> Application: Received termination signal.
2021.03.05 05:58:36.528892 [ 1 ] {} <Debug> Application: Waiting for current connections to close.
2021.03.05 05:58:37.003978 [ 1 ] {} <Information> Application: Closed all listening sockets. Waiting for 1 outstanding connections.
2021.03.05 05:58:37.975104 [ 45 ] {} <Information> TCPHandler: Done processing connection.
2021.03.05 05:58:38.005287 [ 1 ] {} <Information> Application: Closed connections.
2021.03.05 05:58:38.008790 [ 1 ] {} <Information> Application: Shutting down storages.
2021.03.05 05:59:01.878235 [ 1 ] {} <Trace> BackgroundSchedulePool: Waiting for threads to finish.
2021.03.05 05:59:01.905738 [ 1 ] {} <Debug> Application: Shutted down storages.
2021.03.05 05:59:01.947070 [ 1 ] {} <Debug> Application: Destroyed global context.
2021.03.05 05:59:01.959692 [ 1 ] {} <Information> Application: shutting down
2021.03.05 05:59:01.959915 [ 1 ] {} <Debug> Application: Uninitializing subsystem: Logging Subsystem
2021.03.05 05:59:01.960094 [ 58 ] {} <Information> BaseDaemon: Stop SignalListener thread
2021.03.05 05:59:02.374089 [ 1 ] {} <Information> : Starting ClickHouse 19.16.3.6 with revision 54427
2021.03.05 05:59:02.377235 [ 1 ] {} <Information> Application: starting up
2021.03.05 05:59:02.384179 [ 1 ] {} <Debug> Application: rlimit on number of file descriptors is 500000
2021.03.05 05:59:02.384383 [ 1 ] {} <Debug> Application: Initializing DateLUT.
2021.03.05 05:59:02.384487 [ 1 ] {} <Trace> Application: Initialized DateLUT with time zone 'Etc/UTC'.
2021.03.05 05:59:02.393131 [ 1 ] {} <Debug> ConfigReloader: Loading config '/etc/clickhouse-server/users.xml'
2021.03.05 05:59:02.397167 [ 1 ] {} <Information> Application: Loading metadata from /var/lib/clickhouse/
2021.03.05 05:59:02.398544 [ 1 ] {} <Information> DatabaseOrdinary (system): Total 4 tables.
==========================================================
# Note: during the restart, each table's loading is handled by a single thread,
# so a large number of tables slows the restart down.
# See the BackgroundProcessingPool notes below.
==========================================================
2021.03.05 05:59:02.409504 [ 5 ] {} <Information> BackgroundProcessingPool: Create BackgroundProcessingPool with 16 threads 
2021.03.05 05:59:02.412927 [ 5 ] {} <Debug> system.trace_log: Loading data parts
2021.03.05 05:59:02.412930 [ 3 ] {} <Debug> system.query_thread_log: Loading data parts
2021.03.05 05:59:02.412936 [ 4 ] {} <Debug> system.query_log: Loading data parts
2021.03.05 05:59:02.412965 [ 2 ] {} <Debug> system.part_log: Loading data parts
2021.03.05 05:59:02.644429 [ 5 ] {} <Debug> system.trace_log: Loaded data parts (49 items)
2021.03.05 05:59:02.649192 [ 2 ] {} <Debug> system.part_log: Loaded data parts (41 items)
2021.03.05 05:59:02.662583 [ 3 ] {} <Debug> system.query_thread_log: Loaded data parts (50 items)
2021.03.05 05:59:02.689327 [ 4 ] {} <Debug> system.query_log: Loaded data parts (81 items)
2021.03.05 05:59:02.689932 [ 1 ] {} <Information> DatabaseOrdinary (system): Starting up tables.
2021.03.05 05:59:02.690159 [ 3 ] {} <Trace> system.query_log: Found 7 old parts to remove.
2021.03.05 05:59:02.690326 [ 3 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262499_3262620_27
2021.03.05 05:59:02.690165 [ 2 ] {} <Trace> system.query_thread_log: Found 8 old parts to remove.
2021.03.05 05:59:02.690606 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3254895_3255009_24
2021.03.05 05:59:02.692531 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3254895_3255014_25
2021.03.05 05:59:02.692532 [ 3 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262499_3262625_28
2021.03.05 05:59:02.694334 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255010_3255010_0
2021.03.05 05:59:02.694713 [ 3 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262621_3262621_0
2021.03.05 05:59:02.696246 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255011_3255011_0
2021.03.05 05:59:02.696882 [ 3 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262622_3262622_0
2021.03.05 05:59:02.698138 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255012_3255012_0
2021.03.05 05:59:02.699004 [ 3 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262623_3262623_0
2021.03.05 05:59:02.699945 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255013_3255013_0
2021.03.05 05:59:02.701170 [ 3 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262624_3262624_0
2021.03.05 05:59:02.701799 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255014_3255014_0
2021.03.05 05:59:02.703216 [ 3 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262625_3262625_0
2021.03.05 05:59:02.703601 [ 2 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255015_3255015_0
2021.03.05 05:59:02.711143 [ 1 ] {} <Information> DatabaseOrdinary (default): Total 1 tables.
2021.03.05 05:59:02.722371 [ 33 ] {} <Debug> default.users: Loading data parts
2021.03.05 05:59:02.722679 [ 33 ] {} <Debug> default.users: Loaded data parts (0 items)
2021.03.05 05:59:02.723084 [ 1 ] {} <Information> DatabaseOrdinary (default): Starting up tables.
2021.03.05 05:59:02.723511 [ 1 ] {} <Information> DatabaseOrdinary (explame): Total 7 tables.
2021.03.05 05:59:02.750408 [ 28 ] {} <Debug> explame.test4: Loading data parts
2021.03.05 05:59:02.753645 [ 27 ] {} <Debug> explame.test1: Loading data parts
2021.03.05 05:59:02.763932 [ 35 ] {} <Debug> explame.test2: Loading data parts
2021.03.05 05:59:02.763999 [ 36 ] {} <Debug> explame.test3: Loading data parts
2021.03.05 05:59:08.919059 [ 27 ] {} <Debug> explame.test1: Loaded data parts (142 items)
2021.03.05 05:59:08.920688 [ 27 ] {} <Information> DatabaseOrdinary (explame): 14.29%
2021.03.05 05:59:08.962784 [ 27 ] {} <Debug> explame.test5: Loading data parts
2021.03.05 05:59:14.567633 [ 13 ] {} <Trace> system.query_thread_log: Found 1 old parts to remove.
2021.03.05 05:59:14.567929 [ 13 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255016_3255016_0
2021.03.05 05:59:19.255907 [ 35 ] {} <Debug> explame.test2: Loaded data parts (335 items)
2021.03.05 05:59:19.257468 [ 35 ] {} <Information> DatabaseOrdinary (explame): 28.57%
2021.03.05 05:59:19.304837 [ 35 ] {} <Debug> explame.test6: Loading data parts
2021.03.05 05:59:19.319496 [ 17 ] {} <Trace> system.query_log: Found 1 old parts to remove.
2021.03.05 05:59:19.319721 [ 17 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262626_3262626_0
2021.03.05 05:59:26.965308 [ 15 ] {} <Trace> system.query_thread_log: Found 1 old parts to remove.
2021.03.05 05:59:26.965508 [ 15 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255017_3255017_0
2021.03.05 05:59:37.970838 [ 18 ] {} <Trace> system.query_log: Found 2 old parts to remove.
2021.03.05 05:59:37.971108 [ 18 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262627_3262627_0
2021.03.05 05:59:37.973184 [ 18 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262628_3262628_0
2021.03.05 05:59:41.105041 [ 13 ] {} <Trace> system.query_thread_log: Found 1 old parts to remove.
2021.03.05 05:59:41.105274 [ 13 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255018_3255018_0
2021.03.05 05:59:43.987722 [ 35 ] {} <Debug> explame.test6: Loaded data parts (262 items)
2021.03.05 05:59:44.152232 [ 35 ] {} <Information> DatabaseOrdinary (explame): 42.86%
2021.03.05 05:59:44.211789 [ 35 ] {} <Debug> explame.test7: Loading data parts
2021.03.05 05:59:55.902546 [ 11 ] {} <Trace> system.query_thread_log: Found 2 old parts to remove.
2021.03.05 05:59:55.902765 [ 11 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3254895_3255019_26
2021.03.05 05:59:55.904371 [ 11 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255019_3255019_0
2021.03.05 06:00:06.001824 [ 35 ] {} <Debug> explame.test7: Loaded data parts (262 items)
2021.03.05 06:00:06.063837 [ 35 ] {} <Information> DatabaseOrdinary (explame): 57.14%
2021.03.05 06:00:10.553769 [ 17 ] {} <Trace> system.query_log: Found 5 old parts to remove.
2021.03.05 06:00:10.553995 [ 17 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262499_3262630_29
2021.03.05 06:00:10.563454 [ 17 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262629_3262629_0
2021.03.05 06:00:10.565406 [ 17 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262630_3262630_0
2021.03.05 06:00:10.567264 [ 17 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262631_3262631_0
2021.03.05 06:00:10.569026 [ 17 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262632_3262632_0
2021.03.05 06:00:15.081807 [ 12 ] {} <Trace> system.query_thread_log: Found 1 old parts to remove.
2021.03.05 06:00:15.082044 [ 12 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255020_3255020_0
2021.03.05 06:00:37.052114 [ 16 ] {} <Trace> system.query_thread_log: Found 2 old parts to remove.
2021.03.05 06:00:37.052396 [ 16 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255021_3255021_0
2021.03.05 06:00:37.054428 [ 16 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255022_3255022_0
2021.03.05 06:00:53.335462 [ 6 ] {} <Trace> system.query_log: Found 4 old parts to remove.
2021.03.05 06:00:53.335705 [ 6 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262499_3262635_30
2021.03.05 06:00:53.338159 [ 6 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262633_3262633_0
2021.03.05 06:00:53.340670 [ 6 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262634_3262634_0
2021.03.05 06:00:53.342518 [ 6 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262635_3262635_0
2021.03.05 06:01:01.178652 [ 19 ] {} <Trace> system.query_thread_log: Found 2 old parts to remove.
2021.03.05 06:01:01.178876 [ 19 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255023_3255023_0
2021.03.05 06:01:01.180608 [ 19 ] {} <Debug> system.query_thread_log: Removing part from filesystem 202103_3255024_3255024_0
2021.03.05 06:01:56.265103 [ 12 ] {} <Trace> system.query_log: Found 5 old parts to remove.
2021.03.05 06:01:56.265357 [ 12 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262636_3262636_0
2021.03.05 06:01:56.267590 [ 12 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262637_3262637_0
2021.03.05 06:01:56.269829 [ 12 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262638_3262638_0
2021.03.05 06:01:56.272174 [ 12 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262639_3262639_0
2021.03.05 06:01:56.274483 [ 12 ] {} <Debug> system.query_log: Removing part from filesystem 202103_3262640_3262640_0
2021.03.05 06:04:01.619671 [ 28 ] {} <Debug> explame.test4: Loaded data parts (8778 items)
2021.03.05 06:04:02.806392 [ 28 ] {} <Information> DatabaseOrdinary (explame): 71.43%
2021.03.05 06:06:31.866639 [ 36 ] {} <Debug> explame.test3: Loaded data parts (15602 items)
2021.03.05 06:06:32.398669 [ 36 ] {} <Information> DatabaseOrdinary (explame): 85.71%
2021.03.05 06:14:07.041855 [ 27 ] {} <Debug> explame.test5: Loaded data parts (61645 items)
2021.03.05 06:14:07.632061 [ 27 ] {} <Information> DatabaseOrdinary (explame): 100.00%
2021.03.05 06:14:07.632386 [ 1 ] {} <Information> DatabaseOrdinary (explame): Starting up tables.
2021.03.05 06:14:08.006335 [ 1 ] {} <Information> DatabaseOrdinary (test): Total 14 tables.
2021.03.05 06:14:08.039868 [ 34 ] {} <Debug> test.UAct: Loading data parts
2021.03.05 06:14:08.041662 [ 33 ] {} <Debug> test.UAct_version: Loading data parts
2021.03.05 06:14:08.047447 [ 34 ] {} <Debug> test.UAct: Loaded data parts (1 items)
2021.03.05 06:14:08.048244 [ 33 ] {} <Debug> test.UAct_version: Loaded data parts (1 items)
2021.03.05 06:14:08.068010 [ 37 ] {} <Debug> test.test: Loading data parts
2021.03.05 06:14:08.077577 [ 33 ] {} <Debug> test.union_refund_all: Loading data parts
2021.03.05 06:14:08.090287 [ 37 ] {} <Debug> test.test: Loaded data parts (1 items)
2021.03.05 06:14:08.091553 [ 3 ] {} <Debug> test.transactions_union_refunds_2: Loading data parts
2021.03.05 06:14:08.095395 [ 37 ] {} <Debug> test.vm_data: Loading data parts
2021.03.05 06:14:08.095669 [ 37 ] {} <Debug> test.vm_data: Loaded data parts (0 items)
2021.03.05 06:14:08.100116 [ 34 ] {} <Debug> test.transactions_union_refunds: Loading data parts
2021.03.05 06:14:08.100309 [ 33 ] {} <Debug> test.union_refund_all: Loaded data parts (1 items)
2021.03.05 06:14:08.136316 [ 3 ] {} <Debug> test.transactions_union_refunds_2: Loaded data parts (1 items)
2021.03.05 06:14:08.146138 [ 34 ] {} <Debug> test.transactions_union_refunds: Loaded data parts (1 items)
2021.03.05 06:14:08.148928 [ 1 ] {} <Information> DatabaseOrdinary (test): Starting up tables.
2021.03.05 06:14:08.151170 [ 1 ] {} <Debug> Application: Loaded metadata.
2021.03.05 06:14:08.153358 [ 1 ] {} <Trace> TraceCollector: Pipe capacity is 1.00 MiB
2021.03.05 06:14:08.159158 [ 1 ] {} <Information> BackgroundSchedulePool: Create BackgroundSchedulePool with 16 threads
2021.03.05 06:14:08.163477 [ 1 ] {} <Information> Application: Listening http://0.0.0.0:8123
2021.03.05 06:14:08.163672 [ 1 ] {} <Information> Application: Listening for connections with native protocol (tcp): 0.0.0.0:9000
2021.03.05 06:14:08.163827 [ 1 ] {} <Information> Application: Listening for replica communication (interserver) http://0.0.0.0:9009
2021.03.05 06:14:08.166815 [ 1 ] {} <Information> Application: Available RAM: 31.42 GiB; physical cores: 4; logical cores: 8.
2021.03.05 06:14:08.166939 [ 1 ] {} <Information> Application: Ready for connections.
2021.03.05 06:14:10.176710 [ 3 ] {} <Debug> ConfigReloader: Loading config '/etc/clickhouse-server/config.xml'
2021.03.05 06:14:14.788072 [ 44 ] {} <Trace> TCPHandlerFactory: TCP Request. Address: 127.0.0.1:45592
2021.03.05 06:14:14.793586 [ 44 ] {} <Debug> TCPHandler: Connected ClickHouse datadog-clickhouse:2ec9766687936699 version 18.10.0

A quick look at the ClickHouse logs
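The per-table load cost can be read straight off the log above. A small sketch; the log path assumes a default Debian-style install:

# Sketch: pull the table-load timeline and progress lines out of the server log
LOG=/var/log/clickhouse-server/clickhouse-server.log
grep -E 'Loading data parts|Loaded data parts|DatabaseOrdinary' "$LOG"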

Analyzing "BackgroundProcessingPool: Create BackgroundProcessingPool with 16 threads"
SELECT *
FROM system.metrics
LIMIT 10

┌─metric─────────────────────┬─value─┬─description────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Query                      │     1 │ Number of executing queries                                                                                                                                                                    │
│ Merge                      │     0 │ Number of executing background merges                                                                                                                                                          │
│ PartMutation               │     0 │ Number of mutations (ALTER DELETE/UPDATE)                                                                                                                                                      │
│ ReplicatedFetch            │     0 │ Number of data parts being fetched from replica                                                                                                                                                │
│ ReplicatedSend             │     0 │ Number of data parts being sent to replicas                                                                                                                                                    │
│ ReplicatedChecks           │     0 │ Number of data parts checking for consistency                                                                                                                                                  │
# Here we can see that one of the main jobs of the threads created by
# BackgroundProcessingPool during restart is merges, which matches the
# "system.query_thread_log: Removing part from filesystem" lines in the restart log
│ BackgroundPoolTask         │     1 │ Number of active tasks in BackgroundProcessingPool (merges, mutations, fetches, or replication queue bookkeeping)                                                                              │
│ BackgroundSchedulePoolTask │     0 │ Number of active tasks in BackgroundSchedulePool. This pool is used for periodic ReplicatedMergeTree tasks, like cleaning old data parts, altering data parts, replica re-initialization, etc. │
│ DiskSpaceReservedForMerge  │     0 │ Disk space reserved for currently running background merges. It is slightly more than the total size of currently merging parts.                                                               │
│ DistributedSend            │     0 │ Number of connections to remote servers sending data that was INSERTed into Distributed tables. Both synchronous and asynchronous mode.                                                        │
└────────────────────────────┴───────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
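To watch this merge backlog drain after a restart, the same metrics can be polled in a loop. A minimal sketch, assuming a local server reachable with default credentials:

#!/usr/bin/env bash
# Sketch: poll the background-pool metrics every 5 seconds
while true; do
    clickhouse-client --query "
    SELECT metric, value
    FROM system.metrics
    WHERE metric IN ('Merge', 'BackgroundPoolTask', 'BackgroundSchedulePoolTask')
    FORMAT PrettyCompact"
    sleep 5
done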

How ClickHouse loads tables during restart
root@clickhouse:/var/lib/clickhouse/metadata/explame# cat test5.sql
ATTACH TABLE test5
(
    `line_items_id` Int32 DEFAULT CAST(0, 'Int32'),
    `id` Int32 DEFAULT line_items_id,
    `line_items_quantity` Float32 DEFAULT CAST(0, 'Float32'),
    `line_items_price` Float32 DEFAULT CAST(0, 'Float32'),
# The metadata shows that tables are loaded with an ATTACH TABLE action

# Details of the ATTACH TABLE action (this record appears to come from system.processes)
is_initial_query:     1
user:                 default
query_id:             c1e48bb1-1946-4657-80b5-768371aa9a40
address:              ::ffff:127.0.0.1
port:                 48676
initial_user:         default
initial_query_id:     c1e48bb1-1946-4657-80b5-768371aa9a40
initial_address:      ::ffff:127.0.0.1
initial_port:         48676
interface:            1
os_user:              jasonchoi1031
client_hostname:      clickhouse
client_name:          ClickHouse client
client_revision:      54427
client_version_major: 19
client_version_minor: 16
client_version_patch: 3
http_method:          0
http_user_agent:
quota_key:
elapsed:              2.165810118
is_cancelled:         0
read_rows:            0
read_bytes:           0
total_rows_approx:    0
written_rows:         0
written_bytes:        0
memory_usage:         0
peak_memory_usage:    0
query:                ATTACH TABLE explame.test2 # size: 105.96 GiB  parts:  335
thread_numbers:       [45]  # roughly speaking, attaching a table is handled by a single background thread
os_thread_ids:        [4705] 
ProfileEvents.Names:  ['Query','FileOpen','ReadBufferFromFileDescriptorRead','ReadBufferFromFileDescriptorReadBytes','ReadCompressedBytes','CompressedReadBufferBlocks','CompressedReadBufferBytes','IOBufferAllocs','IOBufferAllocBytes','DiskReadElapsedMicroseconds','ContextLock']
ProfileEvents.Values: [1,2,10,22376,36,1,10,3,1052791,2371,31]
Settings.Names:       ['max_threads','use_uncompressed_cache','load_balancing','log_queries','max_memory_usage','max_partitions_per_insert_block','compile']
Settings.Values:      ['64','0','random','1','60000000000','0','1']

clickhouse :) attach  table  explame.test2;
ATTACH TABLE explame.test2
Ok.
0 rows in set. Elapsed: 6.467 sec.

Ideas for making ClickHouse start fast
1. The BackgroundProcessingPool size is 16, and the number of tables affects startup time.
2. Don't let ClickHouse load large tables at startup; once the server is up, load them manually with ATTACH TABLE (see the sketch after this list).
3. Skip loading less important tables such as system.query_log and system.query_thread_log; these tables usually take heavy writes and pile up many parts waiting to be merged.
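A hedged sketch for idea 2: generate the DETACH TABLE statements for every table above a part-count threshold. The database name and threshold are illustrative values, not part of the original procedure:

#!/usr/bin/env bash
# Sketch: print a DETACH TABLE statement for every table with many parts
DB="explame"
THRESHOLD=1000

clickhouse-client --query "
SELECT concat('DETACH TABLE ', database, '.', table, ';')
FROM system.parts
WHERE database = '${DB}'
GROUP BY database, table
HAVING count() > ${THRESHOLD}
FORMAT TSVRaw"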

Pluggable ClickHouse restart in practice

To keep the data safe, first detach the tables; after that their parts show active = 0 in system.parts.
clickhouse :) detach table test1;
0 rows in set. Elapsed: 0.037 sec.
clickhouse :) detach table test2;
0 rows in set. Elapsed: 0.071 sec.
clickhouse :) detach table test3;
0 rows in set. Elapsed: 3.398 sec.
clickhouse :) detach table test5;
0 rows in set. Elapsed: 19.309 sec.
clickhouse :) detach table test4;
0 rows in set. Elapsed: 1.679 sec.
clickhouse :) detach table test7;
0 rows in set. Elapsed: 0.164 sec.
clickhouse :) detach table test6;
0 rows in set. Elapsed: 0.082 sec.
clickhouse :)
clickhouse :) show tables;
0 rows in set. Elapsed: 0.004 sec.
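Before touching any metadata files, it is worth confirming that nothing is still active. A one-liner sketch against system.parts:

# Sketch: confirm no parts in the database are still active after the detaches
clickhouse-client --query "
SELECT count() AS active_parts
FROM system.parts
WHERE database = 'explame' AND active"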
Move the ClickHouse metadata files aside so the server skips these tables at startup
root@clickhouse: /var/lib/clickhouse/metadata/explame# mv ./*.sql ~/clickhouse/metadata/explame/
root@clickhouse: /var/lib/clickhouse/metadata/explame# mv ../system/quer* ~/clickhouse/metadata/explame/
root@clickhouse: /var/lib/clickhouse/metadata/explame# ls -lt
root@clickhouse: /var/lib/clickhouse/metadata/explame# ls -lt ../system/query*
Restart the service with systemctl restart clickhouse-server, then check the listening ports:
State       Recv-Q Send-Q                                                                                     Local Address:Port                                                                                                    Peer Address:Port
LISTEN      0      20                                                                                             127.0.0.1:25                                                                                                                 *:*                   users:(("exim4",pid=1058,fd=3))
LISTEN      0      64                                                                                                     *:8123                                                                                                               *:*                   users:(("clickhouse-serv",pid=4662,fd=43))
LISTEN      0      128                                                                                            127.0.0.1:8126                                                                                                               *:*                   users:(("trace-agent",pid=761,fd=6))
LISTEN      0      64                                                                                                     *:9000                                                                                                               *:*                   users:(("clickhouse-serv",pid=4662,fd=44))
LISTEN      0      128                                                                                            127.0.0.1:5000                                                                                                               *:*                   users:(("agent",pid=759,fd=3))
LISTEN      0      128                                                                                            127.0.0.1:5001                                                                                                               *:*                   users:(("agent",pid=759,fd=7))
LISTEN      0      128                                                                                            127.0.0.1:6062                                                                                                               *:*                   users:(("process-agent",pid=763,fd=11))
LISTEN      0      64                                                                                                     *:9009                   

Restart log: ClickHouse loads neither the tables in the explame database nor system.query_log / system.query_thread_log, and starts very quickly.
2021.03.05 07:36:39.120474 [ 1 ] {} <Debug> Application: Waiting for current connections to close.
2021.03.05 07:36:39.436066 [ 1 ] {} <Information> Application: Closed all listening sockets.
2021.03.05 07:36:39.436114 [ 1 ] {} <Information> Application: Closed connections.
2021.03.05 07:36:39.437416 [ 1 ] {} <Information> Application: Shutting down storages.
2021.03.05 07:36:39.443807 [ 1 ] {} <Trace> BackgroundSchedulePool: Waiting for threads to finish.
2021.03.05 07:36:39.444073 [ 1 ] {} <Debug> Application: Shutted down storages.
2021.03.05 07:36:39.446395 [ 1 ] {} <Debug> Application: Destroyed global context.
2021.03.05 07:36:39.446734 [ 1 ] {} <Information> Application: shutting down
2021.03.05 07:36:39.446746 [ 1 ] {} <Debug> Application: Uninitializing subsystem: Logging Subsystem
2021.03.05 07:36:39.446822 [ 45 ] {} <Information> BaseDaemon: Stop SignalListener thread
2021.03.05 07:36:46.793074 [ 1 ] {} <Information> : Starting ClickHouse 19.16.3.6 with revision 54427
2021.03.05 07:36:46.795875 [ 1 ] {} <Information> Application: starting up
2021.03.05 07:36:46.799266 [ 1 ] {} <Debug> Application: rlimit on number of file descriptors is 500000
2021.03.05 07:36:46.799291 [ 1 ] {} <Debug> Application: Initializing DateLUT.
2021.03.05 07:36:46.799299 [ 1 ] {} <Trace> Application: Initialized DateLUT with time zone 'Etc/UTC'.
2021.03.05 07:36:46.804531 [ 1 ] {} <Debug> ConfigReloader: Loading config '/etc/clickhouse-server/users.xml'
2021.03.05 07:36:46.805895 [ 1 ] {} <Information> Application: Loading metadata from /var/lib/clickhouse/
2021.03.05 07:36:46.806071 [ 1 ] {} <Information> DatabaseOrdinary (system): Total 2 tables.
2021.03.05 07:36:46.811058 [ 2 ] {} <Information> BackgroundProcessingPool: Create BackgroundProcessingPool with 16 threads
2021.03.05 07:36:46.813421 [ 2 ] {} <Debug> system.part_log: Loading data parts
2021.03.05 07:36:46.813428 [ 3 ] {} <Debug> system.trace_log: Loading data parts
2021.03.05 07:36:46.830676 [ 3 ] {} <Debug> system.trace_log: Loaded data parts (47 items)
2021.03.05 07:36:46.837049 [ 2 ] {} <Debug> system.part_log: Loaded data parts (41 items)
2021.03.05 07:36:46.837591 [ 1 ] {} <Information> DatabaseOrdinary (system): Starting up tables.
2021.03.05 07:36:46.839092 [ 1 ] {} <Information> DatabaseOrdinary (default): Total 1 tables.
2021.03.05 07:36:46.846700 [ 24 ] {} <Debug> default.users: Loading data parts
2021.03.05 07:36:46.846932 [ 24 ] {} <Debug> default.users: Loaded data parts (0 items)
2021.03.05 07:36:46.847300 [ 1 ] {} <Information> DatabaseOrdinary (default): Starting up tables.
2021.03.05 07:36:46.847670 [ 1 ] {} <Information> DatabaseOrdinary (test): Total 14 tables.
2021.03.05 07:36:46.848160 [ 27 ] {} <Debug> test.UAct: Loading data parts
2021.03.05 07:36:46.848187 [ 3 ] {} <Debug> test.UAct_version: Loading data parts
2021.03.05 07:36:46.848748 [ 27 ] {} <Debug> test.UAct: Loaded data parts (1 items)
2021.03.05 07:36:46.849355 [ 3 ] {} <Debug> test.UAct_version: Loaded data parts (1 items)
2021.03.05 07:36:46.869922 [ 26 ] {} <Debug> test.union_refund_all: Loading data parts
2021.03.05 07:36:46.875874 [ 26 ] {} <Debug> test.union_refund_all: Loaded data parts (1 items)
2021.03.05 07:36:46.877758 [ 26 ] {} <Debug> test.vm_data: Loading data parts
2021.03.05 07:36:46.877962 [ 26 ] {} <Debug> test.vm_data: Loaded data parts (0 items)
2021.03.05 07:36:46.878359 [ 2 ] {} <Debug> test.test: Loading data parts
2021.03.05 07:36:46.882271 [ 2 ] {} <Debug> test.test: Loaded data parts (1 items)
2021.03.05 07:36:46.895322 [ 27 ] {} <Debug> test.transactions_union_refunds_2: Loading data parts
2021.03.05 07:36:46.902687 [ 27 ] {} <Debug> test.transactions_union_refunds_2: Loaded data parts (1 items)
2021.03.05 07:36:46.903517 [ 3 ] {} <Debug> test.transactions_union_refunds: Loading data parts
2021.03.05 07:36:46.910943 [ 3 ] {} <Debug> test.transactions_union_refunds: Loaded data parts (1 items)
2021.03.05 07:36:46.911946 [ 1 ] {} <Information> DatabaseOrdinary (test): Starting up tables.
2021.03.05 07:36:46.912436 [ 1 ] {} <Debug> Application: Loaded metadata.
2021.03.05 07:36:46.912583 [ 1 ] {} <Trace> TraceCollector: Pipe capacity is 1.00 MiB
2021.03.05 07:36:46.912665 [ 1 ] {} <Information> BackgroundSchedulePool: Create BackgroundSchedulePool with 16 threads
2021.03.05 07:36:46.914328 [ 1 ] {} <Information> Application: Listening http://0.0.0.0:8123
2021.03.05 07:36:46.914384 [ 1 ] {} <Information> Application: Listening for connections with native protocol (tcp): 0.0.0.0:9000
2021.03.05 07:36:46.914419 [ 1 ] {} <Information> Application: Listening for replica communication (interserver) http://0.0.0.0:9009
2021.03.05 07:36:46.914801 [ 1 ] {} <Information> Application: Available RAM: 31.42 GiB; physical cores: 4; logical cores: 8.

Manually attaching the tables

Restore the ClickHouse metadata files
root@clickhouse:/var/lib/clickhouse/metadata/explame# cp ~/clickhouse/metadata/explame/* ./
root@clickhouse:/var/lib/clickhouse/metadata/explame# ls -lt
total 212
-rw-r----- 1 root root 22375 Mar  5 07:41 test2.sql
-rw-r----- 1 root root 22578 Mar  5 07:41 test3.sql
-rw-r----- 1 root root 22361 Mar  5 07:41 test4.sql
-rw-r----- 1 root root 22387 Mar  5 07:41 test5.sql
-rw-r----- 1 root root 40423 Mar  5 07:41 test6.sql
-rw-r----- 1 root root 39013 Mar  5 07:41 test7.sql
-rw-r----- 1 root root  1461 Mar  5 07:41 query_log.sql
-rw-r----- 1 root root  1281 Mar  5 07:41 query_thread_log.sql
-rw-r----- 1 root root 28194 Mar  5 07:41 test1.sql
root@clickhouse:/var/lib/clickhouse/metadata/explame# mv query* ../system/
root@clickhouse:/var/lib/clickhouse/metadata/explame# ls -lt
total 204
-rw-r----- 1 root root 22375 Mar  5 07:41 test2.sql
-rw-r----- 1 root root 22578 Mar  5 07:41 test3.sql
-rw-r----- 1 root root 22361 Mar  5 07:41 test4.sql
-rw-r----- 1 root root 22387 Mar  5 07:41 test5.sql
-rw-r----- 1 root root 40423 Mar  5 07:41 test6.sql
-rw-r----- 1 root root 39013 Mar  5 07:41 test7.sql
-rw-r----- 1 root root 28194 Mar  5 07:41 test1.sql
root@clickhouse:/var/lib/clickhouse/metadata/explame#
root@clickhouse:/var/lib/clickhouse/metadata/explame#
root@clickhouse:/var/lib/clickhouse/metadata/explame# cd ..
root@clickhouse:/var/lib/clickhouse/metadata# cd system/
root@clickhouse:/var/lib/clickhouse/metadata/system# ls -lt
total 16
-rw-r----- 1 root       root       1461 Mar  5 07:41 query_log.sql
-rw-r----- 1 root       root       1281 Mar  5 07:41 query_thread_log.sql
-rw-r----- 1 clickhouse clickhouse  694 Apr 28  2020 part_log.sql
-rwxrwxrwx 1 clickhouse messagebus  351 Oct 10  2019 trace_log.sql

Manually load the tables with ATTACH TABLE
root@clickhouse:/var/log/clickhouse-server# clickhouse_login.sh
ClickHouse client version 19.16.3.6 (official build).
clickhouse :) show tables from system like '%query%';
SHOW TABLES FROM system LIKE '%query%'
Ok.
0 rows in set. Elapsed: 0.005 sec.
clickhouse :) show tables from explame;
SHOW TABLES FROM explame
Ok.
0 rows in set. Elapsed: 0.002 sec.
clickhouse :) attach table system.query_log;
ATTACH TABLE system.query_log
Received exception from server (version 19.16.3):
Code: 76. DB::Exception: Received from 127.0.0.1:9000. DB::ErrnoException. DB::ErrnoException: Cannot open file /var/lib/clickhouse/metadata/system/query_log.sql, errno: 13, strerror: Permission denied.
0 rows in set. Elapsed: 0.010 sec. # mind the permissions: the restored .sql files are owned by root (see the listing above); chown them to clickhouse, then retry
clickhouse :) attach table system.query_log;
ATTACH TABLE system.query_log
Ok.
0 rows in set. Elapsed: 0.120 sec.
clickhouse :) attach table system.query_thread_log;;
ATTACH TABLE system.query_thread_log
Ok.
0 rows in set. Elapsed: 0.070 sec.
clickhouse :) attach table explame.test1;
ATTACH TABLE explame.test1
Ok.
0 rows in set. Elapsed: 1.813 sec.
clickhouse :) attach table explame.test2;
ATTACH TABLE explame.test2
Ok.
0 rows in set. Elapsed: 3.411 sec.
clickhouse :) attach table explame.test4;
ATTACH TABLE explame.test4
Ok.
0 rows in set. Elapsed: 103.359 sec.
clickhouse :) attach table explame.test5; # with many items the attach can hit the client timeout
ATTACH TABLE explame.test5
Timeout exceeded while receiving data from server. Waited for 300 seconds, timeout is 300 seconds.
Cancelling query.                    
# Workaround: DETACH TABLE explame.test5, then run ATTACH TABLE again
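An alternative to the detach/re-attach dance, assuming it is the stock receive_timeout setting (default 300 s, matching the message above) that cancels the query: raise it for this one statement. The 1800 s value is arbitrary:

# Sketch: rerun the slow ATTACH with a longer client-side receive timeout
clickhouse-client --receive_timeout=1800 --query "ATTACH TABLE explame.test5"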

clickhouse :) show tables from explame;
SHOW TABLES FROM explame
┌─name──┐
│ test1 │
│ test2 │
│ test3 │
│ test4 │
│ test5 │
│ test6 │
└───────┘
6 rows in set. Elapsed: 0.003 sec.
clickhouse :) show tables from system like 'query%';
SHOW TABLES FROM system LIKE 'query%'
┌─name─────────────┐
│ query_log        │
│ query_thread_log │
└──────────────────┘
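Attaching many tables one by one gets tedious. A loop sketch that derives table names from the restored metadata filenames, an assumption that holds for this instance:

#!/usr/bin/env bash
# Sketch: attach every table whose metadata .sql file was restored
DB="explame"
for f in /var/lib/clickhouse/metadata/${DB}/*.sql; do
    t="$(basename "$f" .sql)"
    echo "attaching ${DB}.${t}"
    clickhouse-client --receive_timeout=1800 --query "ATTACH TABLE ${DB}.${t}" \
        || echo "attach failed for ${DB}.${t} (possibly already attached)"
done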

System monitoring while the instance is only doing ATTACH TABLE

root@clickhouse:/var/lib/clickhouse/metadata/explame# iostat -dxm 1 100
Linux 4.9.0-14-amd64 (clickhouse)   03/05/2021  _x86_64_    (8 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               2.15    83.12 2289.48   51.02     9.58     0.85     9.12    11.46    4.89    3.84   51.93   0.07  15.95

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00 15020.00    0.00    60.24     0.00     8.21    33.29    2.22    2.22    0.00   0.07  99.60

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00 14878.00    0.00    60.05     0.00     8.27    28.28    1.90    1.90    0.00   0.07  98.40

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00   159.00 15057.00   37.00    60.43     0.78     8.30    31.10    2.06    2.06    4.32   0.07  98.80



Tasks: 138 total,   1 running, 135 sleeping,   2 stopped,   0 zombie
%Cpu0  :  4.8 us, 21.3 sy,  0.0 ni,  5.2 id, 67.7 wa,  0.0 hi,  1.0 si,  0.0 st
%Cpu1  :  2.2 us,  9.4 sy,  0.0 ni, 54.7 id, 28.6 wa,  0.0 hi,  4.7 si,  0.3 st
%Cpu2  :  1.9 us,  4.8 sy,  0.0 ni, 72.3 id, 18.1 wa,  0.0 hi,  2.6 si,  0.3 st
%Cpu3  :  2.0 us, 11.3 sy,  0.0 ni, 53.3 id, 33.4 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  3.6 us, 11.9 sy,  0.0 ni, 46.7 id, 37.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  2.0 us,  5.6 sy,  0.0 ni, 68.9 id, 22.8 wa,  0.0 hi,  0.3 si,  0.3 st
%Cpu6  :  2.7 us, 11.6 sy,  0.0 ni, 47.5 id, 38.2 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  2.7 us, 11.3 sy,  0.0 ni, 45.2 id, 40.9 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 32951380 total, 14256912 free,  1148848 used, 17545620 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 31324092 avail Mem


Total DISK READ :      60.52 M/s | Total DISK WRITE :       0.00 B/s
Actual DISK READ:      60.54 M/s | Actual DISK WRITE:       0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 4753 be/4 clickhou   16.22 M/s    0.00 B/s  0.00 % 66.79 % clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid [SystemLogFlush]
 4758 be/4 clickhou   16.93 M/s    0.00 B/s  0.00 % 66.16 % clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid [AsyncBlockInput]
 4765 be/4 clickhou   13.47 M/s    0.00 B/s  0.00 % 65.71 % clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid [ParalInputsProc]
 4764 be/4 clickhou   13.91 M/s    0.00 B/s  0.00 % 64.69 % clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid [ParalInputsProc]
 1536 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % sshd: bindo1031 [priv]
    1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
1. The monitoring shows that ATTACH TABLE is IO-heavy but barely touches the CPU, so a multi-disk layout can speed attaching up.
2. ATTACH TABLE is handled by a single thread.
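Putting the whole procedure together, a minimal end-to-end sketch. The database name, table list, paths, and backup directory mirror this article's test instance and are assumptions; this is not an official procedure, so do not run it blindly:

#!/usr/bin/env bash
# End-to-end sketch of the pluggable restart described above
set -u

DB="explame"
TABLES="test1 test2 test3 test4 test5 test6 test7"
META="/var/lib/clickhouse/metadata"
BACKUP="$HOME/clickhouse/metadata/${DB}"
mkdir -p "$BACKUP"

# 1. Detach the tables so their parts are no longer active
for t in $TABLES; do
    clickhouse-client --query "DETACH TABLE ${DB}.${t}"
done

# 2. Hide the metadata so these tables are skipped at startup
mv "${META}/${DB}"/*.sql "$BACKUP"/
mv "${META}/system"/quer*.sql "$BACKUP"/

# 3. Fast restart, then wait until the server accepts queries
systemctl restart clickhouse-server
until clickhouse-client --query "SELECT 1" >/dev/null 2>&1; do sleep 2; done

# 4. Restore the metadata (mind the ownership!) and re-attach
cp "$BACKUP"/quer*.sql "${META}/system/"
cp "$BACKUP"/test*.sql "${META}/${DB}/"
chown clickhouse:clickhouse "${META}/system"/quer*.sql "${META}/${DB}"/*.sql

clickhouse-client --query "ATTACH TABLE system.query_log"
clickhouse-client --query "ATTACH TABLE system.query_thread_log"
for t in $TABLES; do
    clickhouse-client --receive_timeout=1800 --query "ATTACH TABLE ${DB}.${t}"
done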

References:

# https://altinity.com/blog/2020/5/12/sql-for-clickhouse-dba
# https://clickhouse.tech/docs/en/sql-reference/statements/attach/
# https://www.jianshu.com/p/5630faf06e08

Comparing the two startup approaches

1. systemctl start clickhouse-server: the normal startup; sometimes very slow, though the 20.x series improves startup speed considerably.
2. Pluggable startup: in emergencies it helps ops and DBAs out of tight spots, but it carries risk; use with caution.

Important disclaimer

The approach above is my own summary of hands-on practice; it is not officially endorsed and does not appear in the official documentation.
It grew out of real production incidents and needs, and our team has already used it in several cases with good results.
Please do not apply it to your own production systems; you bear all consequences yourself.
I have written plenty of notes, but this is my first published post; I hope it gives fellow ClickHouse users some inspiration.
I have not worked with ClickHouse for long and my knowledge is limited, so please point out any mistakes; most of my days are spent grinding through 996.

Tip: the same pluggable restart strategy can also be applied at the database level; interested readers can test it themselves.
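A heavily hedged, untested sketch of what that database-level variant might look like, reusing the same metadata-file mechanism:

#!/usr/bin/env bash
# Untested sketch: hide the whole database's metadata directory for a fast
# restart, then bring it back afterwards
DB="explame"
META="/var/lib/clickhouse/metadata"

# Detach every table in the database first, for data safety
clickhouse-client --query "
SELECT concat('DETACH TABLE ', database, '.', name, ';')
FROM system.tables
WHERE database = '${DB}'
FORMAT TSVRaw" | clickhouse-client --multiquery

mv "${META}/${DB}" "$HOME/metadata_backup_${DB}"
systemctl restart clickhouse-server
mv "$HOME/metadata_backup_${DB}" "${META}/${DB}"
# Note: the restored database is presumably only picked up on the next restart,
# or by re-attaching its tables one by one as shown earlier; untested.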
