clickhouse

使用ClickHouse一键接管MySQL数据分析

2018-02-04  本文已影响3045人  JackpGao

数据导入

第一组

# du出的表大小
5.5G    article_clientuser_sum.ibd

# ClickHouse操作语句
CREATE TABLE article_clientuser_sum
ENGINE = MergeTree
ORDER BY id AS
SELECT *
FROM mysql('host:port', 'db', 'article_clientuser_sum', 'user', 'password') 

# 耗时和平均速度
0 rows in set. Elapsed: 137.251 sec. Processed 18.59 million rows, 7.34 GB (135.43 thousand rows/s., 53.48 MB/s.)

第二组

# 另一个表
20G     xx_httpcode_minf.ibd

CREATE TABLE xx_httpcode_minf
ENGINE = MergeTree
ORDER BY id AS
SELECT *
FROM mysql('host:port', 'db', 'tb', 'user', 'password') 

# 不知道为啥这表这么快就导入了 貌似是行少,但是表的总大小大啊
0 rows in set. Elapsed: 44.389 sec. Processed 13.03 million rows, 1.44 GB (293.44 thousand rows/s., 32.35 MB/s.)

PK之count(*)

第一组

# 1800w

# ClickHouse
SELECT count(*)
FROM article_clientuser_sum
┌──count()─┐
│ 18587381 │
└──────────┘

1 rows in set. Elapsed: 0.033 sec. Processed 18.59 million rows, 74.35 MB (556.76 million rows/s., 2.23 GB/s.)


# MySQL
mysql> select count(*) from article_clientuser_sum ;
+----------+
| count(*) |
+----------+
| 18587381 |
+----------+
1 row in set (39.48 sec)

# 性能 1196X

第二组

# 1300w

# ClickHouse
SELECT count(*)
FROM xx_httpcode_minf
┌──count()─┐
│ 13025469 │
└──────────┘
1 rows in set. Elapsed: 0.032 sec. Processed 13.03 million rows, 52.10 MB (406.68 million rows/s., 1.63 GB/s.)

# MySQL
mysql> SELECT count(*)
    -> FROM xx_httpcode_minf;
+----------+
| count(*) |
+----------+
| 13025469 |
+----------+
1 row in set (1 min 46.87 sec)

# 性能 3340X

PK之复杂查询

第一组

# ClickHouse

SELECT SUM(size) AS size
FROM xx_network_flow
WHERE (date >= '2018-01-01') AND (date <= '2018-01-31') AND (netstat = 0) AND (project LIKE '保密%')

Row 1:
──────
size: 4132888693

1 rows in set. Elapsed: 0.039 sec. Processed 841.66 thousand rows, 9.46 MB (21.67 million rows/s., 243.70 MB/s.)


# MySQL
+------------+
| size       |
+------------+
| 4132888693 |
+------------+
1 row in set (2.34 sec)

# 性能 60X

# ClickHouse
┌─────size─┐
│ 76888224 │
└──────────┘

1 rows in set. Elapsed: 0.137 sec. Processed 841.66 thousand rows, 9.46 MB (6.13 million rows/s., 68.97 MB/s.)


# MySQL
+----------+
| size     |
+----------+
| 76888224 |
+----------+
1 row in set (2.86 sec)

# 性能 21X

第二组

# ClickHouse
SELECT
    project,
    idc,
    minf,
    http_code,
    sum(sumhit) AS num
FROM xx_httpcode_minf
WHERE (date = '2018-01-16') AND (httptype = 'download') AND \
(minf >= 0) AND (minf <= 288) AND \
(http_code IN ('200', '500', '404', '502', '503', '504'))
GROUP BY
    project,
    idc,
    minf,
    http_code
ORDER BY num DESC
LIMIT 3

┌─project─────────────────────────────────────┬─idc────┬─minf─┬─http_code─┬────num─┐
│ 域名1xxxx │ .1xx │  195 │ 200       │ 247522 │
│ 域名2xxxx │ .2xx │  185 │ 200       │ 246613 │
│ 域名3xxxx │ .3xx │  188 │ 200       │ 245808 │
└─────────────────────────────────────────────┴────────┴──────┴───────────┴────────┘

3 rows in set. Elapsed: 0.161 sec. Processed 13.03 million rows, 284.63 MB (80.94 million rows/s., 1.77 GB/s.)

# MySQL
+---------------------------------------------+--------+------+-----------+--------+
| project                                     | idc    | minf | http_code | num    |
+---------------------------------------------+--------+------+-----------+--------+
| 域名1xxxx| .1.xx |  195 | 200       | 247522 |
|  域名2xxxx | .2.xxx |  185 | 200       | 246613 |
|  域名3xxxx | .3xx |  188 | 200       | 245808 |
+---------------------------------------------+--------+------+-----------+--------+
3 rows in set (12.02 sec)

# 性能 75X

# ClickHouse

┌─project────────────────────────────┬─idc────┬─minf─┬─http_code─┬───num─┐
│  域名1 │ 1xxx│  154 │ 404       │ 10792 │
│  域名1 │ 2xxx │  155 │ 404       │ 10395 │
│ 域名1│ 3xxx │  272 │ 404       │ 10313 │
└────────────────────────────────────┴────────┴──────┴───────────┴───────┘

3 rows in set. Elapsed: 0.119 sec. Processed 13.03 million rows, 283.15 MB (109.10 million rows/s., 2.37 GB/s.)


# MySQL
+------------------------------------+--------+------+-----------+-------+
| project                            | idc    | minf | http_code | num   |
+------------------------------------+--------+------+-----------+-------+
|  域名1 | .1zz |  154 | 404       | 10792 |
|  域名1 | .3xx |  155 | 404       | 10395 |
|  域名1 | .3rr |  272 | 404       | 10313 |
+------------------------------------+--------+------+-----------+-------+
3 rows in set (2.19 sec)

# 性能 18X

压缩对比

表名 MySQL表容量 ClickHouse表容量 压缩倍数
article_clientuser_sum 5.5GB 1.2G 4.6
xx_httpcode_minf 20GB 243M 84
xx_network_flow 189MB 25M 7.56

风险

讨论

Reference

广告

上一篇 下一篇

猜你喜欢

热点阅读