Elasticsearch Operations

5. ES Rally demo

2019-12-10  MasonChan

Machine configuration (OpenStack)

Set the environment variables

export JAVA_HOME=/apps/svr/jdk-12.0.1
export PATH=/apps/svr/jdk-12.0.1/bin:$PATH
export PATH=/apps/svr/python-3.5.2/bin:$PATH
export PATH=/apps/svr/git/bin:/apps/svr/git/libexec/git-core:$PATH

Benchmark demo parameters

Run the benchmark command

esrally --distribution-version=5.5.2 --track=geopoint

Run log

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/

[INFO] Preparing for race ...
[INFO] Racing on track [geopoint], challenge [append-no-conflicts] and car ['defaults'] with version [5.5.2].

Running delete-index                                                           [100% done]
Running create-index                                                           [100% done]
Running check-cluster-health                                                   [100% done]
Running index-append                                                           [100% done]
Running refresh-after-index                                                    [100% done]
Running force-merge                                                            [100% done]
Running refresh-after-force-merge                                              [100% done]
Running polygon                                                                [100% done]
Running bbox                                                                   [100% done]
Running distance                                                               [100% done]
Running distanceRange                                                          [100% done]

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|   Lap |                                                         Metric |          Task |       Value |   Unit |
|------:|---------------------------------------------------------------:|--------------:|------------:|-------:|
|   All |                     Cumulative indexing time of primary shards |               |     34.2505 |    min |
|   All |             Min cumulative indexing time across primary shards |               |     6.63023 |    min |
|   All |          Median cumulative indexing time across primary shards |               |     6.83152 |    min |
|   All |             Max cumulative indexing time across primary shards |               |     7.12722 |    min |
|   All |            Cumulative indexing throttle time of primary shards |               |           0 |    min |
|   All |    Min cumulative indexing throttle time across primary shards |               |           0 |    min |
|   All | Median cumulative indexing throttle time across primary shards |               |           0 |    min |
|   All |    Max cumulative indexing throttle time across primary shards |               |           0 |    min |
|   All |                        Cumulative merge time of primary shards |               |     25.9476 |    min |
|   All |                       Cumulative merge count of primary shards |               |         416 |        |
|   All |                Min cumulative merge time across primary shards |               |     4.70943 |    min |
|   All |             Median cumulative merge time across primary shards |               |     5.15057 |    min |
|   All |                Max cumulative merge time across primary shards |               |     5.81993 |    min |
|   All |               Cumulative merge throttle time of primary shards |               |     4.27752 |    min |
|   All |       Min cumulative merge throttle time across primary shards |               |    0.720717 |    min |
|   All |    Median cumulative merge throttle time across primary shards |               |    0.808833 |    min |
|   All |       Max cumulative merge throttle time across primary shards |               |     1.03858 |    min |
|   All |                      Cumulative refresh time of primary shards |               |     6.58592 |    min |
|   All |                     Cumulative refresh count of primary shards |               |        2428 |        |
|   All |              Min cumulative refresh time across primary shards |               |       1.252 |    min |
|   All |           Median cumulative refresh time across primary shards |               |      1.3497 |    min |
|   All |              Max cumulative refresh time across primary shards |               |     1.35478 |    min |
|   All |                        Cumulative flush time of primary shards |               |      0.1466 |    min |
|   All |                       Cumulative flush count of primary shards |               |          15 |        |
|   All |                Min cumulative flush time across primary shards |               |   0.0212833 |    min |
|   All |             Median cumulative flush time across primary shards |               |   0.0315167 |    min |
|   All |                Max cumulative flush time across primary shards |               |   0.0388833 |    min |
|   All |                                               Median CPU usage |               |       300.5 |      % |
|   All |                                             Total Young Gen GC |               |     105.401 |      s |
|   All |                                               Total Old Gen GC |               |      10.115 |      s |
|   All |                                                     Store size |               |     2.97451 |     GB |
|   All |                                                  Translog size |               | 2.00234e-07 |     GB |
|   All |                                                     Index size |               |     2.97451 |     GB |
|   All |                                                  Total written |               |      29.766 |     GB |
|   All |                                         Heap used for segments |               |     13.3071 |     MB |
|   All |                                       Heap used for doc values |               |  0.00948334 |     MB |
|   All |                                            Heap used for terms |               |     11.2716 |     MB |
|   All |                                            Heap used for norms |               |           0 |     MB |
|   All |                                           Heap used for points |               |    0.582964 |     MB |
|   All |                                    Heap used for stored fields |               |     1.44304 |     MB |
|   All |                                                  Segment count |               |          96 |        |
|   All |                                                 Min Throughput |  index-append |     68976.4 | docs/s |
|   All |                                              Median Throughput |  index-append |     72087.8 | docs/s |
|   All |                                                 Max Throughput |  index-append |     75291.9 | docs/s |
|   All |                                        50th percentile latency |  index-append |     528.702 |     ms |
|   All |                                        90th percentile latency |  index-append |     782.233 |     ms |
|   All |                                        99th percentile latency |  index-append |     1167.04 |     ms |
|   All |                                      99.9th percentile latency |  index-append |     1962.36 |     ms |
|   All |                                     99.99th percentile latency |  index-append |     2501.22 |     ms |
|   All |                                       100th percentile latency |  index-append |      2634.9 |     ms |
|   All |                                   50th percentile service time |  index-append |     528.702 |     ms |
|   All |                                   90th percentile service time |  index-append |     782.233 |     ms |
|   All |                                   99th percentile service time |  index-append |     1167.04 |     ms |
|   All |                                 99.9th percentile service time |  index-append |     1962.36 |     ms |
|   All |                                99.99th percentile service time |  index-append |     2501.22 |     ms |
|   All |                                  100th percentile service time |  index-append |      2634.9 |     ms |
|   All |                                                     error rate |  index-append |           0 |      % |
|   All |                                                 Min Throughput |       polygon |        2.01 |  ops/s |
|   All |                                              Median Throughput |       polygon |        2.01 |  ops/s |
|   All |                                                 Max Throughput |       polygon |        2.01 |  ops/s |
|   All |                                        50th percentile latency |       polygon |     93.6485 |     ms |
|   All |                                        90th percentile latency |       polygon |     99.4864 |     ms |
|   All |                                        99th percentile latency |       polygon |     109.385 |     ms |
|   All |                                       100th percentile latency |       polygon |     110.976 |     ms |
|   All |                                   50th percentile service time |       polygon |        93.2 |     ms |
|   All |                                   90th percentile service time |       polygon |      99.042 |     ms |
|   All |                                   99th percentile service time |       polygon |     108.945 |     ms |
|   All |                                  100th percentile service time |       polygon |     110.524 |     ms |
|   All |                                                     error rate |       polygon |           0 |      % |
|   All |                                                 Min Throughput |          bbox |        2.01 |  ops/s |
|   All |                                              Median Throughput |          bbox |        2.01 |  ops/s |
|   All |                                                 Max Throughput |          bbox |        2.01 |  ops/s |
|   All |                                        50th percentile latency |          bbox |     98.1866 |     ms |
|   All |                                        90th percentile latency |          bbox |     103.392 |     ms |
|   All |                                        99th percentile latency |          bbox |     119.742 |     ms |
|   All |                                       100th percentile latency |          bbox |     122.896 |     ms |
|   All |                                   50th percentile service time |          bbox |     97.7447 |     ms |
|   All |                                   90th percentile service time |          bbox |     102.939 |     ms |
|   All |                                   99th percentile service time |          bbox |     119.302 |     ms |
|   All |                                  100th percentile service time |          bbox |      122.41 |     ms |
|   All |                                                     error rate |          bbox |           0 |      % |
|   All |                                                 Min Throughput |      distance |        5.02 |  ops/s |
|   All |                                              Median Throughput |      distance |        5.02 |  ops/s |
|   All |                                                 Max Throughput |      distance |        5.02 |  ops/s |
|   All |                                        50th percentile latency |      distance |     18.3639 |     ms |
|   All |                                        90th percentile latency |      distance |     19.5332 |     ms |
|   All |                                        99th percentile latency |      distance |     23.3447 |     ms |
|   All |                                       100th percentile latency |      distance |     24.0361 |     ms |
|   All |                                   50th percentile service time |      distance |     18.1461 |     ms |
|   All |                                   90th percentile service time |      distance |     19.2916 |     ms |
|   All |                                   99th percentile service time |      distance |     23.1039 |     ms |
|   All |                                  100th percentile service time |      distance |     23.8031 |     ms |
|   All |                                                     error rate |      distance |           0 |      % |
|   All |                                                 Min Throughput | distanceRange |        0.42 |  ops/s |
|   All |                                              Median Throughput | distanceRange |        0.42 |  ops/s |
|   All |                                                 Max Throughput | distanceRange |        0.42 |  ops/s |
|   All |                                        50th percentile latency | distanceRange |      181642 |     ms |
|   All |                                        90th percentile latency | distanceRange |      208871 |     ms |
|   All |                                        99th percentile latency | distanceRange |      215444 |     ms |
|   All |                                       100th percentile latency | distanceRange |      216281 |     ms |
|   All |                                   50th percentile service time | distanceRange |     2347.54 |     ms |
|   All |                                   90th percentile service time | distanceRange |     2523.76 |     ms |
|   All |                                   99th percentile service time | distanceRange |     2631.35 |     ms |
|   All |                                  100th percentile service time | distanceRange |     2635.94 |     ms |
|   All |                                                     error rate | distanceRange |           0 |      % |

----------------------------------
[INFO] SUCCESS (took 2022 seconds)
----------------------------------
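
The final report is a pipe-delimited table, so single metrics can be pulled out with a few lines of code. The sketch below uses a hypothetical helper, `parse_report` (it is not part of Rally), and feeds it rows copied from the report above.

```python
def parse_report(text):
    """Index a Rally summary table by (metric, task) -> (value, unit)."""
    metrics = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 5:
            continue  # blank lines, banners, anything that is not a table row
        _lap, metric, task, value, unit = cells
        try:
            metrics[(metric, task)] = (float(value), unit)
        except ValueError:
            continue  # header row or the |---:| separator row
    return metrics


# Rows copied from the report above (column padding shortened).
SAMPLE = """
|   Lap |                        Metric |          Task |   Value |   Unit |
|------:|------------------------------:|--------------:|--------:|-------:|
|   All |       50th percentile latency | distanceRange |  181642 |     ms |
|   All |  50th percentile service time | distanceRange | 2347.54 |     ms |
"""

report = parse_report(SAMPLE)
latency, _ = report[("50th percentile latency", "distanceRange")]
service, _ = report[("50th percentile service time", "distanceRange")]
print(f"median gap between latency and service time: {latency - service:.2f} ms")
```

In Rally, latency includes the time a request waits before it is issued, while service time covers only the request itself. The large gap for distanceRange therefore suggests that the target throughput of this task was higher than this machine could sustain.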

With the in-memory report type, the benchmark report is printed to standard output and a copy in JSON format is also saved locally:

~/.rally/benchmarks/races/2019-07-10-11-42-55/race.json
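
The JSON report can also be inspected from a script. The sketch below only loads a file and lists its top-level keys. It deliberately assumes nothing about the layout of race.json, which is not shown in this post, and `top_level_keys` is a hypothetical helper name.

```python
import json
from pathlib import Path


def top_level_keys(path):
    """Load a JSON file and return its top-level keys, sorted."""
    with Path(path).expanduser().open(encoding="utf-8") as fh:
        data = json.load(fh)
    # A JSON object becomes a dict; any other JSON value has no keys to list.
    return sorted(data) if isinstance(data, dict) else []


# Path taken from the line above; the file only exists on the benchmark host.
report = Path("~/.rally/benchmarks/races/2019-07-10-11-42-55/race.json")
if report.expanduser().exists():
    print(top_level_keys(report))
else:
    print("no report found at", report)
```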

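After the benchmark finishes, the ES installation files and the imported sample data are deleted. Only the JSON benchmark report and the logs are kept, plus the heap dump file if a heap dump occurred.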

Cluster information during the benchmark

Health information

curl 'http://localhost:39200/_cat/health?v'

epoch      timestamp cluster         status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1562759244 19:47:24  rally-benchmark green           1         1      5   5    0    0        0             0                  -                100.0%
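
The `_cat` APIs also accept `format=json`, which is easier to consume from a script than the column layout. A minimal sketch, assuming the same localhost:39200 endpoint as the curl call above. `cluster_status` and `fetch_cluster_status` are hypothetical helper names, and the sample body is hand-written from the values shown above rather than captured from the cluster.

```python
import json
import urllib.request


def cluster_status(body):
    """Return the status field from a `_cat/health?format=json` response body."""
    rows = json.loads(body)
    # With format=json the _cat APIs return an array with one object per row.
    return rows[0]["status"]


def fetch_cluster_status(base_url="http://localhost:39200"):
    """Query a live node; the default URL is the one used by the curl call above."""
    url = base_url + "/_cat/health?format=json"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return cluster_status(resp.read().decode("utf-8"))


# Offline demo: a body written by hand from the health output above.
sample = '[{"cluster": "rally-benchmark", "status": "green", "node.total": "1"}]'
print(cluster_status(sample))  # green
```

On the benchmark host, `fetch_cluster_status()` would perform the actual request while the race is still running.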

Node information

curl 'http://localhost:39200/_cat/nodes?v'

ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1           57          73  95    2.55    0.62     0.24 mdi       *      rally-node-0

Index information

curl 'http://localhost:39200/_cat/indices?v'

health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   osmgeopoints ccz8JXAFSzOY7aVSbftGpA   5   0    6086527            0      306mb          306mb

Shard information

curl 'http://localhost:39200/_cat/shards?v'

index        shard prirep state      docs   store ip        node
osmgeopoints 1     p      STARTED 5105924 267.3mb 127.0.0.1 rally-node-0
osmgeopoints 3     p      STARTED 5106465 380.2mb 127.0.0.1 rally-node-0
osmgeopoints 4     p      STARTED 5107993 300.5mb 127.0.0.1 rally-node-0
osmgeopoints 2     p      STARTED 5110831 264.7mb 127.0.0.1 rally-node-0
osmgeopoints 0     p      STARTED 5101242 308.1mb 127.0.0.1 rally-node-0
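
The shard listing is whitespace-delimited, so per-index totals can be computed straight from the text. A sketch with a hypothetical `total_docs` helper, fed the rows shown above:

```python
def total_docs(cat_shards_text):
    """Sum the docs column of `_cat/shards?v` output, grouped by index.

    Assumes every shard is STARTED, so that no column is empty and the
    whitespace split lines up with the header.
    """
    lines = [ln.split() for ln in cat_shards_text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    idx_col, docs_col = header.index("index"), header.index("docs")
    totals = {}
    for row in rows:
        totals[row[idx_col]] = totals.get(row[idx_col], 0) + int(row[docs_col])
    return totals


SAMPLE = """
index        shard prirep state      docs   store ip        node
osmgeopoints 1     p      STARTED 5105924 267.3mb 127.0.0.1 rally-node-0
osmgeopoints 3     p      STARTED 5106465 380.2mb 127.0.0.1 rally-node-0
osmgeopoints 4     p      STARTED 5107993 300.5mb 127.0.0.1 rally-node-0
osmgeopoints 2     p      STARTED 5110831 264.7mb 127.0.0.1 rally-node-0
osmgeopoints 0     p      STARTED 5101242 308.1mb 127.0.0.1 rally-node-0
"""

print(total_docs(SAMPLE))  # {'osmgeopoints': 25532455}
```

The total is larger than the docs.count in the index listing, presumably because the two commands were run at different points of the index-append phase.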

System load in the different operation phases

Machine load during the index-append phase: CPU and disk write load are both high

top

top - 19:54:02 up 145 days,  7:37,  3 users,  load average: 9.09, 7.72, 4.29
Tasks: 161 total,   3 running, 157 sleeping,   0 stopped,   1 zombie
%Cpu0  : 78.8 us,  4.0 sy,  0.0 ni,  4.0 id, 13.1 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 74.9 us,  7.0 sy,  0.0 ni,  5.7 id, 12.4 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 76.2 us,  7.7 sy,  0.0 ni,  5.0 id, 11.1 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 81.3 us,  6.7 sy,  0.0 ni,  2.3 id,  9.7 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  4047248 total,   281884 free,  1846048 used,  1919316 buff/cache
KiB Swap:  1048572 total,   218180 free,   830392 used.  1858548 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                      
31818 apps      20   0 7308828 1.608g 261720 S 288.7 41.7  29:50.00 java                                                                                                                                         
31904 apps      20   0  326504  35088   1868 S   6.3  0.9   0:59.64 esrally                                                                                                                                      
31899 apps      20   0  326116  35004   1876 S   5.3  0.9   0:59.54 esrally                                                                                                                                      
31900 apps      20   0  326116  34804   1868 S   5.3  0.9   0:59.24 esrally                                                                                                                                      
31903 apps      20   0  326504  35280   1868 S   5.3  0.9   0:59.65 esrally                                                                                                                                      
31901 apps      20   0  326504  35172   1868 S   5.0  0.9   0:59.60 esrally                                                                                                                                      
31902 apps      20   0  326504  35096   1872 S   5.0  0.9   0:59.42 esrally                                                                                                                                      
31905 apps      20   0  326504  35108   1868 S   5.0  0.9   0:59.48 esrally                                                                                                                                      
31906 apps      20   0  326504  35104   1868 S   5.0  0.9   0:59.31 esrally 

free -m

              total        used        free      shared  buff/cache   available
Mem:           3952        1802         155          71        1993        1814
Swap:          1023         811         212

dstat

You did not select any stats, using -cdngy by default.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   0 100   0   0   0|3351B   12k|   0     0 |  61B  192B| 364   567 
 76   4  12   8   0   0|   0   109M|3683B 1413B|   0     0 |4685  2438 
 90   7   3   1   0   0|4892k   11M|3200B  564B|   0  4096B|6157  3244 
 89   5   5   1   0   0|8260k   11M|  34k  981B|   0     0 |6579  3560 
 90   6   4   1   0   0|7836k   11M|3290B  580B|   0     0 |5909  3385 
 70   8  11  11   0   0|9484k   72M|3441B  981B|   0     0 |6052  2591 
 94   4   3   0   0   0|2888k 6814k|3270B  580B|   0     0 |5273  2442 
 84   7   6   3   0   0|7128k  146M|3980B  981B|   0    40k|7980  2593 
 90   6   3   1   0   0|7492k   10M|  41k  580B|   0     0 |8111  3660 
 92   5   2   0   0   0|2788k 8568k|  55k  981B|   0     0 |5865  2845 
 92   6   2   0   0   0|4096B 8854k|3105B  580B|   0  4096B|6129  3111 
 82   6   4   8   0   0|9920k  111M|3237B  981B|   0    44k|8948  2879

Load during the other phases, which include: delete-index, create-index, check-cluster-health, refresh-after-index, force-merge, refresh-after-force-merge, polygon, bbox, distance

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
 15   0  85   0   0   0|   0    10k|3671B  981B|   0     0 | 913   615 
 15   0  85   0   0   0|   0    11k|2822B 1060B|   0     0 | 872   561 
 16   0  84   0   0   0|   0    10k|3781B  965B|   0     0 | 953   665 
 15   0  85   0   0   0|   0    11k|3105B  564B|   0     0 | 917   619 
 17   0  83   0   0   0|  96k   10k|  49k  899B|  96k    0 |1370   809 
 15   0  85   0   0   0|   0    11k|  36k  268B|   0     0 |1141   626 
 15   0  85   0   0   0|   0    10k|  17k 1277B|   0     0 | 917   610 
 16   0  84   0   0   0|   0    11k|3008B  428B|   0     0 |1048   779 
 15   0  85   0   0   0|   0    10k|3334B 1101B|   0     0 | 961   692 
 16   0  84   0   0   0|   0    11k|  35k  428B|   0     0 |1154   637

Load during the distanceRange phase

dstat

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  4   0  96   0   0   0|   0    10k|3700B 1303B|   0     0 | 603   799 
 47   1  51   1   0   0|  71M   11k|3207B  924B|   0     0 |2982  1062 
 97   3   1   0   0   0| 172M   10k|3622B 1117B|   0     0 |6292   751 
 99   1   1   0   0   0|  47M   11k|3246B  700B|   0     0 |4772   793 
 99   1   0   0   0   0|  22M   10k|  35k  969B|   0     0 |4745   728 
 86   1  11   2   0   0|  62M   11k|  34k  444B|   0     0 |5272   887 
 31   1  67   1   0   0|  33M   10k|  21k 1117B|   0     0 |1960   797 
 92   0   8   0   0   0|1640k   34k|3071B  444B|   0     0 |4035   673 
100   0   0   0   0   0|   0  8192B|3117B 1117B|   0     0 |4288   652 
 76   0  24   0   0   0|   0    15k|  37k  428B|   0     0 |3523   695 
100   0   0   0   0   0|   0    10k|3473B 1101B|   0     0 |4327   706

Summary: two phases put the machine under high load in this benchmark, index-append and distanceRange. The first is data ingestion, the second is a range query.
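
The summary can be checked numerically by averaging the usr column of each dstat sample. A sketch with a hypothetical `mean_usr_cpu` helper; the rows are copied from the three samples above. The first row of the index-append and distanceRange samples is left out, on the assumption that dstat's first line reports averages since boot rather than a live interval.

```python
def mean_usr_cpu(dstat_rows):
    """Average the usr column (first field) over all dstat data rows."""
    values = []
    for line in dstat_rows.splitlines():
        fields = line.split()
        # Header lines and blank lines do not start with a plain number.
        if fields and fields[0].isdigit():
            values.append(int(fields[0]))
    if not values:
        raise ValueError("no dstat data rows found")
    return sum(values) / len(values)


SAMPLES = {
    "index-append": """
 76   4  12   8   0   0|   0   109M|3683B 1413B|   0     0 |4685  2438
 90   7   3   1   0   0|4892k   11M|3200B  564B|   0  4096B|6157  3244
 89   5   5   1   0   0|8260k   11M|  34k  981B|   0     0 |6579  3560
 90   6   4   1   0   0|7836k   11M|3290B  580B|   0     0 |5909  3385
 70   8  11  11   0   0|9484k   72M|3441B  981B|   0     0 |6052  2591
 94   4   3   0   0   0|2888k 6814k|3270B  580B|   0     0 |5273  2442
 84   7   6   3   0   0|7128k  146M|3980B  981B|   0    40k|7980  2593
 90   6   3   1   0   0|7492k   10M|  41k  580B|   0     0 |8111  3660
 92   5   2   0   0   0|2788k 8568k|  55k  981B|   0     0 |5865  2845
 92   6   2   0   0   0|4096B 8854k|3105B  580B|   0  4096B|6129  3111
 82   6   4   8   0   0|9920k  111M|3237B  981B|   0    44k|8948  2879
""",
    "other phases": """
 15   0  85   0   0   0|   0    10k|3671B  981B|   0     0 | 913   615
 15   0  85   0   0   0|   0    11k|2822B 1060B|   0     0 | 872   561
 16   0  84   0   0   0|   0    10k|3781B  965B|   0     0 | 953   665
 15   0  85   0   0   0|   0    11k|3105B  564B|   0     0 | 917   619
 17   0  83   0   0   0|  96k   10k|  49k  899B|  96k    0 |1370   809
 15   0  85   0   0   0|   0    11k|  36k  268B|   0     0 |1141   626
 15   0  85   0   0   0|   0    10k|  17k 1277B|   0     0 | 917   610
 16   0  84   0   0   0|   0    11k|3008B  428B|   0     0 |1048   779
 15   0  85   0   0   0|   0    10k|3334B 1101B|   0     0 | 961   692
 16   0  84   0   0   0|   0    11k|  35k  428B|   0     0 |1154   637
""",
    "distanceRange": """
 47   1  51   1   0   0|  71M   11k|3207B  924B|   0     0 |2982  1062
 97   3   1   0   0   0| 172M   10k|3622B 1117B|   0     0 |6292   751
 99   1   1   0   0   0|  47M   11k|3246B  700B|   0     0 |4772   793
 99   1   0   0   0   0|  22M   10k|  35k  969B|   0     0 |4745   728
 86   1  11   2   0   0|  62M   11k|  34k  444B|   0     0 |5272   887
 31   1  67   1   0   0|  33M   10k|  21k 1117B|   0     0 |1960   797
 92   0   8   0   0   0|1640k   34k|3071B  444B|   0     0 |4035   673
100   0   0   0   0   0|   0  8192B|3117B 1117B|   0     0 |4288   652
 76   0  24   0   0   0|   0    15k|  37k  428B|   0     0 |3523   695
100   0   0   0   0   0|   0    10k|3473B 1101B|   0     0 |4327   706
""",
}

for phase, rows in SAMPLES.items():
    print(f"{phase}: {mean_usr_cpu(rows):.1f}% usr")
# index-append: 86.3% usr
# other phases: 15.5% usr
# distanceRange: 82.7% usr
```

With these rows, index-append and distanceRange both average above 80% usr while the remaining phases stay around 15%, which agrees with the summary.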
