
Benchmarking LevelDB and RocksDB under a BCH workload

2018-05-15  wolf4j

leveldb VS rocksdb

Background

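This test simulates how BCH uses the database while validating transactions. The access pattern can be summarized as pulsed reads, writes, and deletes: roughly every 10 minutes a burst of read operations (transaction validation) and write operations (transaction insert/update/delete) is triggered. For that reason, read operations and write operations are tested separately.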
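The pulsed pattern described above can be sketched as a timed read phase followed by a timed write phase. This is only an illustration, not the authors' test program: the function name run_pulse is hypothetical, a plain Python dict stands in for the database, and the operation counts (90000 reads, 210000 adds, 30000 deletes) are borrowed from the sample output shown later in the article.

```python
import random
import time


def run_pulse(store, key_space, n_read, n_add, n_delete, rng):
    """Run one pulse: a timed read phase followed by a timed write phase."""
    # Read phase: random point lookups, standing in for transaction validation.
    start = time.perf_counter()
    hits = 0
    for _ in range(n_read):
        if rng.randrange(key_space) in store:
            hits += 1
    read_elapsed = time.perf_counter() - start

    # Write phase: inserts/updates followed by deletes, timed separately.
    start = time.perf_counter()
    for _ in range(n_add):
        store[rng.randrange(key_space)] = b"value"
    for _ in range(n_delete):
        store.pop(rng.randrange(key_space), None)
    write_elapsed = time.perf_counter() - start

    return {"read": n_read, "hits": hits, "read_elapsed": read_elapsed,
            "add": n_add, "delete": n_delete, "write_elapsed": write_elapsed}


if __name__ == "__main__":
    rng = random.Random(0)
    store = {k: b"value" for k in range(100_000)}  # in-memory stand-in for the DB
    result = run_pulse(store, key_space=200_000, n_read=90_000,
                       n_add=210_000, n_delete=30_000, rng=rng)
    print("read:%d elapsed:%g" % (result["read"], result["read_elapsed"]))
    print("add:%d delete:%d elapsed:%g"
          % (result["add"], result["delete"], result["write_elapsed"]))
```

Timing the two phases independently is what makes it possible to report read cost and write cost separately for each pulse.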

The test involves many details. Readers can review the test code and run it on a similar hardware and software environment to verify the final results.

Test environment

Hardware

$ cat /proc/cpuinfo | grep 'model name' | uniq
model name      : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz

$ cat /proc/meminfo | grep MemTotal
MemTotal:       16316464 kB


$ sudo smartctl -a /dev/sda
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-87-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     INTEL SSDSC2KW512G8
Serial Number:    BTLA750307HD512DGN
LU WWN Device Id: 5 5cd2e4 14ee7450c
Firmware Version: LHF002C
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 24 17:06:05 2018 HKT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1266
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       12
170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
173 Unknown_Attribute       0x0033   098   098   005    Pre-fail  Always       -       150329491458
174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       2
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   033   052   000    Old_age   Always       -       33 (Min/Max 19/52)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
225 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       373785
226 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       0
227 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       0
228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       0
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   098   098   000    Old_age   Always       -       0
236 Unknown_Attribute       0x0032   099   099   000    Old_age   Always       -       0
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       373785
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       184275
249 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       18778
252 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       35

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Software

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial

rocksdb version: 5.13.0
leveldb version: Release 1.19

Summary

Test cases

Building the test binaries

rocksdb

leveldb

Test notes

(gdb) info break
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000413dd1 in leveldb::DB::Put(leveldb::WriteOptions const&, leveldb::Slice const&, leveldb::Slice const&) at db/db_impl.cc:1475
2       breakpoint     keep y   0x000000000043260e in leveldb::WriteBatch::Put(leveldb::Slice const&, leveldb::Slice const&) at db/write_batch.cc:99
3       breakpoint     keep y   0x0000000000413e85 in leveldb::DB::Delete(leveldb::WriteOptions const&, leveldb::Slice const&) at db/db_impl.cc:1481

10GB scenario

Generate the test data with python genKeyValue.py 10737418240; this produces the file trx.bin, which is renamed to trx10GB.bin.

leveldb

Test command

Running the single-threaded read/write test produces the following output:

now:Tue Apr 24 16:25:39 2018
read:90000 elapsed:20.8294
now:Tue Apr 24 16:25:39 2018
add:210000 delete:30000 elapsed:0.373496
 
now:Tue Apr 24 16:30:59 2018
read:90000 elapsed:19.425
now:Tue Apr 24 16:30:59 2018
add:210000 delete:30000 elapsed:0.337532

now:Tue Apr 24 16:36:19 2018
read:90000 elapsed:19.7889
now:Tue Apr 24 16:36:20 2018
add:210000 delete:30000 elapsed:0.348685
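To compare runs, log lines in the format shown above can be turned into operations-per-second figures. The sketch below is an assumption-laden helper, not part of the original test code: the function name throughput is hypothetical, and it assumes the elapsed field is in seconds, which the log itself does not state.

```python
import re

# One pulse copied from the output above.
LOG = """\
now:Tue Apr 24 16:25:39 2018
read:90000 elapsed:20.8294
now:Tue Apr 24 16:25:39 2018
add:210000 delete:30000 elapsed:0.373496
"""

READ_RE = re.compile(r"^read:(\d+) elapsed:([\d.]+)$")
WRITE_RE = re.compile(r"^add:(\d+) delete:(\d+) elapsed:([\d.]+)$")


def throughput(log_text):
    """Return (reads_per_sec, writes_per_sec) lists, one entry per pulse."""
    reads, writes = [], []
    for line in log_text.splitlines():
        line = line.strip()
        m = READ_RE.match(line)
        if m:
            reads.append(int(m.group(1)) / float(m.group(2)))
            continue
        m = WRITE_RE.match(line)
        if m:
            # Adds and deletes are both counted as write operations.
            ops = int(m.group(1)) + int(m.group(2))
            writes.append(ops / float(m.group(3)))
    return reads, writes


if __name__ == "__main__":
    reads, writes = throughput(LOG)
    print("reads/s:", reads)
    print("writes/s:", writes)
```

Under the seconds assumption, the read phase in this pulse is far slower per operation than the write phase, which is why the result tables focus on read time.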

Test results

10GB, LevelDB with default options:

Threads  Elapsed        CPU usage  Memory
1        37s-58s        70%        900MB
8        about 18s-21s  80%        750MB
16       17s-18s        92%        780MB

10GB, LevelDB with max_open_files=11000:

Threads  Elapsed     CPU usage  Memory
1        28s-40s     90%        800MB
8        9.5s-11.7s  120%       900MB
16       5s-6.5s     115%       900MB

10GB, LevelDB with max_open_files=11000 and bloom_filter(10):

Threads  Elapsed    CPU usage  Memory
1        14s-21s    50%        1GB
8        2.7s-3.5s  50%        1GB
16       3.4s-4.3s  100%       1GB

Summary

rocksdb

Test command

10GB, RocksDB with default options:

Threads  Elapsed    CPU usage  Memory
1        10.8s-14s  10%-26%    100MB
8        1.9s-2.5s  30%-53%    120MB
16       1.7s-1.9s  24%-45%    130MB

Summary


50GB scenario

Generate the test data with python genKeyValue.py 53687091200; this produces the file trx.bin, which is renamed to trx50GB.bin.

leveldb

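Test command

Same as the 10GB scenario, with swap turned off and 13421772800 bytes of memory occupied.

50GB, LevelDB with default options:

Threads  Memory  Elapsed         CPU utilization
1        600MB   about 70s-100s  86%
8        680MB   about 20s-53s   100%
16       750MB   about 15s-36s   106%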

50GB, LevelDB with max_open_files=11000 and bloom_filter(10):

Threads  Memory  Elapsed     CPU utilization
1        700MB   34s-83s     91%
8        860MB   6.4s-27s    120%
16       860MB   4.8s-14.3s  130%

rocksdb

Test command

Same as the 10GB scenario, with swap turned off and 13421772800 bytes of memory occupied so that this memory cannot be used by the page cache.
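Holding a fixed amount of RAM, as described above, limits how much of the dataset the kernel can keep in the page cache. A minimal sketch of such a memory-occupying helper follows. It is not the authors' tool; the function name eat_memory is hypothetical, and the approach (allocate a buffer and write one byte per page) is one common way to make the allocation resident.

```python
import mmap

PAGE = mmap.PAGESIZE  # system page size in bytes


def eat_memory(n_bytes):
    """Allocate n_bytes and touch every page so the memory becomes resident."""
    buf = bytearray(n_bytes)
    for offset in range(0, n_bytes, PAGE):
        buf[offset] = 1  # writing forces the kernel to back this page with RAM
    return buf

# Hypothetical usage: held = eat_memory(13421772800)
# Keep a reference to `held` for as long as the benchmark runs.
```

With swap turned off, as in the article's setup, the touched pages cannot be pushed out to disk, so the memory stays unavailable to the page cache until the process exits or is killed.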

50GB, RocksDB with default options:

Threads  Memory  Elapsed    CPU utilization
1        500MB   12s-16s    15%-20%
8        400MB   3.2s-3.7s  40%-60%
16       400MB   2.1s-2.4s  30%-40%

100GB scenario

Test command

total number of keys: 1210292943, total size in bytes: 107374182600

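Here, 1210292943 is the number of keys in the generated database. In other words, the generator produced a database, pul100GB.db, containing the keys 1 through 1210292943 (these integers are stored in the database in 40-byte big-endian format). Our test program performs random reads over the range 1 to 1210292943.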
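The key layout described above can be sketched as follows. This is an illustration only: the names encode_key and random_keys are hypothetical, and uniform sampling is an assumption, since the article says only that reads are random.

```python
import random

KEY_COUNT = 1210292943  # number of keys reported by the generator output above
KEY_BYTES = 40          # key width stated in the article


def encode_key(n):
    """Encode integer n as a fixed-width big-endian key."""
    return n.to_bytes(KEY_BYTES, "big")


def random_keys(count, rng):
    """Yield `count` keys drawn uniformly at random from 1..KEY_COUNT."""
    for _ in range(count):
        yield encode_key(rng.randint(1, KEY_COUNT))


if __name__ == "__main__":
    for key in random_keys(3, random.Random(0)):
        print(key.hex())
```

A fixed-width big-endian encoding keeps byte-wise order identical to numeric order, which matches the default bytewise key comparator used by LevelDB and RocksDB.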

rocksdb

With RocksDB default parameters:

Threads  Memory  Elapsed (s)  CPU utilization
1        1.46GB  12.8-16      11%-26%
2        1.46GB  6.9-7.3      18%-31%
4        1.46GB  4.6-4.95     73%-91%
8        1.46GB  2.7-2.9      80%-92%
16       1.46GB  1.75-1.76    21%-38%
32       1.46GB  1.53-1.55    19%-51%
64       1.46GB  1.56-1.58    23%-37%
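One way to read the table above is as speedup relative to the single-threaded run. The sketch below uses the midpoint of each elapsed range, which is a simplification chosen here, not something the article does, and it assumes the values are in seconds. The names midpoint and speedups are hypothetical.

```python
# Elapsed-time ranges (low, high) copied from the table above, keyed by thread count.
# Assumption: the values are in seconds.
ELAPSED = {
    1: (12.8, 16.0),
    2: (6.9, 7.3),
    4: (4.6, 4.95),
    8: (2.7, 2.9),
    16: (1.75, 1.76),
    32: (1.53, 1.55),
    64: (1.56, 1.58),
}


def midpoint(bounds):
    """Return the midpoint of a (low, high) range."""
    low, high = bounds
    return (low + high) / 2


def speedups(elapsed):
    """Map thread count to speedup over the single-threaded midpoint."""
    base = midpoint(elapsed[1])
    return {threads: base / midpoint(bounds)
            for threads, bounds in sorted(elapsed.items())}


if __name__ == "__main__":
    for threads, s in speedups(ELAPSED).items():
        print("%2d threads: %.2fx speedup, %.0f%% efficiency"
              % (threads, s, 100 * s / threads))
```

With these midpoints the speedup levels off at roughly 9x by 32 threads, and 64 threads is no faster than 32.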

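Later, with the following parameters, performance was essentially the same as in the table above, with no improvement. Memory usage exceeded 14GB and the eatmem program was killed, so it appears that options.OptimizeForPointLookup cannot be enabled in this setup: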

   options.IncreaseParallelism();
   options.OptimizeForPointLookup(8);
   options.max_open_files = -1;

Summary

leveldb

Test command

With tuned options:

Threads  Memory       Elapsed          CPU utilization
1        800MB-850MB  about 5.3s-5.7s  15%-20%
8        900MB-1GB    about 1s-1.3s    20%-30%
16       900MB-1GB    about 1s         about 30%-35%

Summary

Conclusion

TODO


This article was written jointly by Ran Xiaolong (冉小龙) and Qi Wei (齐巍) of the copernicus team.
