A Linux Performance Investigation

2017-08-18  有效栈

Our service runs on six Tencent Cloud machines. One day I happened to log in to one of them and ran vmstat to check CPU and memory usage:

[root@VM_26_210_centos ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 229160 399080 3666708    0    0     0    10    0    0  1  0 99  0  0
[root@VM_26_210_centos ~]# vmstat 2 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 230096 399080 3666820    0    0     0    10    0    0  1  0 99  0  0
[root@VM_26_210_centos ~]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 229880 399080 3666840    0    0     0    10    0    0  1  0 99  0  0
 0  0  41572 229748 399080 3666840    0    0     0    28  791 1221  1  0 99  0  0
 0  0  41572 229616 399080 3666840    0    0     0     0  895 1305  1  1 98  0  0
 0  0  41572 229368 399080 3666840    0    0     0  4542  801 1294  0  0 98  1  0
 0  0  41572 229376 399080 3666848    0    0     0    20  811 1251  1  1 99  0  0
 0  0  41572 229384 399080 3666848    0    0     0     0  745 1206  0  1 99  0  0
 0  0  41572 229376 399080 3666848    0    0     0   110  831 1298  1  0 99  0  0
 0  0  41572 229616 399080 3666852    0    0     0     0 1741 2634  2  1 97  0  0
 0  0  41572 229624 399080 3666852    0    0     0     4  769 1255  1  0 99  0  0
 

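This gave me a scare: the server has 4 cores and 8 GB of memory, yet vmstat showed only a little over 200 MB free. My first reaction was that the machine was running out of memory.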

Meanwhile, the monitoring page on Tencent Cloud looked like this:

[Image: 腾讯云.jpg]

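The Tencent Cloud monitoring showed memory usage at only 50%, which puzzled me. Something had to be off somewhere, so I ran top to check the current state of the machine: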

[root@VM_26_210_centos ~]# top
top - 13:32:02 up 659 days,  3:06,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 136 total,   1 running, 135 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.9%us,  1.1%sy,  0.0%ni, 97.8%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8059448k total,  7826428k used,   233020k free,   399080k buffers
Swap:  2097144k total,    41572k used,  2055572k free,  3668692k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                         
22539 root      20   0 8318m 1.8g  14m S  1.7 23.9 452:18.77 java                                                                                                                                                             
20117 root      20   0  7168 6332  660 S  0.7  0.1 179:56.95 sap1002                                                                                                                                                          
  873 root      20   0  246m 5476  812 S  0.3  0.1   6:46.02 rsyslogd                                                                                                                                                         
10618 root      20   0 37868  17m  984 S  0.3  0.2 153:26.54 secu-tcs-agent                                                                                                                                                   
14448 root      20   0 5590m 477m  12m S  0.3  6.1 107:43.34 java                                                                                                                                                             
16980 root      20   0 39016  22m 5576 S  0.3  0.3 293:10.41 sap1009                                                                                                                                                          
17857 root      20   0 4384m 452m  11m S  0.3  5.8 538:17.84 java                                                                                                                                                             
22349 root      20   0 5569m 467m  12m S  0.3  5.9  84:30.18 java                                                                                                                                                             
27931 root      20   0  427m  13m 2084 S  0.3  0.2 325:52.53 barad_agent                                                                                                                                                      
29121 root      20   0 33468  15m 1052 S  0.3  0.2  83:37.27 sap1005                                                                                                                                                          
    1 root      20   0 19356  932  716 S  0.0  0.0   2:21.78 init                                                                                                                                                             
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd                                                                                                                                                         
    3 root      RT   0     0    0    0 S  0.0  0.0   3:29.40 migration/0                                                                                                                                                      
    4 root      20   0     0    0    0 S  0.0  0.0   4:45.83 ksoftirqd/0                                                                                                                                                      
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0                                                                                                                                                      
    6 root      RT   0     0    0    0 S  0.0  0.0   1:13.64 watchdog/0                                                                                                                                                       
    7 root      RT   0     0    0    0 S  0.0  0.0   3:21.08 migration/1                                                                                                                                                      
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1                                                                                                                                                      
    9 root      20   0     0    0    0 S  0.0  0.0   4:10.62 ksoftirqd/1                                                                                                                                                      
   10 root      RT   0     0    0    0 S  0.0  0.0   0:58.95 watchdog/1                                                                                                                                                       
   11 root      RT   0     0    0    0 S  0.0  0.0   3:07.85 migration/2                                                                                                                                                      
   12 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2                                                                                                                                                      
   13 root      20   0     0    0    0 S  0.0  0.0   4:19.19 ksoftirqd/2                                                                                                                                                      
   14 root      RT   0     0    0    0 S  0.0  0.0   1:00.61 watchdog/2                                                                                                                                                       
   15 root      RT   0     0    0    0 S  0.0  0.0   3:06.14 migration/3                                                                                                                                                      
   16 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/3                                                                                                                                                      
   17 root      20   0     0    0    0 S  0.0  0.0   5:30.66 ksoftirqd/3                                                                                                                                                      
   18 root      RT   0     0    0    0 S  0.0  0.0   1:00.14 watchdog/3                                                                                                                                                       
   19 root      20   0     0    0    0 S  0.0  0.0  26:36.90 events/0                                                                                                                                                         
   20 root      20   0     0    0    0 S  0.0  0.0  26:37.71 events/1                                                                                                                                                         
   21 root      20   0     0    0    0 S  0.0  0.0  33:52.49 events/2                                                                                                                                                         
   22 root      20   0     0    0    0 S  0.0  0.0  37:57.76 events/3                                                                                                                                                         
   23 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cgroup                                                                                                                                                           
   24 root      20   0     0    0    0 S  0.0  0.0   0:11.72 khelper      

The Mem line still shows only a little over 200 MB of free memory. Next I looked at memory alone:

[root@VM_26_210_centos ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7870       7625        245          0        389       3575
-/+ buffers/cache:       3660       4210
Swap:         2047         40       2007

At that point I was convinced that the Tencent Cloud monitoring had to be wrong.

So I phoned Tencent Cloud. After an afternoon of back and forth, they replied that their memory usage figure does not count buffers and cache.

So what exactly are buffer and cache in vmstat?
Here I quote directly from this blog post, http://www.cnblogs.com/chenshoubiao/p/4796664.html:

A buffer is something that has yet to be "written" to disk.
A cache is something that has been "read" from the disk and stored for later use.
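In other words, buffers hold data that is about to be written to disk (a block device), while cache holds data that has been read from disk. Both exist to improve I/O performance and are managed by the OS. The kernel can reclaim both when applications need memory, which is why it is reasonable not to count them as used.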

So in the vmstat output, buffers take up a bit over 300 MB (399080 KB) and the data cached from disk takes up about 3.5 GB (3666840 KB).

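That puts the memory really in use at a little under 4 GB: the "-/+ buffers/cache" line of free -m reports 3660 MB used, about 47% of the 7870 MB total, which is consistent with the roughly 50% shown by the Tencent Cloud monitoring. Summing the RES column in top gives about 3.5 GB.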
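The arithmetic above can be reproduced from the free -m output itself. The following is a minimal Python sketch, assuming the old-style free layout shown earlier (with separate buffers and cached columns); the function name parse_mem_line and the variable names are my own and are not part of any tool.

```python
# Sample text copied from the `free -m` output shown earlier in this post.
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:          7870       7625        245          0        389       3575
-/+ buffers/cache:       3660       4210
Swap:         2047         40       2007
"""


def parse_mem_line(text):
    """Return the Mem: row of old-style `free` output as a dict of ints (MB).

    Hypothetical helper for illustration; it assumes the first line is the
    column header and that a line starting with "Mem:" follows it.
    """
    lines = text.splitlines()
    names = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(names, values))
    raise ValueError("no Mem: line found")


mem = parse_mem_line(FREE_OUTPUT)

# Memory held by applications = used minus the reclaimable buffers and cache.
app_used = mem["used"] - mem["buffers"] - mem["cached"]
app_used_pct = 100.0 * app_used / mem["total"]

print(f"used by applications: {app_used} MB "
      f"({app_used_pct:.1f}% of {mem['total']} MB)")
# prints "used by applications: 3661 MB (46.5% of 7870 MB)"
```

The result differs by 1 MB from the 3660 MB that free prints, because free -m rounds each column to whole megabytes before displaying it.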

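At this point I was worried about two things:

The first is swapping. In vmstat, si is the amount of memory swapped in from disk per second and so is the amount of memory swapped out to disk per second. If either stays above 0, physical memory is running short or some process is leaking memory, and the memory-hungry process needs to be tracked down. On this machine both are 0, so no swapping is taking place. This matters because the JVM has to scan the heap during garbage collection; if parts of the heap have been swapped out, garbage collection performance drops sharply.
The second is garbage collection itself. Printing the GC statistics every two seconds with jstat shows that garbage collection is also normal: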

[root@VM_26_210_centos ~]# jstat -gccause  22539 2000 
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT    LGCC                 GCC                 
 87.12   0.00  91.04  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  91.39  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.45  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.57  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.58  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.65  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.83  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.84  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.84  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.86  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  92.92  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.07  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.08  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.09  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.58  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  93.65  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  94.36  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
 87.12   0.00  94.37  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC               
  0.00  84.94  40.97  57.69  95.37  91.61  16913 1148.072     5    1.371 1149.443 Allocation Failure   No GC        
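To put a number on "normal", the cumulative counters in the jstat output can be turned into average pause times. Below is a rough Python sketch, assuming the column layout shown above, where YGC and FGC are the young and full collection counts and YGCT and FGCT are the corresponding cumulative times in seconds. The helper name parse_rows is hypothetical.

```python
# Header and two rows copied from the `jstat -gccause 22539 2000` output above.
JSTAT_OUTPUT = """\
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT    LGCC                 GCC
 87.12   0.00  94.37  57.69  95.37  91.61  16912 1148.004     5    1.371 1149.375 Allocation Failure   No GC
  0.00  84.94  40.97  57.69  95.37  91.61  16913 1148.072     5    1.371 1149.443 Allocation Failure   No GC
"""

NUMERIC_COLUMNS = 11  # S0 .. GCT; LGCC and GCC are free text and are skipped


def parse_rows(text):
    """Parse the numeric columns of jstat -gccause output (illustrative helper)."""
    lines = text.splitlines()
    names = lines[0].split()[:NUMERIC_COLUMNS]
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < NUMERIC_COLUMNS:
            continue  # skip blank or truncated lines
        values = [float(f) for f in fields[:NUMERIC_COLUMNS]]
        rows.append(dict(zip(names, values)))
    return rows


rows = parse_rows(JSTAT_OUTPUT)
first, last = rows[0], rows[-1]

# Averages over the whole lifetime of the JVM, in milliseconds.
avg_young_ms = 1000.0 * last["YGCT"] / last["YGC"]
avg_full_ms = 1000.0 * last["FGCT"] / last["FGC"]

# Pause of the single young collection that happened between the two samples.
last_pause_ms = 1000.0 * (last["YGCT"] - first["YGCT"])

print(f"average young GC pause: {avg_young_ms:.1f} ms")
print(f"average full GC pause:  {avg_full_ms:.1f} ms")
print(f"latest young GC pause:  {last_pause_ms:.0f} ms")
```

With these samples the young collections average just under 70 ms each and only 5 full collections have occurred, which fits the reading that GC is healthy on this process.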

http://www.cnblogs.com/kevingrace/p/5991604.html

http://blog.sina.com.cn/s/blog_9c6f23fb0102x1fg.html

From the operating system's point of view, which factors affect JVM performance?

1. Page swapping
2. Context switching
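Both factors can be read from the vmstat output shown at the start of this post: si and so for page swapping, and cs for context switches per second. Here is a small Python sketch, assuming the column layout printed by the vmstat on this machine; the sample rows are copied from the output above, and the function name parse_vmstat is made up for illustration.

```python
# Header and three sample rows copied from the `vmstat 2` output above.
VMSTAT_OUTPUT = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  41572 229748 399080 3666840    0    0     0    28  791 1221  1  0 99  0  0
 0  0  41572 229616 399080 3666840    0    0     0     0  895 1305  1  1 98  0  0
 0  0  41572 229616 399080 3666852    0    0     0     0 1741 2634  2  1 97  0  0
"""


def parse_vmstat(text):
    """Turn vmstat output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    # The column-name line is the one whose first field is "r".
    header_idx = next(i for i, line in enumerate(lines)
                      if line.split()[0] == "r")
    names = lines[header_idx].split()
    return [dict(zip(names, map(int, line.split())))
            for line in lines[header_idx + 1:]]


rows = parse_vmstat(VMSTAT_OUTPUT)

# Any non-zero si/so sample means pages are moving between RAM and swap.
swapping = any(r["si"] > 0 or r["so"] > 0 for r in rows)
max_cs = max(r["cs"] for r in rows)

print(f"swapping observed: {swapping}")
print(f"peak context switches per second: {max_cs}")
# prints "swapping observed: False" and
# "peak context switches per second: 2634"
```

When run without arguments, vmstat prints a single line of averages since boot, so a check like this is more meaningful on the interval samples from vmstat 2.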
