Common Commands for Troubleshooting Java in Production
2017-12-09
落日无风
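Urgent problems in a production environment are unavoidable once a system goes live. In general, two kinds of data need to be captured:
- operating system data;
- Java runtime data.
This article lists the commands commonly used to collect them.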
Command List
Operating System
When a production problem occurs, look at the operating system data first.
free
[deploy@perf-jesse-01w logs]$ free -m
             total       used       free     shared    buffers     cached
Mem:          2006       1990         15          0         24        476
-/+ buffers/cache:       1489        517
Swap:         1023        165        858
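'free -m' reports memory usage in megabytes. Focus on the free and cached columns. The system has about 2 GB of memory in total; buffers use 24 MB and the page cache uses 476 MB, so 517 MB is available to applications (the free column of the "-/+ buffers/cache" row, roughly free + buffers + cached). 165 MB of swap is in use and 858 MB remains.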
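As a rule of thumb, let M = memory available to applications / total physical memory. M > 70% means memory is plentiful; 20% < M < 70% means memory is usable; M < 20% means memory is tight. Here M = 517 / 2006, about 26%, which is only just above the 20% threshold, and swap is already in use, so memory on this machine is close to tight.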
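The rule of thumb above can be sketched as a small shell helper. This is only an illustration: the function names mem_ratio and classify_mem are hypothetical, and the parser assumes the older 'free -m' layout shown above, with a "-/+ buffers/cache" row (newer versions of free print an "available" column instead). The source does not say how to treat exactly 20% or 70%, so the sketch treats both as "usable".

```shell
# Hypothetical helpers, not part of any standard tool.

# mem_ratio: reads old-style `free -m` output on stdin and prints
# M = (memory available to applications) * 100 / (total memory), as an integer.
mem_ratio() {
  awk '
    $1 == "Mem:" { total = $2 }   # total physical memory
    $1 == "-/+"  { avail = $4 }   # free column of the "-/+ buffers/cache" row
    END {
      if (total <= 0) exit 1      # no usable Mem: row found
      printf "%d\n", int(avail * 100 / total)
    }'
}

# classify_mem: applies the 70% / 20% thresholds from the text.
# Assumption: the exact boundary values count as "usable".
classify_mem() {
  if [ "$1" -gt 70 ]; then
    echo "sufficient"
  elif [ "$1" -ge 20 ]; then
    echo "usable"
  else
    echo "tight"
  fi
}

# Possible usage on a machine with the old free layout:
#   classify_mem "$(free -m | mem_ratio)"
```

For the sample output above, mem_ratio prints 25 (517 * 100 / 2006, truncated), which classify_mem reports as "usable".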
vmstat
[deploy@perf-jesse-01w logs]$ vmstat 2 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0 169708  16228  26564 488576    0    0    13     3    0    1  1  0 99  0  0
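- procs
  - r: number of runnable processes (running or waiting for CPU time)
  - b: number of processes blocked waiting for resources such as I/O
- memory
  - swpd: amount of memory swapped out to the swap area (KB)
  - free: amount of idle physical memory (KB)
  - buff: amount of memory used as buffers (KB)
  - cache: amount of memory used as page cache (KB); the larger the value, the more file data is cached in memory
- swap
  - si: amount of memory swapped in from the swap area to memory, per second
  - so: amount of memory swapped out from memory to the swap area, per second
- io
  - bi: data read from disk (KB/s)
  - bo: data written to disk (KB/s)
  As a rule of thumb, the reference value for bi + bo is 1000; if it exceeds 1000 and wa (the share of CPU time spent waiting for I/O) is also high, disk I/O is likely a problem.
- system
  - in: device interrupts per second
  - cs: context switches per second
- cpu
  - us: percentage of CPU time consumed by user processes
  - sy: percentage of CPU time consumed by the kernel
  As a rule of thumb, us + sy > 80% indicates that the CPU is under pressure.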
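The two vmstat rules of thumb can be applied mechanically to the last data line of the output. The sketch below assumes the 17-column layout shown above (bi, bo, us, sy and wa in columns 9, 10, 13, 14 and 16). The function name vmstat_check is hypothetical, and because the text does not say what counts as a "high" wa, the default limit of 20 is purely an assumption that can be overridden with the first argument.

```shell
# Hypothetical helper: reads vmstat output on stdin and applies the
# rules of thumb from the text to the last numeric data line.
# $1 (optional): wa limit in percent; the default of 20 is an assumption.
vmstat_check() {
  awk -v wa_limit="${1:-20}" '
    $1 ~ /^[0-9]+$/ {             # data lines start with the numeric r column
      bi = $9; bo = $10; us = $13; sy = $14; wa = $16
      seen = 1
    }
    END {
      if (!seen) exit 1           # no data line found
      if (us + sy > 80) {
        print "cpu: tight"
      } else {
        print "cpu: ok"
      }
      if (bi + bo > 1000 && wa > wa_limit) {
        print "io: suspect"
      } else {
        print "io: ok"
      }
    }'
}

# Possible usage (two samples, two seconds apart):
#   vmstat 2 2 | vmstat_check
```

The first report vmstat prints contains averages since boot, so taking a second sample and judging only the last line, as the usage comment does, gives a view of the current load.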
iostat
ping
$ ping www.sina.com.cn
PING spool.grid.sinaedge.com (218.30.66.248): 56 data bytes
64 bytes from 218.30.66.248: icmp_seq=0 ttl=55 time=18.282 ms
64 bytes from 218.30.66.248: icmp_seq=1 ttl=55 time=18.430 ms
64 bytes from 218.30.66.248: icmp_seq=2 ttl=55 time=19.967 ms
Java Runtime
Next, inspect the JVM-related data.
jstack
Jesse-4:~$ jstack -l 11745
2017-12-09 20:33:01
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.31-b07 mixed mode):

"Attach Listener" #18 daemon prio=9 os_prio=31 tid=0x00007fc66c81d800 nid=0x4107 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Timer-0" #16 prio=5 os_prio=31 tid=0x00007fc66dfc8000 nid=0x6903 in Object.wait() [0x000000011de3e000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000007ab0fa120> (a java.util.TaskQueue)
        at java.lang.Object.wait(Object.java:502)
        at java.util.TimerThread.mainLoop(Timer.java:526)
        - locked <0x00000007ab0fa120> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:505)

   Locked ownable synchronizers:
        - None

"GC Daemon" #15 daemon prio=2 os_prio=31 tid=0x00007fc66aa19800 nid=0x6703 in Object.wait() [0x000000011d810000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000007aafbc188> (a sun.misc.GC$LatencyLock)
        at sun.misc.GC$Daemon.run(GC.java:117)
        - locked <0x00000007aafbc188> (a sun.misc.GC$LatencyLock)

   Locked ownable synchronizers:
        - None

"RMI TCP Accept-0" #14 daemon prio=5 os_prio=31 tid=0x00007fc669219000 nid=0x6303 runnable [0x000000011ca84000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:404)
        at java.net.ServerSocket.implAccept(ServerSocket.java:545)
        at java.net.ServerSocket.accept(ServerSocket.java:513)
        at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:400)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:372)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None

"RMI TCP Accept-9430" #13 daemon prio=5 os_prio=31 tid=0x00007fc66a320800 nid=0x6103 runnable [0x000000011c981000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:404)
        at java.net.ServerSocket.implAccept(ServerSocket.java:545)
        at java.net.ServerSocket.accept(ServerSocket.java:513)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:400)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:372)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None

"RMI TCP Accept-0" #12 daemon prio=5 os_prio=31 tid=0x00007fc66a320000 nid=0x6007 runnable [0x000000011c87e000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:404)
        at java.net.ServerSocket.implAccept(ServerSocket.java:545)
        at java.net.ServerSocket.accept(ServerSocket.java:513)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:400)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:372)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None

"Service Thread" #10 daemon prio=9 os_prio=31 tid=0x00007fc66a034000 nid=0x5903 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None
We use jstack to inspect the threads of the application. If threads are deadlocked, the output of this command shows it.
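A full thread dump is long, so a quick summary often helps before reading it line by line. The sketch below counts threads per state and reports whether the dump mentions a deadlock. The function name thread_state_summary is hypothetical; the deadlock check assumes that HotSpot's jstack marks a detected deadlock with a line containing "Java-level deadlock", which is how its output normally reads.

```shell
# Hypothetical helper: reads a jstack thread dump on stdin, prints one
# "STATE count" line per thread state (in no particular order), then a
# final line saying whether a deadlock report was found in the dump.
thread_state_summary() {
  awk '
    /java\.lang\.Thread\.State:/ { count[$2]++ }  # $2 is e.g. RUNNABLE, WAITING
    /Java-level deadlock/        { deadlock = 1 } # assumed jstack deadlock marker
    END {
      for (s in count) print s, count[s]
      if (deadlock) {
        print "deadlock: yes"
      } else {
        print "deadlock: no"
      }
    }'
}

# Possible usage, with the process ID from the example above:
#   jstack -l 11745 | thread_state_summary
```

A large number of BLOCKED threads, or a "deadlock: yes" line, is a signal to go back to the full dump and read the stacks of the threads involved.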
jmap
jmap -dump:format=b,file=/tmp/heap.hprof $pid
This command writes a heap dump of the Java process with ID $pid to /tmp/heap.hprof for offline analysis.
jstat
[deploy@perf-jesse-01w tmp]$ /usr/java/default/bin/jstat -gccause 18912
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT    LGCC                 GCC
  0.00  50.00  58.22  88.69  67.44  39692   74.373    10   11.770   86.143 unknown GCCause      No GC
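- YGC: number of minor (young generation) GCs
- YGCT: total time spent in minor GCs, in seconds
- FGC: number of full GCs
- FGCT: total time spent in full GCs, in seconds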
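Counts and total times are easier to judge as average pause times. The following sketch derives them from the four columns described above. The function name gc_avg_pause is hypothetical, and the helper assumes its input is exactly what jstat prints: one header line followed by one data line. Columns are located by header name, since the set of columns differs between JDK versions.

```shell
# Hypothetical helper: reads `jstat -gccause` output (header line plus one
# data line) on stdin and prints the average minor and full GC pause in ms.
gc_avg_pause() {
  awk '
    NR == 1 {                                  # header: remember column positions
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    NR == 2 {
      if (!("YGC" in col) || !("YGCT" in col) || !("FGC" in col) || !("FGCT" in col)) exit 1
      ygc = $(col["YGC"]); ygct = $(col["YGCT"])
      fgc = $(col["FGC"]); fgct = $(col["FGCT"])
      # jstat reports times in seconds; convert to milliseconds per collection
      if (ygc > 0) printf "young_avg_ms %.2f\n", ygct * 1000 / ygc
      if (fgc > 0) printf "full_avg_ms %.2f\n", fgct * 1000 / fgc
    }'
}

# Possible usage, with the process ID from the example above:
#   jstat -gccause 18912 | gc_avg_pause
```

For the sample above this gives about 1.87 ms per minor GC (74.373 s over 39692 collections) and 1177 ms per full GC (11.770 s over 10 collections), so the full GCs are the pauses worth investigating.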
jinfo
Jesse-4:tools$ jinfo 11745
Attaching to process ID 11745, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.31-b07
Java System Properties:
java.vendor = Oracle Corporation
sun.java.launcher = SUN_STANDARD
etty.maxThreads = 200
java.vm.specification.vendor = Oracle Corporation
java.runtime.version = 1.8.0_31-b13
jinfo prints the Java system properties and the JVM flags of the running process.
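When only one property matters, it can be picked out of the "name = value" lines shown above. This is a minimal sketch: the function name jinfo_prop is hypothetical, and it assumes each property occupies a single line in that form.

```shell
# Hypothetical helper: reads jinfo output on stdin and prints the value of
# the system property named in $1. Exits non-zero if it is not found.
jinfo_prop() {
  awk -v key="$1" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)                 # tolerate leading whitespace
      pos = index(line, " = ")                 # split at the first " = "
      if (pos > 0 && substr(line, 1, pos - 1) == key) {
        print substr(line, pos + 3)
        found = 1
        exit
      }
    }
    END { exit !found }'
}

# Possible usage, with the process ID from the example above:
#   jinfo 11745 | jinfo_prop java.runtime.version
```

Matching on the exact property name rather than with a plain grep avoids picking up other properties whose names merely contain the same text.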