从堆栈角度看线程状态(7)

2018-08-01  本文已影响0人  hxj688699

异常信息

见过很多这样的人,写代码、调式代码的过程中,程序一旦出现异常,就很着急的去清空日志输出控制台,清空的异常日志信息,然后尝试各种重启方式。异常一直出现就开始嘀咕了,刚才还好好的啊、以前没出现过这问题啊......
没有仔细认真的看异常是很多新手容易犯的错误,不少老手也会存在这样的问题,不知道你是否躺枪。其实,认真的看异常问题就解决了一半了。异常信息是我们排查问题的方向。对于没有异常的、或是把异常吃掉的这种程序问题,问题定位、排查是相对困难的,至少问题定位不会那么直接,可能会把整个业务条线、甚至对整个系统体系进行分析才能摸到问题产生的原因。
可见异常信息对问题的排查提供了可靠的依据,异常信息反应的是程序调用的整个过程,调了哪个方法、是哪个实现类、卡在什么地方了......
异常信息是线程执行过程中出现异常时,输出的线程堆栈信息,下面我们看看线程各种状态对应的堆栈信息。

线程堆栈信息解读

堆栈状态WAITING与TIMED_WAITING:

调用以下三个方法会进入WAITING状态:

调用下面的方法会进入TIMED_WAITING状态:

RUNNABLE和BLOCKED状态

堆栈信息中不存在NEW和TERMINATED状态

看代码示例:

package net.zkbc.p2p;

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/**
 * @author HXJ
 * @date 2018/8/1
 */
public class ThreadStack {
    public static void main(String[] args) {
        final Object blocker = new Object();
        //运行状态的线程
        Thread threadRun = new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    //do something ...
                }
            }
        }, "Thread-run");

        //同步阻塞状态的线程
        Thread threadBlock = new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println(Thread.currentThread().getName() + " 同步锁阻塞 !");
                synchronized (blocker) {
                    System.out.println(Thread.currentThread().getName() + " 取得同步锁 run ... !");
                }
            }
        }, "Thread-block");

        //wait等待
        Thread threadWait = new Thread(new Runnable() {
            @Override
            public void run() {
                synchronized (this) {
                    System.out.println(Thread.currentThread().getName() + " wait ... !");
                    try {
                        this.wait();
                    } catch (InterruptedException e) {
                    }
                }
            }
        }, "Thread-wait");

        //wait timeout 等待
        Thread threadWaitTimeout = new Thread(new Runnable() {
            @Override
            public void run() {
                synchronized (this) {
                    System.out.println(Thread.currentThread().getName() + " wait ... !");
                    try {
                        this.wait(10 * 1000 * 60);
                    } catch (InterruptedException e) {
                    }
                }
            }
        }, "Thread-wait-timeout");

        //sleep 等待
        Thread threadSleepWaitTimeout = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    System.out.println(Thread.currentThread().getName() + " wait ... !");
                    Thread.sleep(10 * 1000 * 60);
                } catch (InterruptedException e) {
                }
            }
        }, "Thread-sleep-wait-timeout");

        //lock.park 等待 为了显示相对应阻塞的堆栈信息
        Lock lock = new ReentrantLock();
        //lock获得锁后执行的线程
        Thread lockParkRun = new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println(Thread.currentThread().getName() + " lock-park run...");
                lock.lock();
                while (true) {
                    //do something ...
                }
            }
        }, "Thread-lock.park-run");
        //lock失败后park()阻塞的线程
        Thread lockPark = new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println(Thread.currentThread().getName() + " lock-park park ...");
                lock.lock();
            }
        }, "Thread-lock.park-park");

        try {
            synchronized (blocker) {
                System.out.println(threadRun.getName() + " 新建状态:" +threadRun.getState());
                threadRun.start();
                threadBlock.start();
                threadWait.start();
                threadWaitTimeout.start();
                threadSleepWaitTimeout.start();
                lockParkRun.start();
                lockParkRun.join(100);
                lockPark.start();
                //主线程不释放锁,并且通过等待阻塞线程的方式进行阻塞
                threadBlock.join();
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

运行结果:

Thread-run 新建状态:NEW
Thread-block 同步锁阻塞 !
Thread-wait wait ... !
Thread-wait-timeout wait ... !
Thread-sleep-wait-timeout wait ... !
Thread-lock.park-run lock-park run...
Thread-lock.park-park lock-park park ...

代码中threadRun线程在start()之前,状态是NEW,启动后在堆栈信息中为RUNNABLE状态。threadRun是个空死循环,测试代码的时候注意自己的电脑哦!
主线程创建完其他线程后,获取blocker对象上的同步锁,然后分别启动其他线程。threadBlock线程启动后,获取blocker对象上的同步锁,由于该锁已经被主线程持有,所有进入BLOCKED阻塞状态。其他线程进入相应状态,最后避免主线程退出,直接让它等待在threadBlock线程监视器上,由于threadBlock无法获得同步锁而无法执行,至此在其他超时等待的线程超时前有如下堆栈信息(部分):

"Thread-lock.park-park" #18 prio=5 os_prio=0 tid=0x00000000201a1800 nid=0x70f0 waiting on condition [0x0000000020fcf000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076bdf6070> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at net.zkbc.p2p.ThreadStack$7.run(ThreadStack.java:90)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"Thread-lock.park-run" #17 prio=5 os_prio=0 tid=0x000000002018c000 nid=0x7740 runnable [0x0000000020ecf000]
   java.lang.Thread.State: RUNNABLE
    at net.zkbc.p2p.ThreadStack$6.run(ThreadStack.java:81)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - <0x000000076bdf6070> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

"Thread-sleep-wait-timeout" #16 prio=5 os_prio=0 tid=0x000000002018b000 nid=0x7a78 waiting on condition [0x0000000020dcf000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at net.zkbc.p2p.ThreadStack$5.run(ThreadStack.java:68)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"Thread-wait-timeout" #15 prio=5 os_prio=0 tid=0x0000000020188800 nid=0x62d4 in Object.wait() [0x0000000020cce000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000076bded8b8> (a net.zkbc.p2p.ThreadStack$4)
    at net.zkbc.p2p.ThreadStack$4.run(ThreadStack.java:55)
    - locked <0x000000076bded8b8> (a net.zkbc.p2p.ThreadStack$4)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"Thread-wait" #14 prio=5 os_prio=0 tid=0x0000000020187800 nid=0x759c in Object.wait() [0x0000000020bcf000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000076bdead90> (a net.zkbc.p2p.ThreadStack$3)
    at java.lang.Object.wait(Object.java:502)
    at net.zkbc.p2p.ThreadStack$3.run(ThreadStack.java:41)
    - locked <0x000000076bdead90> (a net.zkbc.p2p.ThreadStack$3)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"Thread-block" #13 prio=5 os_prio=0 tid=0x0000000020150800 nid=0x4f70 waiting for monitor entry [0x0000000020acf000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at net.zkbc.p2p.ThreadStack$2.run(ThreadStack.java:29)
    - waiting to lock <0x000000076bb53ef0> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"Thread-run" #12 prio=5 os_prio=0 tid=0x000000002014f800 nid=0x4f04 runnable [0x00000000209cf000]
   java.lang.Thread.State: RUNNABLE
    at net.zkbc.p2p.ThreadStack$1.run(ThreadStack.java:17)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

"main" #1 prio=5 os_prio=0 tid=0x00000000036ac800 nid=0x779c in Object.wait() [0x000000000369f000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000076bb56140> (a java.lang.Thread)
    at java.lang.Thread.join(Thread.java:1252)
    - locked <0x000000076bb56140> (a java.lang.Thread)
    at java.lang.Thread.join(Thread.java:1326)
    at net.zkbc.p2p.ThreadStack.main(ThreadStack.java:106)
    - locked <0x000000076bb53ef0> (a java.lang.Object)

   Locked ownable synchronizers:
    - None

可以通过jstack命令获取堆栈信息文件,如:jstack -l pid > stack.log
我们可以清晰的从堆栈信息中看到每个线程当前的状态,新建、终止状态不在这里体现,也清晰的反应出程序调用的每一个过程。用好堆栈信息,很多问题便可迎刃而解!

Waiting for monitor entry 和 in Object.wait()

这两种阻塞情况是基于Monitor实现的,Monitor是Java中用以实现线程之间的互斥与协作的主要手段,它可以看成是对象或者 Class的锁。每一个对象有且仅有一个 monitor。如图:



从图中可以看出,每个 Monitor在任何时刻,只能被一个线程拥有,该线程就是 “Active Thread”,而其它线程都是 “Waiting Thread”,分别在两个队列 “ Entry Set”和 “Wait Set”里面等候。在 “Entry Set”中等待的线程状态是 “Waiting for monitorentry”,而在 “Wait Set”中等待的线程状态是“in Object.wait()”。
通过使用关键字synchronized来使用监视器,synchronized保护起来的代码段为临界区。当一个线程申请进入临界区时,它就进入了 “Entry Set”队列。如:

synchronized(obj){ 
    //临界区 do something ...
} 

当线程获得了 Monitor,进入了临界区之后,如果发现线程继续运行的条件没有满足,它则调用对象(一般就是被 synchronized 的对象)的 wait() 方法,放弃了 Monitor,进入 “Wait Set”队列。只有当别的线程在该对象上调用了 notify() 或者 notifyAll() , “ Wait Set”队列中线程才得到机会去竞争,但是只有一个线程获得对象的Monitor,恢复到运行态。

synchronized(obj){ 
     ...
    //未满足执行条件 wait ... 等待obj.notify();
    obj.wait();
} 

waiting on condition

sleep()、LockSupport.park()都属于这种阻塞,等待条件满足。

线程资源(cpu、内存)占用高案例分析

示例:
在linux环境下,通过命令:top -Hp pid
获得如图:


取将十进制16275对应的线程pid转换为16进制为3F93
用命令printf或计算器:
命令行:
printf '%x \n' PID
计算器:
进制转换
堆栈查找到对应nid=0x3f93的线程:
...

"Apollo-ReleaseMessageServiceWithCache-1" #25 daemon prio=5 os_prio=0 tid=0x00007fc4e4009800 nid=0x3f94 waiting on condition [0x00007fc51cca2000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000d33338d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None

"Apollo-AppNamespaceServiceWithCache-1" #24 daemon prio=5 os_prio=0 tid=0x00007fc4e44b9800 nid=0x3f93 runnable [0x00007fc51cce3000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000d33325e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None

"Apollo-DatabaseMessageSender-1" #23 daemon prio=5 os_prio=0 tid=0x00007fc4e4b01800 nid=0x3f92 sleeping[0x00007fc51cd24000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at java.lang.Thread.sleep(Thread.java:340)
    at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386)
    at com.ctrip.framework.apollo.biz.message.DatabaseMessageSender.lambda$initialize$0(DatabaseMessageSender.java:78)
    at com.ctrip.framework.apollo.biz.message.DatabaseMessageSender$$Lambda$2/1620636712.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

...

找到对应线程后,进而查看对应线程的具体堆栈信息。

上一篇下一篇

猜你喜欢

热点阅读