51 多种技巧完成线程死锁的定位与排查
Diego38 17 1

1. 前言

从本节开始学习新的一章,多线程问题定位和诊断。

日常我们编写完代码,需要借助一些工具来完成测试和监控,线上出现问题可以通过工具去诊断和排查,本节就从死锁入手来熟悉一些基本的工具。

2. 通过JDK命令Jstack定位死锁

Jstack是定位线程问题最常见的命令,它可以将线程状态信息输出到控制台上,此过程称作线程dump。

使用方式如下:

jstack [-l] pid

构造死锁代码来演示下:

public class DeadLockTest {

    public static void main(String[] args) {
        Lock lock1 = new ReentrantLock();
        Lock lock2 = new ReentrantLock();

        ExecutorService fixedThreadPool = Executors.newFixedThreadPool(2);
        fixedThreadPool.submit(() -> {
           lock1.lock();
           try {
               Thread.sleep(1000);
           } catch (InterruptedException e) {}
           lock2.lock();
           try {


           }finally {
               lock2.unlock();
               lock1.unlock();
           }
        });

        fixedThreadPool.submit(() -> {
            lock2.lock();
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {}
            lock1.lock();
            try {


            }finally {
                lock1.unlock();
                lock2.unlock();
            }
        });
    }
}

先通过jps命令找到进程id,然后执行jstack -l

Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.201-b09 mixed mode):

"Attach Listener" #14 daemon prio=9 os_prio=31 tid=0x00007fedd4897000 nid=0x5903 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" #13 prio=5 os_prio=31 tid=0x00007fedd0994800 nid=0x1003 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"pool-1-thread-2" #12 prio=5 os_prio=31 tid=0x00007fedd0993800 nid=0xa903 waiting on condition [0x000070000be83000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076ad4abe0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at com.mooc.study.thread.DeadLockTest.lambda$main$1(DeadLockTest.java:35)
    at com.mooc.study.thread.DeadLockTest$$Lambda$2/1598924227.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-1" #11 prio=5 os_prio=31 tid=0x00007fedd0993000 nid=0x5703 waiting on condition [0x000070000bd80000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076ad4ac10> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at com.mooc.study.thread.DeadLockTest.lambda$main$0(DeadLockTest.java:20)
    at com.mooc.study.thread.DeadLockTest$$Lambda$1/1555093762.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"Service Thread" #10 daemon prio=9 os_prio=31 tid=0x00007fedd4871800 nid=0x3803 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread3" #9 daemon prio=9 os_prio=31 tid=0x00007fedcf8f8800 nid=0x3703 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread2" #8 daemon prio=9 os_prio=31 tid=0x00007fedd501d800 nid=0x3b03 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #7 daemon prio=9 os_prio=31 tid=0x00007fedd18d5800 nid=0x3503 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #6 daemon prio=9 os_prio=31 tid=0x00007fedd18d5000 nid=0x3e03 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Monitor Ctrl-Break" #5 daemon prio=5 os_prio=31 tid=0x00007fedd00a9000 nid=0x3403 runnable [0x000070000b66b000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    - locked <0x000000076adcfc30> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    - locked <0x000000076adcfc30> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:47)

"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fedd0822000 nid=0x4103 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fedd0808800 nid=0x2f03 in Object.wait() [0x000070000b35f000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000076ab08ed0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
    - locked <0x000000076ab08ed0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)

"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fedd2017800 nid=0x4c03 in Object.wait() [0x000070000b25c000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000076ab06bf8> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
    - locked <0x000000076ab06bf8> (a java.lang.ref.Reference$Lock)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"VM Thread" os_prio=31 tid=0x00007fedd003d800 nid=0x4e03 runnable 

"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fedd000a800 nid=0x2307 runnable 

"GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007fedd000b800 nid=0x2203 runnable 

"GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007fedd000c000 nid=0x2003 runnable 

"GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007fedd000c800 nid=0x2a03 runnable 

"GC task thread#4 (ParallelGC)" os_prio=31 tid=0x00007fedd000d000 nid=0x5303 runnable 

"GC task thread#5 (ParallelGC)" os_prio=31 tid=0x00007fedd000e000 nid=0x5203 runnable 

"GC task thread#6 (ParallelGC)" os_prio=31 tid=0x00007fedd000e800 nid=0x5003 runnable 

"GC task thread#7 (ParallelGC)" os_prio=31 tid=0x00007fedd000f000 nid=0x2d03 runnable 

"VM Periodic Task Thread" os_prio=31 tid=0x00007fedd5046000 nid=0x5503 waiting on condition 

JNI global references: 319


Found one Java-level deadlock:
=============================
"pool-1-thread-2":
  waiting for ownable synchronizer 0x000000076ad4abe0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "pool-1-thread-1"
"pool-1-thread-1":
  waiting for ownable synchronizer 0x000000076ad4ac10, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "pool-1-thread-2"

Java stack information for the threads listed above:
===================================================
"pool-1-thread-2":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076ad4abe0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at com.mooc.study.thread.DeadLockTest.lambda$main$1(DeadLockTest.java:35)
    at com.mooc.study.thread.DeadLockTest$$Lambda$2/1598924227.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-1":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076ad4ac10> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at com.mooc.study.thread.DeadLockTest.lambda$main$0(DeadLockTest.java:20)
    at com.mooc.study.thread.DeadLockTest$$Lambda$1/1555093762.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.

JDK1.8之后,通过jstack命令可以输出死锁信息,以Found one Java-level deadlock, 可以很清晰的看到死锁是发生在哪几个线程中,每个锁又是被哪个线程持有。比如上面堆栈中,pool-1-thread-2 在等待0x000000076ad4abe0锁,需要等待pool-1-thread-1释放,而pool-1-thread-1在等待0x000000076ad4ac10,需要等待pool-1-thread-2释放。

3. 通过Arthas工具定位死锁

Arthas是一个优秀的Java进程定位神器,所有学习java的工程师都需要熟悉,官网是https://arthas.aliyun.com/en-us/。本小节只是抛砖引玉,演示死锁的排查,更多功能大家可以跟着官网进行学习。

按照官网的步骤我们先进行下载和安装

下载安装: curl -L https://arthas.aliyun.com/install.sh | sh

启动选择一个进程: ./as.sh

通过执行thread -b 就能定位目前出现的死锁

[arthas@18514]$ thread -b
"pool-1-thread-1" Id=11 WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@36a0bc56 owned by "pool-1-thread-2" Id=12
    at sun.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.ReentrantLock$NonfairSync@36a0bc56
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at com.mooc.study.thread.DeadLockTest.lambda$main$0(DeadLockTest.java:20)
    at com.mooc.study.thread.DeadLockTest$$Lambda$1/1555093762.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    Number of locked synchronizers = 2
    - java.util.concurrent.locks.ReentrantLock$NonfairSync@159ec35f <---- but blocks 1 other threads!
    - java.util.concurrent.ThreadPoolExecutor$Worker@5ef04b5

4. 通过Jvisualvm 定位死锁

Jvisualvm 是一种自带的可视化工具,往往在在本地执行。

通过Jvisualvm命令打开软件,选中进程,进入线程视图,会给出死锁提示: 图片描述

5. 总结

目前死锁检测的工具和手段已经相当丰富了,JDK1.6时代还只能通过线程dump来定位,本节介绍的几种工具,不仅仅可以检测死锁,线程运行状态也可以得到很全面的数据,尤其是原生的Jdk命令,Arthas作为java检测工具的优秀代表,也需要认真学习的。

预览图
评论区

索引目录