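A Java thread dump is an important tool for analyzing Java applications. It helps track down the root cause of performance problems such as high memory/CPU usage and deadlocks.
The Java platform has rich tooling, so there are many ways to produce a thread dump. VisualVM is one of them, but this post focuses on the command-line tools: they ship with the JDK, and knowing them makes diagnosing problems in production much easier.
There are three main ways to produce a dump from the command line:
- Send SIGQUIT to the Java process: kill -3 <java_process_id>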
- jstack <java_process_id>
- jcmd <java_process_id | java_class_name> Thread.print
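Prefer the last two. kill -3 makes the target Java process print the dump on its own standard output, that is, on the terminal it was started from, so the dump cannot be redirected to a file from the shell that sends the signal. If the target process runs in the background, the dump is hard to find at all. jstack and jcmd print the dump on their own standard output, which can be redirected. jcmd can also identify the target process by its main class name instead of its PID, so for scripting jcmd is the best choice.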
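For scripting, the jcmd call can be wrapped in a few lines of Java. This is only a minimal sketch: the class name ThreadDumpToFile and its method names are made up for illustration, and it assumes that jcmd is on the PATH and that the target JVM runs as the same user.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the JDK.
final class ThreadDumpToFile {

    // Builds the command line: jcmd <pid | main class> Thread.print
    public static List<String> threadPrintCommand(String target) {
        return Arrays.asList("jcmd", target, "Thread.print");
    }

    // Runs jcmd and sends everything it prints to outFile.
    // Returns the exit code of the jcmd process.
    public static int dump(String target, File outFile)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(threadPrintCommand(target));
        pb.redirectErrorStream(true); // error messages end up in the same file
        pb.redirectOutput(outFile);
        return pb.start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println(
                "usage: java ThreadDumpToFile <pid|main-class> <output-file>");
            System.exit(2);
        }
        System.exit(dump(args[0], new File(args[1])));
    }
}
```

Once compiled, it could be run as `java ThreadDumpToFile <java_process_id> threads.dump`, where threads.dump is an arbitrary output file name. Passing the main class name instead of the PID relies on jcmd's own process matching.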
Below is an example of using jstack to analyze a deadlock. The program that produces the deadlock is as follows:
    import static java.lang.System.out;

    public class Deadlock {
        static class Resource {
            private String name;
            private Resource buddy;

            public Resource(String name) {
                this.name = name;
            }

            public synchronized void acquire(long delay) {
                out.printf(
                    "Thread %s acquired %s, trying to get %s ...\n",
                    Thread.currentThread().getName(),
                    this.name,
                    this.buddy.getName()
                );
                try {
                    Thread.sleep(delay);
                }
                catch (InterruptedException e) {
                    e.printStackTrace();
                }
                this.buddy.hold();
            }

            public synchronized void hold() {
                out.printf(
                    "Thread %s hold %s\n",
                    Thread.currentThread().getName(),
                    this.name
                );
            }

            public void setBuddy(Resource buddy) {
                this.buddy = buddy;
            }

            public String getName() {
                return this.name;
            }
        }

        static class CompetitorThread extends Thread {
            private Resource resource;
            private long delay;

            public CompetitorThread(Resource resource, long delay) {
                this.resource = resource;
                this.delay = delay;
            }

            @Override
            public void run() {
                resource.acquire(delay);
                out.printf(
                    "Thread %s completes execution!\n",
                    Thread.currentThread().getName()
                );
            }
        }

        public static void main(String[] args) {
            Resource resourceA = new Resource("Resource A");
            Resource resourceB = new Resource("Resource B");
            resourceA.setBuddy(resourceB);
            resourceB.setBuddy(resourceA);
            Thread thread1 = new CompetitorThread(resourceA, 0);
            Thread thread2 = new CompetitorThread(resourceB, 1000);
            thread1.start();
            /* uncomment the following code to avoid deadlock */
            // try {
            //     Thread.sleep(1000);
            // }
            // catch (InterruptedException e) {
            //     e.printStackTrace();
            // }
            thread2.start();
        }
    }
You can also download the program here: https://github.com/schnell18/java-honing/blob/master/src/main/java/org/home/hone/thread/Deadlock.java
Then compile and run it:
    $ javac Deadlock.java
    $ java Deadlock
    Thread Thread-0 acquired Resource A, trying to get Resource B ...
    Thread Thread-1 acquired Resource B, trying to get Resource A ...
Open another terminal and run:
    $ ps -ef |grep Deadlock
    fgz 6076 5664 0 22:11 pts/0 00:00:00 java Deadlock
    fgz 6144 5879 0 22:38 pts/1 00:00:00 grep Deadlock
The PID of the deadlocked Java process is 6076. Now jstack can produce the thread dump:
    $ jstack 6076 > deadlock.dump
An excerpt of deadlock.dump:
    2015-02-27 22:42:23
    Full thread dump OpenJDK 64-Bit Server VM (23.25-b01 mixed mode):
    ...
    Java stack information for the threads listed above:
    ===================================================
    "Thread-1":
            at Deadlock$Resource.hold(Deadlock.java:29)
            - waiting to lock <0x00000000f224cb28> (a Deadlock$Resource)
            at Deadlock$Resource.acquire(Deadlock.java:25)
            - locked <0x00000000f224cb88> (a Deadlock$Resource)
            at Deadlock$CompetitorThread.run(Deadlock.java:56)
    "Thread-0":
            at Deadlock$Resource.hold(Deadlock.java:29)
            - waiting to lock <0x00000000f224cb88> (a Deadlock$Resource)
            at Deadlock$Resource.acquire(Deadlock.java:25)
            - locked <0x00000000f224cb28> (a Deadlock$Resource)
            at Deadlock$CompetitorThread.run(Deadlock.java:56)

    Found 1 deadlock.
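The dump clearly reports the deadlock and shows that Thread-0 and Thread-1 are both blocked on the call to hold(), at line 29 of Deadlock.java. Each thread called hold() from acquire() (line 25) while still holding the monitor of its own Resource: Thread-1 has locked <0x00000000f224cb88> and is waiting for <0x00000000f224cb28>, while Thread-0 has locked <0x00000000f224cb28> and is waiting for <0x00000000f224cb88>. Details this specific give a solid starting point for fixing the problem.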
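The wait-for relationship in the excerpt can also be pulled out mechanically, which is handy when a dump lists many threads. The sketch below is illustrative only: the class DumpLockCycle and its methods are hypothetical names, and it assumes the stack section uses exactly the line shapes shown in the excerpt (`"Thread-N":`, `- locked <...>`, `- waiting to lock <...>`). Real dumps contain more variants than this handles.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the "Java stack information" section.
final class DumpLockCycle {

    private static final Pattern THREAD =
        Pattern.compile("^\"(.+)\":$");
    private static final Pattern WAITING =
        Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED =
        Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");

    // Maps each blocked thread to the thread that holds the monitor it wants.
    public static Map<String, String> waitsFor(String dump) {
        Map<String, String> wanted = new LinkedHashMap<>(); // thread -> monitor it waits for
        Map<String, String> owner = new LinkedHashMap<>();  // monitor -> thread holding it
        String current = null;
        for (String raw : dump.split("\r?\n")) {
            String line = raw.trim();
            Matcher m = THREAD.matcher(line);
            if (m.matches()) {
                current = m.group(1);
                continue;
            }
            if (current == null) {
                continue; // ignore everything before the first "name": header
            }
            m = WAITING.matcher(line);
            if (m.find()) {
                wanted.put(current, m.group(1));
                continue;
            }
            m = LOCKED.matcher(line);
            if (m.find()) {
                owner.put(m.group(1), current);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : wanted.entrySet()) {
            String holder = owner.get(e.getValue());
            if (holder != null) {
                result.put(e.getKey(), holder);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String dump = new String(
            Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (Map.Entry<String, String> e : waitsFor(dump).entrySet()) {
            System.out.printf("%s waits for a monitor held by %s%n",
                e.getKey(), e.getValue());
        }
    }
}
```

Fed the excerpt above, it reports that Thread-1 waits for a monitor held by Thread-0 and that Thread-0 waits for one held by Thread-1, which is exactly the cycle that makes up the deadlock.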