Problem description
Long story short, some coworkers are running a pretty old setup (OC4J on JDK 1.5.0_06, x86_64) with an application that happens to be mission critical. They recently tried to deploy a new version of the application, but as soon as they do, the Java processes dump core and die.
The problem is that the core dumps seem to be fine (gdb can open them), but jmap and other tools refuse to process them:
    # /usr/java/jdk1.5.0_06/bin/jmap /usr/java/jdk1.5.0_06/bin/java core
    Attaching to core core from executable /usr/java/jdk1.5.0_06/bin/java, please wait...
    Error attaching to core file: Can't attach to the core file
Newer versions throw an exception:
    # jdk1.6.0_45/bin/jmap /usr/java/jdk1.5.0_06/bin/java core
    Attaching to core core from executable /usr/java/jdk1.5.0_06/bin/java, please wait...
    Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at sun.tools.jmap.JMap.runTool(JMap.java:179)
        at sun.tools.jmap.JMap.main(JMap.java:110)
    Caused by: sun.jvm.hotspot.runtime.VMVersionMismatchException: Supported versions are 20.45-b01. Target VM is 1.5.0_06-b05
        at sun.jvm.hotspot.runtime.VM.checkVMVersion(VM.java:224)
        at sun.jvm.hotspot.runtime.VM.<init>(VM.java:287)
        at sun.jvm.hotspot.runtime.VM.initialize(VM.java:357)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.setupVM(BugSpotAgent.java:594)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.go(BugSpotAgent.java:494)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.attach(BugSpotAgent.java:348)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:169)
        at sun.jvm.hotspot.tools.PMap.main(PMap.java:67)
        ... 6 more
gdb offers little information, since it has no symbols:
    Reading symbols from /usr/java/jdk1.5.0_06/bin/java...(no debugging symbols found)...done.
    [New Thread 9841]
    [New Thread 31442]
    [New Thread 31441]
    ...
    Core was generated by `/usr/java/jdk1.5.0_06/bin/java -server -XX:+UseConcMarkSweepGC -XX:MaxHeapFreeR'.
    Program terminated with signal 6, Aborted.
    #0  0x0000003bbf030285 in ?? ()
    (gdb) bt
    #0  0x0000003bbf030285 in ?? ()
    #1  0x0000003bbf031d30 in ?? ()
    #2  0x0000000000000000 in ?? ()
The only valuable information I've gathered from the core is that most threads are blocked (I'm far from being a gdb guru):
    35 Thread 10093  0x0000003bbfc0b1c0 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    34 Thread 10097  0x0000003bbfc0b1c0 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    33 Thread 10099  0x0000003bbfc0b1c0 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
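For a quick overview of where the threads are parked, the gdb thread listing can be summarized per function. This is only a sketch: the helper name `summarize_threads` is made up for illustration, and it assumes the line layout shown in the listing, where the function name follows the word "in". Lines without that token (headers, "[New Thread ...]" messages) are skipped.

```shell
#!/bin/sh
# Hypothetical helper: reads gdb "info threads" output on stdin and prints
# "<count> <function>" pairs, most frequent first.
# Assumes each thread line looks like:
#   <n> Thread <id> <pc> in <function> () from <library>
summarize_threads() {
    awk '{
             # The word after the first "in" is the innermost function.
             for (i = 1; i < NF; i++)
                 if ($i == "in") { count[$(i + 1)]++; break }
         }
         END { for (f in count) print count[f], f }' | sort -rn
}
```

It could be fed from a batch run such as `gdb -batch -ex "info threads" /usr/java/jdk1.5.0_06/bin/java core | summarize_threads`. On a gdb too old to know `-ex`, the command can go into a file passed with `-x` instead. Replacing `info threads` with `thread apply all bt` prints a native backtrace for every thread, although without symbols the frames will mostly show as `??`.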
Besides, I don't know whether that's really relevant. The app is almost always heavily loaded, and my bet is that there was already some lock contention, but since it's another team's app, my knowledge of it is pretty shallow.
I guess this is a long shot, but is there anything we can do to get a Java thread dump or something similar? Did Sun ever offer debuginfo for the JDK, as I guess is available now with OpenJDK?
Thanks in advance.
Update: the other team has resolved the issue without getting any info from the core dump, just by trial and error after successfully reproducing the problem on a test system. I'm still intrigued by the question of how to debug an ancient Java core dump that jmap can't process. It might be valuable info for the future, although it seems there is no solution to that problem. Probably the JVM memory got corrupted, and that's why jmap can't process it.
You can add the following JVM option when starting your application; it lets you run any command you specify if a fatal JVM error occurs:
    -XX:OnError=""
The command to run goes between the quotes. For instance, you could run a command (or a script) that performs actions such as capturing a heap or thread dump.
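As an illustration, here is a minimal sketch of such a script. The function name `collect_crash_info`, the output location, and the choice of what to collect are assumptions, not anything from the original setup. It also assumes `JAVA_HOME` points at a JDK that ships `jstack`, and that `/proc` is available (Linux). A VM that is already crashing may not answer `jstack` at all, so that step is best effort.

```shell
#!/bin/sh
# Hypothetical -XX:OnError handler (names and paths are assumptions).
# Usage: collect_crash_info <pid> [output-dir]
# Prints the directory where the collected files were written.
collect_crash_info() {
    pid=$1
    out=${2:-/tmp}/jvm-crash-$pid
    mkdir -p "$out" || return 1

    # Record when the handler ran and for which process.
    { date; echo "pid=$pid"; } > "$out/summary.txt"

    # Memory map of the dying process (Linux only; skipped if unreadable).
    if [ -r "/proc/$pid/maps" ]; then
        cat "/proc/$pid/maps" > "$out/maps.txt"
    fi

    # Best-effort Java thread dump; a failure here is not treated as an error.
    if [ -n "$JAVA_HOME" ] && [ -x "$JAVA_HOME/bin/jstack" ]; then
        "$JAVA_HOME/bin/jstack" "$pid" > "$out/jstack.txt" 2>&1 || true
    fi

    echo "$out"
}

# Run the handler only when the script is invoked with arguments.
if [ $# -gt 0 ]; then
    collect_crash_info "$@"
fi
```

Saved as an executable script (the path below is a placeholder), it could be wired in with `-XX:OnError="/path/to/collect_crash_info.sh %p"`. HotSpot replaces `%p` with the process ID of the failing JVM before running the command.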