Summary: | etc/test.bat causes HotSpot crash on b271 and later (Win XP 32-bit, ATI) | ||
---|---|---|---|
Product: | [JogAmp] Jogl | Reporter: | Wade Walker <wwalker3> |
Component: | opengl | Assignee: | Sven Gothel <sgothel> |
Status: | VERIFIED FIXED | ||
Severity: | enhancement | CC: | carlo.salinari, odimond |
Priority: | --- | ||
Version: | 2 | ||
Hardware: | All | ||
OS: | all | ||
Type: | --- | SCM Refs: |
ab93183b90e83b9aebc29031c7b88b9a3dc58ff5
|
Workaround: | --- | ||
Attachments: |
etc\test.bat log file
HotSpot error file
etc\test.bat log file, this time with debug info and proper MIME type
HotSpot error file with proper MIME type
HotSpot error file for OneTriangleAWT test case from tutorials page |
Created attachment 235 [details]
HotSpot error file
This is from Oracle JDK 1.6.0_u24
The system is Win XP Pro SP3, 32-bit. The graphics card is ATI Mobility Radeon x300 at driver version 8.162.0.0 (8/3/2005), which is the latest version.
Created attachment 236 [details]
etc\test.bat log file, this time with debug info and proper MIME type
Created attachment 237 [details]
HotSpot error file with proper MIME type
I noticed last night that when you turn on debug flags, the HotSpot crash goes away, and there's just an NPE in the debug log.

This bug was caused by commit 8adc04788a6d9dd44de5a4636b46d14dbb70b799 (GLCapabilities enhancements: Choosing, All-Available, Data Handling (X11, WGL and EGL)). It's a huge commit that affects 42 files, so it may be difficult for me to find the cause by looking at the commit diffs :) I checked the commit right before it, and that one was fine.

Commit 8adc04788a6d9dd44de5a4636b46d14dbb70b799 actually freezes, then when I apply the next commit (Fix WindowsDummyWGLDrawable: onscreen && !pbuffer, a one-line fix) I see the HotSpot crash.

This looks like a threading problem. When JOGL calls a native WGL function like wglGetPixelFormatAttribivARB, it has the correct address, and calls the function properly. But the first thing the function in the ATI DLL does is try to get another function pointer from a jump table like this:

    69346b70: mov %fs:0xbf0,%eax
    69346b76: mov 0x431f8(%eax),%eax
    69346b7c: jmp *0x758(%eax)

%fs:0xbf0 is an offset inside the Thread Information Block (TIB). The range from 0x714 to 0xbf4 is reserved for GL. After the first instruction, %eax is zero, which implies that GL isn't set up properly on this thread (otherwise, there would be some memory address stored in %fs:0xbf0). I'll keep looking to try to find the root cause.

It turns out the problem is that no GL context is current on the thread. Apparently wglMakeCurrent() is what sets the data into the reserved GL area of the Thread Information Block, which other GL functions rely on later. This data being zero is what causes the crash. When I insert a call to GLProfile.initSingleton( false ) and turn on debugging flags, it makes a GL context current on the thread as a side effect, and this crash goes away.

This problem also shows up in AWT programs that call awt.GLCanvas.chooseGraphicsConfiguration() indirectly as a result of calling frame.setVisible( true ). I'll attach a stack trace for that error too.

Created attachment 242 [details]
HotSpot error file for OneTriangleAWT test case from tutorials page
Same error, different path down to a WGL function with no GL context set on the main thread.
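
For readers following the analysis above, here is a minimal Java sketch of the failure mode. It is a toy model only, not JOGL or ATI driver code: the class CurrentContextModel, the DispatchTable type, and the value 42 are all hypothetical, and a ThreadLocal stands in for the GL-reserved slot in the Thread Information Block. The NullPointerException plays the role of the native crash that occurs when the driver jumps through an empty per-thread table. The queryWithCurrentContext() method loosely mirrors the idea in the title of the fix commit (make a context current for the pixel format query); the real change is in the linked commit and is not reproduced here.

```java
/** Toy model of a per-thread GL dispatch slot. Hypothetical; not JOGL or driver code. */
public final class CurrentContextModel {

    /** Stands in for the driver's per-thread dispatch table (hypothetical). */
    static final class DispatchTable {
        int getPixelFormatAttrib() {
            return 42; // arbitrary placeholder result
        }
    }

    // Stands in for the GL-reserved TIB slot: empty until a context
    // is made current on the calling thread.
    private static final ThreadLocal<DispatchTable> SLOT = new ThreadLocal<>();

    /** Models wglMakeCurrent(): fills the calling thread's slot. */
    public static void makeCurrent() {
        SLOT.set(new DispatchTable());
    }

    /** Models releasing the context: clears the calling thread's slot. */
    public static void release() {
        SLOT.remove();
    }

    public static boolean isCurrent() {
        return SLOT.get() != null;
    }

    /** Models a driver entry point that jumps through the slot without checking it. */
    public static int queryPixelFormat() {
        // With an empty slot this throws NullPointerException,
        // the model's equivalent of the native crash.
        return SLOT.get().getPixelFormatAttrib();
    }

    /** Makes a context current only for the query, then restores the prior state. */
    public static int queryWithCurrentContext() {
        boolean wasCurrent = isCurrent();
        if (!wasCurrent) {
            makeCurrent();
        }
        try {
            return queryPixelFormat();
        } finally {
            if (!wasCurrent) {
                release();
            }
        }
    }

    public static void main(String[] args) {
        try {
            queryPixelFormat();
        } catch (NullPointerException e) {
            System.out.println("no current context: query failed");
        }
        System.out.println("with current context: " + queryWithCurrentContext());
        System.out.println("still current afterwards: " + isCurrent());
    }
}
```

The try/finally is there so the thread is left in the state it was found in, which matters if the caller already had its own context current.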
*** Bug 469 has been marked as a duplicate of this bug. ***

commit ab93183b90e83b9aebc29031c7b88b9a3dc58ff5
Author: Sven Gothel <sgothel@jausoft.com>
Date: Mon Mar 21 07:13:45 2011 +0100

    Fix Bug #480 (attempt) - ATI + WinXP: make context current for ARB PFD queries/selection

    TODO: Validate if bug is actually relates to the 'old' ATI Windows driver for old GPU's like X300 etc and unrelated to the actual Windows version !

    Also ensure that the no pixelformat is being set on external context/HDC.

http://jogamp.org/git/?p=jogl.git;a=commit;h=ab93183b90e83b9aebc29031c7b88b9a3dc58ff5

Thanks to Wade for allowing me to use his machine - plus his bug triage! Maybe we need to verify whether this bug happens on Windows versions other than WinXP 32-bit and/or is related to a specific range of ATI Windows drivers/GPUs!

I've just revived an old Dell with an ATI X600 (RV370). Does it qualify to verify the fix? How should I proceed?

I think your old system should be great to verify this fix. All you need to do is download the latest development build from http://jogamp.org/deployment/autobuilds/master/ (check with the build server at https://jogamp.org/chuck/ to make sure Sven's change went in). Once you download the build, unzip it, cd into the directory that contains etc, jar, and lib, then type etc\test.bat and see if you get a crash :) Also, please check the output of test.bat (it goes into a log file as well as to stdout) to make sure it reports your card correctly (I've seen it report Microsoft's GL driver sometimes, but haven't had time to recheck after this fix yet).

Verified, works fine under both Win XP 32 and Win 7 32. GL_VENDOR is ok. I was seeing Microsoft as well, because I ran the test before installing the ATI drivers, and the test output is appended to the log file. Maybe the same happened to you. |
Created attachment 234 [details]
etc\test.bat log file
All JOGL builds starting at b271 cause a HotSpot crash when I run etc\test.bat. I get similar results when trying to run unit tests.