Summary: | JVM crashes when System.exit(0) is called w/ Mesa X11 SW Driver < 8.0 | ||
---|---|---|---|
Product: | [JogAmp] Jogl | Reporter: | Martin Hegedus <martin.hegedus> |
Component: | core | Assignee: | Sven Gothel <sgothel> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | gouessej, sgothel |
Priority: | P3 | ||
Version: | 2 | ||
Hardware: | pc_x86_32 | ||
OS: | linux | ||
Type: | --- | SCM Refs: |
dc8d3bbbba7910b3e2453a0d7d37d2bf8a05f4b6 (Martin's native test 1)
4d7c527756f0410f31915e86815732ba912c1bee (Martin's native test 2)
5b47372590ec715647ebbd75d70c41ec7a64485a (Martin's fix)
ff25c711fe4fa627004c57e0d308d06759156290 (cleanup)
cea002636c4c8e1cf41288662fffc26bd24f90ac ..
94d3960757db292eeefb7acdfa1220b6b7168b98
|
Workaround: | TRUE | ||
Attachments: |
SimpleTest.java
test ouput JVM error log for test.sh crash test_dbg.sh output JVM error log for test_dbg.sh crash output (std and err) from SimpleTest JVM error log for SimpleTest crash output from SimpleTest with JOGL 2.0-bmanual-20111124 Version 2 of SimplTest modified to create only one window displayMultiple using glXCreateNewContext patch for xdisplayclose mods |
Ooops, hit return while still adding detail. :( Sorry about that. Still adding more Created attachment 424 [details]
test ouput
Created attachment 425 [details]
JVM error log for test.sh crash
Created attachment 426 [details]
test_dbg.sh output
Created attachment 427 [details]
JVM error log for test_dbg.sh crash
Created attachment 428 [details]
output (std and err) from SimpleTest
Created attachment 429 [details]
JVM error log for SimpleTest crash
Created attachment 430 [details]
output from SimpleTest with JOGL 2.0-bmanual-20111124
OK, finished adding attachments. Let me know if you have any questions. I retrieved the latest version using git clone git://jogamp.org/srv/scm/gluegen.git gluegen git clone git://jogamp.org/srv/scm/jogl.git jogl and compiled it with ant. It seems to have compiled successfully, however various programs running under the test program junit.run also cause the JVM to crash. SimpleTest still has the same behavior as when running under the latest aggregated autobuild. If I comment out e.getWindow().dispose() then the JVM will not crash when System.exit(0) is called. jogamp@jogamp05:~/projects/JOGL/jogl-bugs/705-JVMCrash$ java SimpleTest #SimpleTest: Calling: GLProfile.getDefault() libEGL warning: unsupported platform Windows libEGL warning: unsupported platform Windows #SimpleTest: Calling: new GLCapabilities(glp) #SimpleTest: Calling: new GLCanvas(caps) #SimpleTest: Calling: new Frame("Test") #SimpleTest: Calling: frame.setSize(300, 300) #SimpleTest: Calling: addWindowListener(new WindowAdapter() #SimpleTest: Calling: canvas.addGLEventListener(new SimpleTest()) #SimpleTest: Calling: frame.add(canvas) #SimpleTest: Calling: frame.setVisible(true) #SimpleTest: Calling: e.getWindow().dispose() #SimpleTest: Calling: System.exit(0) Tested on Linux w/ NV and Mesa driver and latest aggregated build. GL_RENDERER = Mesa DRI Intel(R) Sandybridge Desktop GL_VERSION = 3.0 Mesa 9.0 GL_VENDOR = Intel Open Source Technology Center No crash. Note that AWT windowClosing() on a Frame may only need frame.dispose() being called. However, Julien has experienced a System.exit() might be required for JNLP Webstart - I don't know. We test this w/ 'TestAWT02WindowClosing' in our unit tests. Regardless whether System.exit() is recommended to be called or not - a crash shall not happen. Maybe I need to run the test w/ an older Mesa driver. Ok, tested on a vbox debian7: GL_RENDERER = Gallium 0.4 on llvmpipe (LLVM 0x209) GL_VERSION = 2.1 Mesa 8.0.5 GL_VENDOR = VMware, Inc. No crash w/ tip d9d0abc0974f42dbc1c7c91bc004b92729b8ae8a http://jogamp.org/deployment/archive/master/gluegen_652-joal_413-jogl_936-jocl_761/ Please retry .. it's odd that the test behave so different. -> normal, since not reproducible here I've test this some more and will shortly upload the latest SimpleTest program. I have still not been able to resolve this. However, here are some things I've noticed. 1) The error is occurring the second time glXDestroyContext is called. Therefore if two frames are created and disposed of, the error will occur on the second call to frame.dispose(). 2) Shared resource uses the SharedResourceRunner thread to access the glX routines. This may cause a problem if all gl calls should be on the AWT-EventQueue thread. 3) GLProfile.getDefault() and GLProfile.shutdown() can not be called from the AWT thread if the glx calls are being forced onto the AWT thread because it will block. The thread chain would be AWT-EventQueue-0 -> SharedResourceRunner -> AWT-EventQueue-0 4) If a Threading class method is called before GLProfile.getDefault() then a working thread is created for the gl calls instead of using the awt thread. This is because the static routine in ThreadingImpl calls GLProfile.isAWTAvailable(). This is not necessarily a bug, just something to be aware of. 5) For my system, it seems that calling glProfile.shutdown() before System.exit(0) works the best. For some reason if glProfile.shutdown() is called by the system exit hook then things occasionally hang. 6) If I create and destroy two frames rather than one, then both the older and never version of jogl crash the JVM. 7) If I don't allow GLCanvas's display method to be called then the program terminates OK. It seems the act of calling makeCurrent and/or release inside GLDrawableHelper.invokeGLImpl sets up the issue for glXDestoryContext to crash. I gather that is because makeCurrent creates the context the first time it is called. I'll try digging around some more and updating Mesa. Created attachment 434 [details]
Version 2 of SimplTest
This version will crash the older and newer versions of jogl because two frames are being created and disposed.
I was unable to dedicate the time and risk to updating my Mesa since the OS for the computer is too old. So I dug around some more in jogl Overall, the error revolves around using XCloseDisplay and glXDestroyContext in conjunction. If I set markAllDisplaysUnclosable to true in X11Util then the JVM does not crash. Or, if glXDestroyContext is not called then the JVM does not crash. Looking at the order these methods are called, things seem OK. In other words, glXDestroyContext is always called before XCloseDisplay. And they are always done on the same thread. Except to say that the shared resource is working on the main-SharedResourceRunner thread and frames are working on the AWT-EventQueue-0 thread. Any thoughts on how to proceed? Is there a simple test code which tests this? Is there a recommended way of setting markAllDisplaysUnclosable to true? I looked at the TestAWT02WindowClosing program, but it does not create and destroy contexts. Oh, I do want to mention that I did compile displayMultiple02.c and it seems to execute successfully. (In reply to comment #17) > Oh, I do want to mention that I did compile displayMultiple02.c and it seems > to execute successfully. I see - you followed my path of horrors of buggy GLX impl. regarding to multithreading :) So: GL_VENDOR Brian Paul GL_RENDERER Mesa X11 GL_VERSION 2.1 Mesa 7.0 is buggy like AMD's GLX hack - you claim ? Since you are already pretty familiar w/ our stuff, maybe you like to add such quirk into GLRendererQuirks ? Maybe 'DontCloseX11DisplayConnection' to be enabled w/ AMD as detected in X11Util and Mesa 7.0 ? Now we would need to know (or guess) the Mesa version range for this. Yes, this comes pretty late (X11 display connection already created), however .. hopefully its soon enough at least when kicked off by the shared-resource-runner who queries all profiles. Since this seems to be a bug or related to such an old Mesa version, I would like to keep this shortcut .. short :) .. i.e. not loosing too much time on it. Can you do it ? There seems to be more to this. I changed displayMultiple02 so it uses glXCreateNewContext instead of glXCreateContext and it now crashes. So the connection seems involve XOpenDisplay, XCloseDisplay, glxCreateNewContext, glxMakeCurrent, and glXDestroyContext. Then I took the second XOpenDisplay out or the original displayMultiple02 (i.e. I only open one window in displayMultiple02's testOrder) and I get the following X11 error the second time testOrder is called: X Error of failed request: BadRequest (invalid request code or no such operation) Major opcode of failed request: 0 () Serial number of failed request: 15 Current serial number in output stream: 15 Is it confirmed that the following can be done? disp1 = XOpenDisplay(NULL); ...do stuff.. XCloseDisplay(disp1); disp1 = XOpenDisplay(NULL); ...do stuff.. XCloseDisplay(disp1); I'll upload my modified version of displaymultiple02 Created attachment 446 [details]
modified to create only one window
Modified so testOrder creates only one window. This generates the following output
-xlock (XLockDisplay): 0
Normal order: Create #1
Normal order: Destroy #1.0
Normal order: Destroy #1.X
Normal order: Success - no bug
Reverse order: Create #1
X Error of failed request: BadRequest (invalid request code or no such operation)
Major opcode of failed request: 0 ()
Serial number of failed request: 15
Current serial number in output stream: 15
Created attachment 447 [details]
displayMultiple using glXCreateNewContext
This is the version of displayMultiple02 which uses glXCreateNewContext instead of glXCreateContext. The output is ass follows:
-xlock (XLockDisplay): 0
Normal order: Create #1
Normal order: Create #2
Normal order: Destroy #1.0
Normal order: Destroy #1.X
Normal order: Destroy #2.0
Normal order: Destroy #2.1
Normal order: Destroy #2.2
Segmentation fault
Just added my version of displayMonitor02 which uses glXCreateNewContext. In summary displayMonitor02 which uses glXCreateContext for 2X2 windows works. displayMonitor02_mch which uses glXCreateContext for 2x1 window gives X11 error displayMonitor02_new_mch which uses glXCreateNewContext for 2x2 windows seg faults. If I take out the calls to glXCreateContext, glXMakeCurrent, and glXDestroyContext in displayMonitor02_mch, it works. I assume the same is true for displayMonitor02_new_mch. If I comment out XCloseDisplay in any of the above programs, the program executes successfully. In regards to JOGL. Should I put in a DontCloseX11DisplayConnection that can be activated at the command line with a -D...? I'm not sure if the error experienced here is a X11, Mesa, driver, or whatever issue. So maybe a catch all is appropriate. (In reply to comment #22) > > Should I put in a DontCloseX11DisplayConnection that can be activated at the > command line with a -D...? I'm not sure if the error experienced here is a > X11, Mesa, driver, or whatever issue. So maybe a catch all is appropriate. Any updates about your patch, Martin ? I thought you wanted to offer a git patch or pull request ? Sven, sent you an email about that two days ago, after your suggestion about unified diff. I wanted to try once more to push the changes onto a branch using git and got an error. Didn't hear back from you so thought you were busy. If this git push thing doesn't work I'll upload a unified git diff file. Here's what I've done in regards to trying to push changes to git >git clone git://jogamp.org/srv/scm/jogl.git jogl >cd jogl >git branch mesa7_quirk_closedpy >git checkout mesa7_quirk_closedpy ...modify files... >git config --global user.email "martin.hegedus@gmail.com" >git commit -a -m "XDisplayClose mod" [mesa7_quirk_closedpy eb948fa] XDisplayClose mod 4 files changed, 121 insertions(+), 6 deletions(-) >git branch master * mesa7_quirk_closedpy >git remote -v origin git://jogamp.org/srv/scm/jogl.git (fetch) origin git://jogamp.org/srv/scm/jogl.git (push) >git push origin mesa7_quirk_closedpy fatal: The remote end hung up unexpectedly Created attachment 452 [details]
patch for xdisplayclose mods
Patch was created with:
git format-patch master
I made changes to GLContextImpl.java, GLProfile.java, X11Util.java, GLRendererQuirks.java in regards to adding a DontCloseX11DisplayConnection quirk and a nativewindow.debug.X11Util.HasX11CloseDisplayBug debug property. The quirk is a little hacky since I had to add it after the calls to initProfilesForDevice(defaultEGLDevice) and initProfilesForDevice(defaultDesktopDevice) in GLProfile since GLRendererQuirks were not set up till after that. I wanted to add it before the first call to XOpenDisplay but could not figure out how to access glGetString without going through an initialized GLContext. Also, the workaround can be broken if the shared resource does not use the Mesa X11 driver but later a display is opened with the X11 driver. I assume this could happen if a remote display is chosen. Further coding, if the need arises, may be required.
(In reply to comment #25) > Created attachment 452 [details] > patch for xdisplayclose mods > > Patch was created with: > git format-patch master > > I made changes to GLContextImpl.java, GLProfile.java, X11Util.java, > GLRendererQuirks.java in regards to adding a DontCloseX11DisplayConnection > quirk and a nativewindow.debug.X11Util.HasX11CloseDisplayBug debug property. > The quirk is a little hacky since I had to add it after the calls to > initProfilesForDevice(defaultEGLDevice) and > initProfilesForDevice(defaultDesktopDevice) in GLProfile since > GLRendererQuirks were not set up till after that. I wanted to add it before > the first call to XOpenDisplay but could not figure out how to access > glGetString without going through an initialized GLContext. Also, the > workaround can be broken if the shared resource does not use the Mesa X11 > driver but later a display is opened with the X11 driver. I assume this > could happen if a remote display is chosen. Further coding, if the need > arises, may be required. Sorry, didn't get the patch file via email - bit now it's here and I will review later. Thank you! I guess we can move the 'early' quirk detection from GLProfile to the X11 SharedResourceImpl, which will free GLProfile from such quirks .. Other than that .. I will patch-in your changes: - w/o date+name in source code (this belongs to a git commit) - checking whether we can reuse other 'detection' code I will put your name in the git commit message as orig. author, if that is OK w/ you. I will also try to commit w/ you as author .. and clean up afterwards .. lets see. Good job, hopefully I can do that tomorrow/Monday. Thank you. Cool and thank you. I am glad I was able to contribute. Tested on a vbox instance w/ [kubuntu 8.04.1] and confirmed your reported behavior: Works: Mesa glx/dri drivers! GL_VENDOR Mesa project: www.mesa3d.org GL_RENDERER Mesa GLX Indirect GL_VERSION 1.4 (2.1 Mesa 7.0.3-rc2) Crash: Mesa swx11 drivers! GL_VENDOR Brian Paul GL_RENDERER Mesa X11 GL_VERSION 2.1 Mesa 7.0.3-rc2 .. see git logs Note: Validated that Quirk DontCloseX11Display is not required w/ Mesa >= 8.0 Tested w/ kubuntu 12.04, Mesa 8.0.4 (glx/dri/gallium and X11 SW rasterizer) |
Created attachment 423 [details] SimpleTest.java The attached SimpleTest program crashes the JVM when System.exit(0) is called. The programs test.sh and test_dbg.sh also crash the JVM. /deployment/archive/master/gluegen_646-joal_408-jogl_930-jocl_756/archive/jogamp-all-platforms.7z was used. An older version of jogl, JOGL implementation version 2.0-bmanual-20111124, did not crash JVM. (Note, I compiled the older version [2.0-bmanual-20111124] locally but used the provided build for the new one [2.0-b646-20130313].) Attachments: 1) SimpleTest.java