Created attachment 375
test.sh output plus standalone SWT app that demos the issue

Fedora 12 (2.6.32.26-175.fc12.i686.PAE)
Intel/32
JRE/JDK 7u7
GeForce 9800 GT + NVIDIA driver 304.37
JOGAMP/JOGL custom build from 10/09/2012 repos (RC11 in progress - is there a better/preferred way to report this?)

test.sh output is included in the attached zip file.

The version of SWT in use is the one that came with the Eclipse 4 "Juno" installation. The relevant jar is named:

org.eclipse.swt.gtk.linux.x86_3.100.1.v4234e.jar

Not sure how this maps to the standalone SWT versions (e.g. 3.8, 4.2, etc).

There is a trail of prior discussion leading up to this bug report at:

http://forum.jogamp.org/Reshape-Problem-Using-NewtCanvasSWT-tp4026385.html

This problem was originally observed in an Eclipse 4 RCP app. If a NewtCanvasSWT and associated GLWindow are constructed and placed as the view associated with a "part" (essentially a RCP UI tab) everything will come up and appear to work. Initially we had some event delivery issues as discussed in the forum topic noted above, but those were fixed quickly by Sven.

However, one issue remains: if the corner of the window is grabbed with the mouse and moved around rapidly such that many reshapes of the NewtCanvasSWT are stimulated, eventually the UI will freeze up and ~5 seconds later a locking exception will be thrown.
An example backtrace is included in the forum discussion above and repeated here for convenience:

Exception in thread "main-SWTDisplay-.x11_:0.0-1-EDT-1" java.lang.RuntimeException: Waited 5000ms for: <14e5b0e, 13c0223>[count 1, qsz 0, owner <main>] - <main-SWTDisplay-.x11_:0.0-1-EDT-1>
    at jogamp.common.util.locks.RecursiveLockImpl01Unfairish.lock(RecursiveLockImpl01Unfairish.java:197)
    at com.jogamp.newt.opengl.GLWindow.display(GLWindow.java:539)
    at jogamp.opengl.GLAutoDrawableBase.defaultWindowResizedOp(GLAutoDrawableBase.java:128)
    at com.jogamp.newt.opengl.GLWindow.access$100(GLWindow.java:94)
    at com.jogamp.newt.opengl.GLWindow$1.windowResized(GLWindow.java:112)
    at jogamp.newt.WindowImpl.consumeWindowEvent(WindowImpl.java:2344)
    at jogamp.newt.WindowImpl.sendWindowEvent(WindowImpl.java:2287)
    at jogamp.newt.WindowImpl.sizeChanged(WindowImpl.java:2431)
    at jogamp.newt.driver.x11.DisplayDriver.DispatchMessages0(Native Method)
    at jogamp.newt.driver.x11.DisplayDriver.dispatchMessagesNative(DisplayDriver.java:106)
    at jogamp.newt.DisplayImpl.dispatchMessages(DisplayImpl.java:442)
    at com.jogamp.newt.swt.SWTEDTUtil$1.run(SWTEDTUtil.java:57)
    at com.jogamp.newt.swt.SWTEDTUtil$NewtEventDispatchThread.run(SWTEDTUtil.java:239)

This problem also happens in a standalone SWT application, an example of which is in the attached zip file. I didn't include anything to build or run this app since your jar locations will be different.

The problem is harder to duplicate in the standalone SWT app than in the Eclipse RCP app. When using the mouse to drag the corner of an Eclipse RCP window the lock-up can be made to happen typically within 5->10 seconds. In the standalone SWT app it sometimes takes a minute or so of dragging before the lockup occurs. To avoid wrist and finger fatigue the app has a "Start" button that will do the resizes from code.

I was not able to come up with a way to make this problem happen in a deterministic way that could be included in a regular test case.
The presentation is typical of concurrency issues (e.g. races), where sometimes it fails quickly and sometimes not. It is the worst kind of software issue, since you can never be quite sure you have it fixed unless the cause is clear.
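For readers unfamiliar with where a "Waited 5000ms for: ..." message can come from, here is a minimal sketch of a timed lock acquisition that gives up and throws instead of blocking forever. It uses only java.util.concurrent and is not the JOGAMP RecursiveLockImpl01Unfairish code; the class name, method names, thread name and message text are all hypothetical and only modeled on the backtrace.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TimedLockSketch {
    /** Acquires the lock, or throws after timeoutMs instead of blocking forever. */
    static void lockOrThrow(ReentrantLock lock, long timeoutMs) {
        try {
            if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                // Hypothetical message, loosely shaped like the one in the report.
                throw new RuntimeException("Waited " + timeoutMs + "ms for: " + lock
                        + " - <" + Thread.currentThread().getName() + ">");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted while waiting for lock", e);
        }
    }

    /** Returns true if a second thread timed out while the caller held the lock. */
    static boolean contenderTimesOut(long timeoutMs) {
        ReentrantLock lock = new ReentrantLock();
        boolean[] timedOut = { false };
        lock.lock(); // the calling thread plays the lock owner
        try {
            Thread contender = new Thread(() -> {
                try {
                    lockOrThrow(lock, timeoutMs);
                    lock.unlock(); // not reached while the owner holds the lock
                } catch (RuntimeException e) {
                    timedOut[0] = true;
                }
            }, "sketch-EDT");
            contender.start();
            contender.join(); // the owner keeps the lock while the contender waits
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
        return timedOut[0];
    }

    public static void main(String[] args) {
        System.out.println("contender timed out: " + contenderTimesOut(200));
    }
}
```

The point of the timeout is diagnostic: a true deadlock becomes a visible exception naming the owner and the waiter, which is what made the backtrace above possible to capture.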
On second thought, should this be filed as a NEWT bug (?).
(In reply to comment #1)
> On second thought, should this be filed as a NEWT bug (?).

Yes, done :)

Any progress? I have not had time to test this issue yet; maybe in the coming weeks.
This one I have not looked at. It has the feel of something that you could fix a lot quicker than you could field the pile of questions I'd probably have about your exclusive access policies as I tried to figure out what's going on. Regardless, if time permits at some point soon then I'll take a closer look. If I don't get anywhere then we're no worse off.
(In reply to comment #3)
> This one I have not looked at. It has the feel of something that you could fix
> a lot quicker than you could field the pile of questions I'd probably have
> about your exclusive access policies as I tried to figure out what's going on.

What are 'exclusive access policies'?

> Regardless, if time permits at some point soon then I'll take a closer look.
> If I don't get anywhere then we're no worse off.

It's ok .. maybe next week I can at least validate and try to reproduce the issue. If it is reproducible, I agree .. I may be able to handle it 'quicker'.

So .. let's say until next week on this issue.
> Whats 'exclusive access policies' ?

As in an understanding about what resources need to be protected with locks and why. "Critical sections" might have been a better term to use. Judging by the backtrace there is some of that going on here. It is often a recipe for disaster when a less-familiar-with-the-project-yet-generally-not-too-afraid-of-changing-stuff individual (such as myself in this case :-) ) comes along and starts swinging a bat around in code whose motivations are not well understood.

As mentioned in one of my previous blurbs, on my machine at least the test app was able to reproduce the issue reliably, but not deterministically. So... hopefully if you are patient with the test app it eventually will reward you with a lockup.

Not sure why it is so much quicker to lock up when it's SWT running in a RCP app. In that case I can pretty much get the lockup within 5 seconds. The standalone SWT test app on the other hand may require up to a minute. But... the SWT test app is so much simpler than the RCP test app it seemed like that's what I should provide. Plus I needed to be sure RCP itself wasn't causing the problem, though clearly its presence has an effect.
Reproduced. Currently analyzing the deadlock, which is due to running a mutable NEWT Window operation on a thread other than its EDT.
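To illustrate the general idea of keeping mutable window operations on one dispatch thread, here is a minimal sketch using only java.util.concurrent. It is not the NEWT implementation and not the fix in the commits referenced in this report; the class EdtConfinementSketch, its fields, and the thread name "sketch-EDT" are hypothetical stand-ins.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EdtConfinementSketch {
    private final ExecutorService edt;   // stand-in for an event dispatch thread
    private volatile Thread edtThread;   // set when the executor creates its worker

    volatile int width, height;          // "window" state, written only on the EDT
    volatile String lastMutator;         // name of the thread that last resized

    EdtConfinementSketch() {
        edt = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "sketch-EDT");
            t.setDaemon(true);
            edtThread = t;
            return t;
        });
    }

    /** Runs task on the EDT and waits; runs it inline if already on the EDT. */
    void invokeAndWait(Runnable task) {
        if (Thread.currentThread() == edtThread) {
            task.run(); // re-entrant call: queueing and waiting here would self-deadlock
            return;
        }
        try {
            edt.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    /** Mutable operation: callable from any thread, always executed on the EDT. */
    void setSize(int w, int h) {
        invokeAndWait(() -> {
            width = w;
            height = h;
            lastMutator = Thread.currentThread().getName();
        });
    }

    void shutdown() {
        edt.shutdown();
    }

    public static void main(String[] args) {
        EdtConfinementSketch window = new EdtConfinementSketch();
        window.setSize(640, 480); // called from main, executed on sketch-EDT
        System.out.println("resized to " + window.width + "x" + window.height
                + " on " + window.lastMutator);
        window.shutdown();
    }
}
```

Note that a blocking hand-off like invokeAndWait can itself deadlock if the caller holds a lock that a task on the dispatch thread needs, so the caller should not hold such locks while waiting.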
Reliable reproduction of deadlock:
http://jogamp.org/git/?p=jogl.git;a=commit;h=a349db5086a7be7dd80fc2ad29a8a4b55f343e01

Fix:
http://jogamp.org/git/?p=jogl.git;a=commit;h=f24844c5e6c57a43df79224f2d3a89e9720726f7
The following JOGL commits complement the SWT/AWT deadlock fix:

b6fa407d4bf19ef9fe387454b5eeca68853532b9
8cf694c1424277e6358039a964ecd75c54cf9af9
17dd761d7c2b224f0505a399bf4ecb18634e9250

Commit 8cf694c1424277e6358039a964ecd75c54cf9af9 also allows usage of SWT 4.3.0.
I've been unable to reproduce the deadlock after the fix, so that part is good. However, assuming all my builds are good, the problem discussed at the forum link below appears to have come back:

http://forum.jogamp.org/Reshape-Problem-Using-NewtCanvasSWT-td4026385.html

Not exactly sure of the right way to report this with git, but the last commit I see in my project from Sven's master using "git log" says:

commit f25b5c973150252af5c5fbf4ca87b03e2e9aee32
Author: Sven Gothel <sgothel@jausoft.com>
Date: Tue Nov 27 19:04:20 2012 +0100
(In reply to comment #9)
> I've been unable to reproduce the deadlock after the fix, so that part is good.
> However, assuming all my builds are good, the problem discussed at the forum
> link below appears to have come back:
>
> http://forum.jogamp.org/Reshape-Problem-Using-NewtCanvasSWT-td4026385.html
>
>
> Not exactly sure of the right way to report this with git, but the last commit
> I see in my project from Sven's master using "git log" says:
>
> commit f25b5c973150252af5c5fbf4ca87b03e2e9aee32
> Author: Sven Gothel <sgothel@jausoft.com>
> Date: Tue Nov 27 19:04:20 2012 +0100

Well, I don't know - I just passed all SWT unit tests, incl.:
<http://jogamp.org/git/?p=jogl.git;a=blob;f=src/test/com/jogamp/opengl/test/junit/jogl/swt/TestNewtCanvasSWTBug628ResizeDeadlock.java;h=a0874e609dfd28da3c220b65948a8cd6943f076d;hb=HEAD>

If you still have trouble, please try to adapt this unit test to reproduce the freeze.
Hi Sven,

Just to be sure I was clear... I was *not* having problems with the freeze anymore; that all worked great after your fix for this bug.

However, an earlier problem where expose, resize, etc events would not get delivered seems to have reappeared. You had fixed this problem a while back, so it looks like a regression (?). The forum post I mentioned discussed this issue. Can't remember if we filed a bug on it.

Are you saying that you aren't seeing the event delivery issue in your latest build, and/or you have a test case for it that suggests the problem is not there in a build from your current master?
(In reply to comment #11)
> Hi Sven,
>
> Just to be sure I was clear... I was *not* having problems with the freeze
> anymore; that all worked great after your fix for this bug.

Ah .. ok, thx a lot - I was confused w/ the many SWT issues :)

> However, an earlier problem where expose, resize, etc events would not get
> delivered seems to have reappeared. You had fixed this problem a while back,
> so it looks like a regression (?). The forum post I mentioned discussed this
> issue. Can't remember if we filed bug on it.

No bug report, but I fixed the 'sendReshape' in SWT GLCanvas updateSizeCheck().

> Are you saying that you aren't seeing the event delivery issue in your latest
> build, and/or you have a test case for it that suggests the problem is not
> there in a build from your current master?

I did overhaul our SWT GLCanvas, see:
<http://jogamp.org/git/?p=jogl.git;&a=commit&h=7cb6cf2a9708d3f4e06f2215eb0d06b00fa6cd15>

Tested manually here on X11 and Windows .. Jenkins is busy now, let's see.
I just now (12/6/2012, noon local time) sync'd up with your master and tried this again, and at the moment everything appears to work correctly. No deadlock, and reshapes are happening when they should. This is most excellent.

FWIW the last commit I see from you in my clone is 7a6f6b7a5b028e918a843de9fe16c38da75edba9.