Summary: | Hardening Condition-Wait from Spurious-Wakeups and unintended InterruptedException(s) | ||
---|---|---|---|
Product: | [JogAmp] General | Reporter: | Xerxes Rånby <xerxes> |
Component: | source_code | Assignee: | Sven Gothel <sgothel> |
Status: | RESOLVED FIXED | ||
Severity: | major | ||
Priority: | P2 | ||
Version: | 2.3.2 | ||
Hardware: | All | ||
OS: | all | ||
Type: | --- | SCM Refs: |
5f5553f1c0b6731970db6df24d79654661238247
d04841599ab2ed181f081ff7fdd38ac4ef65ca34
34d54a9af4413eab840ef9055400e2f5845b4f3a
3e5c410e1d0ae42c68a7ab1342a7da96ef523a4b
e1bf8dd6a24796589bcfcdc6dd66c5a4911d4dcd
cf9b31c30de3768447b20d6aa31ec1df00595871
gluegen 47495cd2a228534578731346c8baf2b190bcd241
gluegen 2b97ed6d238a0db1e2cf7bdf15349e90eaaa8dfc
gluegen 1c4e2d3ea379fe6578dfb84e10f22729b71b1ae5
gluegen 3e40d97a9a7a60e746b3703d2c7d3f4884159a52
jogl 68c8e39fa8d6e700f0a99241c1a01a435b7f6284
jogl 9c23064a6df1e3ef66a715759ee801a80ef516bd
jogl 60e906330feaac485dfea60734573703a3973e36
|
Workaround: | --- | ||
Bug Depends on: | 1213 | ||
Bug Blocks: |
Description
Xerxes Rånby
2015-08-31 16:04:01 CEST
A good discussion about InterruptedException: <http://www.ibm.com/developerworks/library/j-jtp05236/>.

+++

It doesn't seem to be our code interrupting and throwing the InterruptedException. However, we do block at this point, waiting for the task to be finished. Some other thread interrupts us, telling us to stop - in a not so nice way :)

The question is now, how shall we deal w/ it?

Note: The method in question returns a boolean for success!

Solution-1:
One way could be to declare the method 'throws InterruptedException', but this forces the user to handle it. This may lead to bloated code catching the InterruptedException while not really dealing w/ it.
Note: We are using the EDTUtil a lot in our code base.

Solution-2:
One solution in the above discussion is to simply set the interrupt state of the thread again, allowing the user to learn about it while not changing the API (throws InterruptedException). However, since the interrupted state is utilized by a lot of modules, it might cause undesired side effects and 'false negatives'.
- Method will not propagate InterruptedException (via RuntimeException)!
- Method shall return FALSE if interrupted?!

Solution-3:
We could also _add_ a method to ask our EDTUtil whether the last call was interrupted; this way we would keep the Thread instance clean.
- Method will not propagate InterruptedException (via RuntimeException)!
- Method shall return FALSE if interrupted?!

Option-1:
Optionally, we could add an 'EDTUtil.UncaughtExceptionHandler' allowing the user or our code to deal w/ such events.

Hence .. I prefer Solution-3 here, maybe plus Option-1. This way the application continues to operate and there are means to learn about the event (InterruptedException).

Also, spurious wakeups may ask for our thread-wait code to become more robust:
<http://opensourceforgeeks.blogspot.in/2014/08/spurious-wakeups-in-java-and-how-to.html>

Example of robust thread-wait is: SharedResourceRunner.doAndWait(..).

Here an InterruptedException is swallowed (well -> Noncancelable task) and the condition is tested in a while loop.

(In reply to comment #3)
> Example of robust thread-wait is:SharedResourceRunner.doAndWait(..).
>
> Here an InterruptedException is swallowed (well -> Noncancelable task)
> and the condition is tested in a while loop.

Classification
===============
Persistent-Wait: wait inside while-loop on condition
Simple-Wait: just using wait w/o while-loop on condition
Cancelable: Propagating InterruptedException
Noncancelable: Swallows InterruptedException, implies Persistent-Wait!

Recommendation
================
At least we shall: Simple-Wait -> Persistent-Wait

+++

Then we shall think about whether the root source of an InterruptedException is intended to end waiting for the condition.

The intention is clearly given if our code interrupts the waiting thread! However, if this is not the case, it is often unknown what and who causes the interrupt.

If there is a clear intent for InterruptedException, then the method should of course be Cancelable.

If there is no known intent for InterruptedException, then the method may be Noncancelable. Here .. we may utilize at least a timeout?

Classification of Object.wait() calls within jogamp ...
=======================================================

Robust cases:

- Noncancelable + Persistent-Wait
  - GLMediaPlayerImpl.StreamWorker.run() .. et al.
  - Animator.run()
  - SharedResourceRunner.doAndWait()
  - SharedResourceRunner.start()
  - SharedResourceRunner.stop()
  - DefaultEDTUtil.waitUntilIdle()
  - DefaultEDTUtil.waitUntilStopped()
  - AWTEDTUtil.waitUntilStopped()
  - SWTEDTUtil.waitUntilStopped()

- Cancelable + Persistent-Wait:
  - LFRingbuffer.getImpl(..)
  - LFRingbuffer.waitForFreeSlots(int)
  - SyncedRingbuffer.getImpl(..)
  - SyncedRingbuffer.waitForFreeSlots(int)
  - SharedResourceRunner.run()

Weak cases:

> Swallows-InterruptedException + Simple-Wait -> *BAD*
  - DisplayImpl.enqueueEvent(..)
  - AndroidGLMediaPlayerAPI14.getNextTextureImpl(..)
  - SingletonInstanceServerSocket.Server.shutdown()
  - SingletonInstanceServerSocket.Server.start()
  - com.jogamp.audio.windows.waveout.Mixer.shutdown()
  - GLWorkerThread.start()

  This is especially bad, since the InterruptedException is swallowed but the method does not continue to wait until the condition is reached _and_ the caller is *not* informed that the job has not been done.

> Cancelable + Simple-Wait
  - FunctionTask.invoke(..)
  - RunnableTask.invoke(..)
  - RunnableTask.invokeOnNewThread(..)
  - GLDrawableHelper.invoke(..)
  - GLWorkerThread.invokeAndWait(..)
  - DefaultEDTUtil.invokeImpl(..)
  - AWTEDTUtil.invokeImpl(..)
  - SWTEDTUtil.invokeImpl(..)
  - OSXUtil.RunOnMainThread(..)

- For the one issue 'DefaultEDTUtil.invokeImpl(DefaultEDTUtil.java:238)', I will override Thread.interrupt() to see who causes it ..

I can still re-produce this bug using http://jogamp.org/deployment/archive/master/gluegen_886-joal_612-jogl_1433-jocl_1079-signed/

my best recipe to re-produce it is by running:

while [ $? -ne 1 ]; do java -cp jogamp-fat.jar com.jogamp.newt.opengl.GLWindow; done

and while it is running try use other applications on the system. the program will stop when it has found the bug and the stack trace is the same as in the first description.
Requesting: GLCaps[rgba 8/8/8/0, opaque, accum-rgba 0/0/0/0, dp/st/ms 16/0/0, dbl, mono , hw, GLProfile[GL4bc/GL4bc.hw], on-scr[.]]
Exception in thread "main" java.lang.RuntimeException: java.lang.InterruptedException
    at jogamp.newt.DefaultEDTUtil.invokeImpl(DefaultEDTUtil.java:249)
    at jogamp.newt.DefaultEDTUtil.invoke(DefaultEDTUtil.java:163)
    at jogamp.newt.DisplayImpl.runOnEDTIfAvail(DisplayImpl.java:427)
    at jogamp.newt.WindowImpl.runOnEDTIfAvail(WindowImpl.java:2736)
    at jogamp.newt.WindowImpl.setSize(WindowImpl.java:1353)
    at com.jogamp.newt.opengl.GLWindow.setSize(GLWindow.java:588)
    at com.jogamp.newt.opengl.GLWindow.main(GLWindow.java:1081)
Caused by: java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at jogamp.newt.DefaultEDTUtil.invokeImpl(DefaultEDTUtil.java:238)
    ... 6 more

Adding unit test to identify Thread.interrupt() caller for DefaultEDTUtil.invokeImpl(..) wait interruption:
5f5553f1c0b6731970db6df24d79654661238247
d04841599ab2ed181f081ff7fdd38ac4ef65ca34
34d54a9af4413eab840ef9055400e2f5845b4f3a
3e5c410e1d0ae42c68a7ab1342a7da96ef523a4b
e1bf8dd6a24796589bcfcdc6dd66c5a4911d4dcd

..

Tested on GNU/Linux desktop machines (also our test node w/ nvidia, producing the issue sometimes) with jre8, jre7 and jre6.

No interruption detected .. sadly.

(In reply to comment #6)
> my best recipe to re-produce it is by running:
>
> while [ $? -ne 1 ]; do java -cp jogamp-fat.jar
> com.jogamp.newt.opengl.GLWindow; done
>
> and while it is running try use other applications on the system. the
> program will stop when it has found the bug and the stack trace is the same
> as in the first description.
>
> Requesting: GLCaps[rgba 8/8/8/0, opaque, accum-rgba 0/0/0/0, dp/st/ms
> 16/0/0, dbl, mono , hw, GLProfile[GL4bc/GL4bc.hw], on-scr[.]]
> Exception in thread "main" java.lang.RuntimeException:
> java.lang.InterruptedException
> at jogamp.newt.DefaultEDTUtil.invokeImpl(DefaultEDTUtil.java:249)
> at jogamp.newt.DefaultEDTUtil.invoke(DefaultEDTUtil.java:163)
> at jogamp.newt.DisplayImpl.runOnEDTIfAvail(DisplayImpl.java:427)
> at jogamp.newt.WindowImpl.runOnEDTIfAvail(WindowImpl.java:2736)
> at jogamp.newt.WindowImpl.setSize(WindowImpl.java:1353)
> at com.jogamp.newt.opengl.GLWindow.setSize(GLWindow.java:588)
> at com.jogamp.newt.opengl.GLWindow.main(GLWindow.java:1081)
> Caused by: java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:503)
> at jogamp.newt.DefaultEDTUtil.invokeImpl(DefaultEDTUtil.java:238)
> ... 6 more

(02:11:23 PM) xranby: java -version
(02:11:23 PM) xranby: java version "1.7.0_79"
(02:11:23 PM) xranby: OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1)
(02:11:23 PM) xranby: OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)
(02:11:33 PM) sgothel: on x86_64?
(02:11:41 PM) xranby: yes x86_64

But no luck here .. so far.

The new unit test works similar as GLWindow.main .., which here is interrupted even before window visibility!
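The idea of overriding Thread.interrupt() to see who causes the interrupt can be sketched roughly as below. This is only an illustration using plain JDK classes, not the code of the actual unit test; the class name TracingThread and the method lastInterruptSource() are made up for this sketch.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: a Thread that remembers who called interrupt() on it.
public class TracingThread extends Thread {
    // Holds a Throwable created at the most recent interrupt() call site.
    private final AtomicReference<Throwable> lastSource = new AtomicReference<>();

    public TracingThread(Runnable target, String name) {
        super(target, name);
    }

    @Override
    public void interrupt() {
        // interrupt() runs on the *calling* thread, so this Throwable's
        // stack trace shows where the interrupt request came from.
        lastSource.set(new Throwable(getName() + ".interrupt() called on thread "
                + Thread.currentThread().getName()));
        super.interrupt();
    }

    /** Returns the recorded call site of the last interrupt(), or null if none. */
    public Throwable lastInterruptSource() {
        return lastSource.get();
    }

    public static void main(String[] args) throws Exception {
        final Object lock = new Object();
        TracingThread t = new TracingThread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(); // blocks until interrupted
                } catch (InterruptedException e) {
                    // expected in this demo
                }
            }
        }, "Waiter");
        t.start();
        t.interrupt();
        t.join();
        t.lastInterruptSource().printStackTrace(); // shows main() as the caller
    }
}
```

Note that this only catches interrupts issued via the overridden method, i.e. on threads that were created as this subclass.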
Running 2 JVMs in parallel sometimes interrupt the test-thread:
  [1]: sun.misc.Resource.getBytes(Resource.java:149)
  [2]: java.net.URLClassLoader.defineClass(URLClassLoader.java:273)

as caught w/ a modified version of the test case:

Caught Ignored MyInterruptedException: MyMainThread.testInterrupted -> TRUE (current true, counter 13) on thread MyMainThread
  [0]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT$MyThread.testInterrupted(TestBug1211IRQ00NEWT.java:132)
  [1]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT.test00(TestBug1211IRQ00NEWT.java:233)
  [2]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT$3.run(TestBug1211IRQ00NEWT.java:352)
  [3]: java.lang.Thread.run(Thread.java:679)
  [4]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT$MyThread.run(TestBug1211IRQ00NEWT.java:147)
Caused[1] by Throwable: MyMainThread.interrupt() ************************************************* - count 13 on thread MyMainThread
  [0]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT$MyThread.interrupt(TestBug1211IRQ00NEWT.java:153)
  [1]: sun.misc.Resource.getBytes(Resource.java:149)
  [2]: java.net.URLClassLoader.defineClass(URLClassLoader.java:273)
  [3]: java.net.URLClassLoader.access$000(URLClassLoader.java:73)
  [4]: java.net.URLClassLoader$1.run(URLClassLoader.java:212)
  [5]: java.security.AccessController.doPrivileged(Native Method)
  [6]: java.net.URLClassLoader.findClass(URLClassLoader.java:205)
  [7]: java.lang.ClassLoader.loadClass(ClassLoader.java:321)
  [8]: sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
  [9]: java.lang.ClassLoader.loadClass(ClassLoader.java:266)
  [10]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT.test00(TestBug1211IRQ00NEWT.java:185)
  [11]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT$3.run(TestBug1211IRQ00NEWT.java:352)
  [12]: java.lang.Thread.run(Thread.java:679)
  [13]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT$MyThread.run(TestBug1211IRQ00NEWT.java:147)

(In reply to comment #9)
ubuntu1~12.04.1
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.5) (6b24-1.11.5-0ubuntu1~12.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Note: Not occurring on Debian8 w/ Oracle 1.8.0_60

(In reply to comment #9)
> Running 2 JVMs in parallel sometimes interrupt the test-thread:
> [1]: sun.misc.Resource.getBytes(Resource.java:149)
> [2]: java.net.URLClassLoader.defineClass(URLClassLoader.java:273)
>

It seems this is a bug within the JRE, i.e. not clearing the interrupt state?
Ignoring this incident allows the application to continue properly.

Retesting w/ Ubuntu 14.04

It also seems that an interrupt() is issued at RecursiveThreadGroupLockImpl01Unfairish.unlock(), see below.

This issue is tracked as Bug 1213.

Caught causing Throwable: MyMainThread.interrupt() ************************************************* - count 1 on thread MyMainThread-SharedResourceRunner
  [0]: com.jogamp.opengl.test.junit.newt.TestBug1211IRQ00NEWT$MyThread.interrupt(TestBug1211IRQ00NEWT.java:151)
  [1]: jogamp.common.util.locks.RecursiveThreadGroupLockImpl01Unfairish.unlock(RecursiveThreadGroupLockImpl01Unfairish.java:195)
  [2]: jogamp.common.util.locks.RecursiveLockImpl01Unfairish.unlock(RecursiveLockImpl01Unfairish.java:268)
  [3]: com.jogamp.opengl.GLProfile.initSingleton(GLProfile.java:243)
  [4]: com.jogamp.opengl.GLProfile.isAvailable(GLProfile.java:300)
  [5]: jogamp.opengl.egl.EGLDrawableFactory$SharedResourceImplementation.mapAvailableEGLESConfig(EGLDrawableFactory.java:666)
  [6]: jogamp.opengl.egl.EGLDrawableFactory$SharedResourceImplementation.createEGLSharedResourceImpl(EGLDrawableFactory.java:613)
  [7]: jogamp.opengl.egl.EGLDrawableFactory$SharedResourceImplementation.createSharedResource(EGLDrawableFactory.java:516)
  [8]: jogamp.opengl.SharedResourceRunner.run(SharedResourceRunner.java:322)
  [9]: java.lang.Thread.run(Thread.java:679)

(In reply to comment #12)
> It also seems that an interrupt() is issued
> at RecursiveThreadGroupLockImpl01Unfairish.unlock(), see below.
>
> This issue is tracked as Bug 1213.

Shows that swallowing interrupts is never a good idea.

In case of comment 9, java 1.6 interrupt issue at class-loading, we shall keep exposing these kinds of bugs IMHO.

cf9b31c30de3768447b20d6aa31ec1df00595871:
TestBug1211IRQ00NEWT: Cover whole JOGL initialization by MyThread, intercepting all interrupt() calls used.

This disclosed issue Bug 1213, "RecursiveThreadGroupLockImpl01Unfairish.unlock() always interrupts original-owner, even if not waiting at unlock()".

So far no InterruptedException nor interrupt() call has been detected by MyThread.

gluegen 47495cd2a228534578731346c8baf2b190bcd241
gluegen 2b97ed6d238a0db1e2cf7bdf15349e90eaaa8dfc
gluegen 1c4e2d3ea379fe6578dfb84e10f22729b71b1ae5
gluegen 3e40d97a9a7a60e746b3703d2c7d3f4884159a52
jogl 68c8e39fa8d6e700f0a99241c1a01a435b7f6284
jogl 9c23064a6df1e3ef66a715759ee801a80ef516bd

Updated classification reflecting current code:

- Noncancelable + Persistent-Wait
  - GLMediaPlayerImpl.StreamWorker thread (changed)
    - pauses thread in case of intr

- Cancelable + Persistent-Wait:
  - LFRingbuffer.getImpl(..)
  - LFRingbuffer.waitForFreeSlots(int)
  - SyncedRingbuffer.getImpl(..)
  - SyncedRingbuffer.waitForFreeSlots(int)
  - FunctionTask.invokeOnNewThread(..) (changed)
  - RunnableTask.invokeOnNewThread(..) (changed)
  - SharedResourceRunner.run()
  - SharedResourceRunner.doAndWait() (changed)
  - SharedResourceRunner.start() (changed)
  - SharedResourceRunner.stop() (changed)
  - GLMediaPlayerImpl.StreamWorker ctor (changed)
  - GLMediaPlayerImpl caller thread actions do*() (changed)
  - AndroidGLMediaPlayerAPI14.getNextTextureImpl(..) (changed)
  - DisplayImpl.enqueueEvent(..) (changed)
    -> Persistent-Wait
    -> Cancels wait and NEWTEvent
    -> dispatchMessage(NEWTEventTask): always notifyCaller!
  - GLDrawableHelper.invoke(..) (changed)
  - DefaultEDTUtil.waitUntilIdle() (changed)
  - DefaultEDTUtil.waitUntilStopped() (changed)
  - DefaultEDTUtil.invokeImpl(..) (changed)
  - DefaultEDTUtil.NEDT.run(..) (changed)
  - AWTEDTUtil.waitUntilStopped(..) (changed)
  - AWTEDTUtil.invokeImpl(..) (changed)
  - AWTEDTUtil.NEDT.run(..) (changed)
  - SWTEDTUtil.invokeImpl(..) (changed)
  - SWTEDTUtil.waitUntilStopped(..) (changed)
  - SWTEDTUtil.NEDT.run(..) (changed)
  - GLWorkerThread.invokeAndWait(..)
  - GLWorkerThread.start() (changed)
  - GLWorkerThread.WorkerRunnable.run() (changed)
  - Animator.run() (changed)
  - AnimatorBase.finishLifecycleAction() (changed)
  - OSXUtil.RunOnMainThread(..) (changed)
  - SingletonInstanceServerSocket.Server.shutdown() (changed)
  - SingletonInstanceServerSocket.Server.start() (changed)
  - com.jogamp.audio.windows.waveout.Mixer.shutdown() (changed)

- Extending/Using InterruptSource.Thread (changed)
  - DefaultEDTUtil.NEDT
  - AWTEDTUtil.NEDT
  - SWTEDTUtil.NEDT
  - GLWorkerThread.thread
  - Mixer.FillerThread
  - Mixer.MixerThread

- Using InterruptSource.Thread (changed)
  - TempFileCache
  - LauncherTempFileCache
  - Animator.thread
  - SingletonInstanceServerSocket.Server.serverThread

Deprecated:
  - FunctionTask.invoke(..) (changed) -> on current thread, no wait -> deprecated
  - RunnableTask.invoke(..) (changed) -> on current thread, no wait -> deprecated

gluegen 47495cd2a228534578731346c8baf2b190bcd241
gluegen 2b97ed6d238a0db1e2cf7bdf15349e90eaaa8dfc
gluegen 1c4e2d3ea379fe6578dfb84e10f22729b71b1ae5
gluegen 3e40d97a9a7a60e746b3703d2c7d3f4884159a52

Note, above GlueGen commit-messages have a typo: 'Bug 1213' - should read 'Bug 1211'

commit 60e906330feaac485dfea60734573703a3973e36
Bug 1211: Refine NEWTDemoListener, JOGLNewtAppletBase
- NEWTDemoListener.createPointerIcons(..)
  - Use Display instance
  - Simplify PointerIcon creation using a list, skipping all non-found resources.
- JOGLNewtAppletBase
  - Bring back reparent action via key 'r'
  - Drop redundant PointerIcon, using NEWTDemoListener

An interesting InterruptedException caught in an applet, w/ its source reported by the new code:

Caught handled SourcedInterruptedException: null on thread thread applet-com.jogamp.newt.awt.applet.JOGLNewtApplet1Run-2-SharedResourceRunner
  [0]: com.jogamp.common.util.SourcedInterruptedException.wrap(SourcedInterruptedException.java:82)
  [1]: com.jogamp.common.util.SourcedInterruptedException.wrap(SourcedInterruptedException.java:63)
  [2]: jogamp.opengl.SharedResourceRunner.run(SharedResourceRunner.java:338)
  [3]: java.lang.Thread.run(Thread.java:745)
Caused[1] by InterruptedException: null on thread thread applet-com.jogamp.newt.awt.applet.JOGLNewtApplet1Run-2-SharedResourceRunner
  [0]: java.lang.Object.wait(Native Method)
  [1]: java.lang.Object.wait(Object.java:502)
  [2]: jogamp.opengl.SharedResourceRunner.run(SharedResourceRunner.java:334)
  [3]: java.lang.Thread.run(Thread.java:745)
InterruptSource[2] by Throwable: thread applet-com.jogamp.newt.awt.applet.JOGLNewtApplet1Run-2-SharedResourceRunner.interrupt() #1 on thread thread applet-com.jogamp.newt.awt.applet.JOGLNewtApplet1Run-2-SharedResourceRunner
  [0]: com.jogamp.common.util.InterruptSource$Thread.interrupt(InterruptSource.java:152)
  [1]: java.lang.ThreadGroup.interrupt(ThreadGroup.java:639)
  [2]: sun.awt.AppContext.dispose(AppContext.java:505)
  [3]: com.sun.deploy.uitoolkit.impl.awt.AWTAppContext$AppContextDisposer.run(Unknown Source)
  [4]: java.lang.Thread.run(Thread.java:745)
|
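For reference, the Persistent-Wait classification used in this report (wait inside a while-loop on the condition), in both its Cancelable and Noncancelable flavor, can be sketched roughly as follows. This is a minimal illustration on plain Object.wait(), not JogAmp code; the class PersistentWait and its method names are made up for this sketch. Restoring the interrupt flag in the Noncancelable variant follows the advice of the article linked in the description and is an assumption here, not a statement about what the changed methods do.

```java
// Hypothetical sketch of Persistent-Wait in Cancelable and Noncancelable form.
public class PersistentWait {
    private final Object lock = new Object();
    private boolean done = false; // the condition, guarded by 'lock'

    /** Marks the condition as reached and wakes up all waiters. */
    public void signalDone() {
        synchronized (lock) {
            done = true;
            lock.notifyAll();
        }
    }

    /** Cancelable + Persistent-Wait: loops on the condition, propagates the interrupt. */
    public void awaitCancelable() throws InterruptedException {
        synchronized (lock) {
            while (!done) {   // the loop guards against spurious wakeups
                lock.wait();  // an InterruptedException ends the wait for the caller
            }
        }
    }

    /** Noncancelable + Persistent-Wait: keeps waiting, remembers the interrupt. */
    public void awaitNoncancelable() {
        boolean interrupted = false;
        synchronized (lock) {
            while (!done) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    interrupted = true; // do not give up, the condition is not reached yet
                }
            }
        }
        if (interrupted) {
            // Assumption: re-assert the flag so the caller can still learn about it.
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        PersistentWait pw = new PersistentWait();
        Thread waiter = new Thread(pw::awaitNoncancelable, "Waiter");
        waiter.start();
        waiter.interrupt(); // does not end the noncancelable wait
        Thread.sleep(100);
        System.out.println("waiter alive after interrupt: " + waiter.isAlive());
        pw.signalDone();    // only reaching the condition ends it
        waiter.join();
    }
}
```

A Simple-Wait would replace the while-loop by a single wait() call, which returns on any wakeup whether or not the condition holds.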