| Summary: | Remove XInitThreads() (and hence XLockDisplay/XUnlockDisplay) - Make X11 Display usage lock free. | ||
|---|---|---|---|
| Product: | [JogAmp] Jogl | Reporter: | Sven Gothel <sgothel> |
| Component: | x11 | Assignee: | Sven Gothel <sgothel> |
| Status: | RESOLVED FIXED | ||
| Severity: | enhancement | CC: | alexander.lex, andres.colubri, bforrester, dp, gouessej, wwalker3, xerxes |
| Priority: | P1 | ||
| Version: | 2 | ||
| Hardware: | All | ||
| OS: | all | ||
| Type: | --- | SCM Refs: |
jogl fbe331f013608eb31ff0d8675f4e4c9881c9c48b
jogl 21b85e647f1e661c8e5e49caa91c564a3d041df2
jogl f7ca4df7b4e6b2acbdb941f67f6d7058db134d91
|
| Workaround: | --- | ||
| Bug Depends on: | 623 | ||
| Bug Blocks: | 603 | ||
Added people to CC list who could be helping making a decision .. See branch x11_no_xinitthreads. I have tested the new branch on OpenIndiana x86_64 NV and Linux x86_64 (NV, AMD, Intel) with our elaborate unit tests. So far .. no downsides. Ofc some NEWT Display lifecycle tests fail. No deadlock or whatsoever. The total test time is similar, a few milliseconds faster than the master branch, but this delta could be within the measurement inaccuracy. Hi (In reply to comment #0) > Sorry for my very long writeup, but since the matter is quite complex > and important to our X11 port I believe being more verbose and detailed > is hurting less as if being too brief. No, it's fine for me, thank you very much. (In reply to comment #3) > So far .. no downsides. Ofc some NEWT Display lifecycle tests fail. > No deadlock or whatsoever. > Which one? Can you explain a bit why the new branch uses more display resources? Is it because you don't reuse some handles? I can give it a try but I have no heavy test case, a first person shooter using a single NEWT window will probably not exhibit problematic behaviours (but I can use some other Ardor3D testcases). If we really know what we are doing, we should do this change in JOGL 2.0, not in another major version. If it is too risky, we should postpone it to the next minor release. (In reply to comment #4) > Hi > > (In reply to comment #3) > > So far .. no downsides. Ofc some NEWT Display lifecycle tests fail. > > No deadlock or whatsoever. > > > Which one? com.jogamp.opengl.test.junit.newt.TestDisplayLifecycle02NEWT com.jogamp.opengl.test.junit.newt.TestWindows01NEWT Both check whether the NEWT Display is being reused .. etc. So breakage here is expected. > > Can you explain a bit why the new branch uses more display resources? Is it > because you don't reuse some handles? B/c [1] we don't reuse the NEWT Display itself, hence each NEWT window will at least have one more DPY for the surface - we may can reuse the EDT and it's DPY, if we like that idea. The latter would reduce the costs of DPY by 50% and reproduce current behavior of having one EDT w/ it's DPY for one physical connection. > > I can give it a try but I have no heavy test case, a first person shooter using > a single NEWT window will probably not exhibit problematic behaviours (but I > can use some other Ardor3D testcases). That would be something, yes. Maybe some troubles disappear ? > > If we really know what we are doing, we should do this change in JOGL 2.0, not > in another major version. If it is too risky, we should postpone it to the next > minor release. Agreed. That is why I am so verbose here .. and trying to get make a risk assessment. Highly likely this approach will remove currently existing bugs appearing within 'environments'. Sure this lock free approach will cost a few DPYs .. but as you mentions, this is not your everyday scenario (w/ 100 windows open -> 101 or 200 DPY .. etc). (In reply to comment #5) > That would be something, yes. Maybe some troubles disappear ? > Is there an aggregated build of this branch anywhere? I would like to test all canvases of Ardor3D with this branch, some test cases with multiple canvases (especially the one with 6 canvases). I can work a bit on that during this night or tomorrow evening. I think it will be great if we can get rid of XInitThreads(). That call makes it very difficult to use JOGL safely inside an Eclipse RCP app, since Eclipse itself is drawing stuff (or at least making XWindows calls) before any JOGL code can be invoked. In most app frameworks, it's probably very difficult to insure that JOGL's static initializer is called before any part of the framework does anything with XWindows. Also, as others have pointed out in the forum, there's no documentation saying that it's safe to call XInitThreads() more than once, so it's probably best not to call it at all. I will test the new branch soon, I just have to travel for work next week so I can't test it right away. (In reply to comment #7) > > I will test the new branch soon, I just have to travel for work next week so I > can't test it right away. Please let me know if you plan to test it under Cent OS Linux too. I'm happy to report that we tested it on our code base and it seems to fix the threading errors we were receiving. Should we expect to see this branch merged into rc11 ?? (In reply to comment #9) > I'm happy to report that we tested it on our code base and it seems to fix the > threading errors we were receiving. Should we expect to see this branch merged > into rc11 ?? Tested under Ubuntu 10.x and 12.x 64 bit.. (In reply to comment #9) > Should we expect to see this branch merged > into rc11 ?? We have to test it a bit more and Sven will have the last word. I agree w/ all of you .. i.e. a bit more testing and favor the merge due to simplified X11 locking, read: X11 lock-free. Plus, it removes the XInitThreads() issue completly .. great. Yes, it will be in RC11 .. more testing will be done while merging it, I will keep another branch alive with xinitthreads .. so we can compare, just in case. Thank you for testing! ~Sven Commit fbe331f013608eb31ff0d8675f4e4c9881c9c48b <http://jogamp.org/git/?p=jogl.git;a=commit;h=fbe331f013608eb31ff0d8675f4e4c9881c9c48b>: +++ Fix Bug 616: X11: Remove XInitThreads() dependency while cleaning up device locking, resulting in a native-lock-free impl. The X11 implementation details of NativeWindow and NEWT used the X11 implicit locking facility XLockDisplay/XUnlockDisplay, enabled via XInitThreads(). The latter useage is complicated within an unsure environment where the initialization point of JOGL is unknown, but XInitThreads() requires to be called once and before any other X11 calls. The solution is simple and thorough, replace native X11 locking w/ 'application level' locking. Following this pattern actually cleans up a pretty messy part of X11 NativeWindow and NEWT, since the generalization of platform independent locking simplifies code. Simply using our RecursiveLock also speeds up locking, since it doesn't require JNI calls down to X11 anymore. It allows us to get rid of X11ToolkitLock and X11JAWTToolkitLock. Using the RecursiveLock also allows us to remove the shortcut of explicitly createing a NullToolkitLocked device for 'private' display connections. All devices use proper locking as claimed in their toolkit util 'requiresToolkitLock()' in X11Util, OSXUtil, .. Further more a bug has been fixed of X11ErrorHandler usage, i.e. we need to keep our handler alive at all times due to async X11 messaging behavior. This allows to remove the redundant code in X11/NEWT. The AbstractGraphicsDevice lifecycle has been fixed as well, i.e. called when closing NEWT's Display for all driver implementations. On the NEWT side the Display's AbstractGraphicsDevice semantics has been clarified, i.e. it's usage for EDT and lifecycle operations. Hence the X11 Display 2nd device for rendering operations has been moved to X11 Window where it belongs - and the X11 Display's default device used for EDT/lifecycle-ops as it should be. This allows running X11/NEWT properly with the default usage, where the Display instance and hence the EDT thread is shared with many Screen and Window. Rendering using NEWT Window is decoupled from it's shared Display lock via it's own native X11 display. Lock free AbstractGraphicsDevice impl. (Windows, OSX, ..) don't require any attention in this regard since they use NullToolkitLock. Tests: ====== This implementation has been tested manually with Mesa3d (soft, Intel), ATI and Nvidia on X11, Windows and OSX w/o any regressions found in any unit test. Issues on ATI: ============== Only on ATI w/o a composite renderer the unit tests expose a driver or WM bug where XCB claims a lack of locking. Setting env. var 'LIBXCB_ALLOW_SLOPPY_LOCK=true' is one workaround if users refuse to enable compositing. We may investigate this issue in more detail later on. Reopened due to Bug 623 [dependency]. <http://forum.jogamp.org/Pls-test-new-JOGL-aggregated-build-pre-RC11-no-XInitThreads-td4026312.html> Pls test .. despite Bug 623 .. Using workaround of Bug 623 remedy .. Added fix 21b85e647f1e661c8e5e49caa91c564a3d041df2
Fix commit fbe331f013608eb31ff0d8675f4e4c9881c9c48b (Bug 616 - Remove XInitThreads())
X11Util/Native: Fix X11Util_initialize0() arguments were wrong and code still invoked XInitThreads() .. woops;
Added missing included "jogamp_nativewindow_x11_X11Util.h" incl. it's generation via javah,
which was the culprit of not detecting it at compile time.
This is a fix for commit fbe331f013608eb31ff0d8675f4e4c9881c9c48b
Fixed regression of getDefaultToolkitLock() usage, commit f7ca4df7b4e6b2acbdb941f67f6d7058db134d91 |
Remove XInitThreads() (and hence XLockDisplay/XUnlockDisplay) - Make X11 Display usage lock free. It is known that calling XInitThreads() may cause confusion w/ other toolkits like GDK/GTK and if running within other [Java] IDEs where it is unclear what the X11 locking policy is - or whether [X11] code did already run. In case of the latter, a call to XInitThreads() is a violation of the spec. Basics: X11 is network based, using fast local transfers or other slow media. A X11 display connection (DPY) communicates w/ a device (X server) which may be local (fast) or remote (slow). One site can have multiple connection names, e.g. X server etc. There is a DPY limit, 255 connections are allowed _to_ one X11 server (?) 1 DPY gives you one communication queue to the server, each is intrinsically thread safe if used by one thread at a time. If sharing a DPY one would either need to use [1] XLockDisplay/XUnlockDisplay, which requires XInitThreads to work [2] Application level locking Depending on the impl. [1] is quite fast today (XCB backend). A bad example of [2] is AWT, i.e. using one global lock for the whole toolkit (on X11). In general it would be a very bad idea to use one DPY for the whole 'location'. Stuttering and other blocking artifacts would be experienced .. as you are used w/ the one-for-all EDT architecture as experienced w/ AWT, Cocoa, SWT, .. Since the goal already achieved is to use one 'display' w/ multiple rendering instances w/o blocking, we can either use: [a] one DPY for each thread and use case [b] X11 locking The master branch uses a mix of [a] and [b] already, where a shift towards [a] happened a while ago in the offscreen surface and drawable implementation. Now lets have a view back in history .. - In the beginning .. JOGL1, followed more the EDT approach utilizing the ThreadUtil. Of course, at this time AWT and Swing were the only drawable surfaces available. You could disable the EDT funneling via properties, they still exist. However, since AWT is not implemented to be thread safe, but single threaded - your mileage may vary (YMMV). - On AWT we lock surface and the so called JAWT global lock. Hence rendering in multiple AWT surfaces (GLCanvas) from multiple threads will serve you a stuttering experience. You will also notice a lag of input, since the global lock synchronizes both, the input event queue and the surfaces - all resources. - We created NEWT and hence AWT constraints were no more required to follow. We were asked which threading model NEWT follows. The simple answer was none at all, i.e. it should be thread safe. In the end - due to conveniences, the NEWT EDT semantic was born. NEWT's EDT intend was to only dispatch events from the native queue. Nevertheless we were required to also use it to handle the native windowing lifecycle actions like create / destroy windows. In the end .. as it is of today, you can use a NEWT surface to render into from any thread, while locking it's surface and DPY. Since we used [b], a shared DPY w/ XInitThreads, we are thread-safe and resource efficient. Still .. having multiple NEWT windows operating on one shared DPY may reduce some responsiveness. This is the balance to discuss here. The current DPY situation regarding locking, open and close .. - NVidia plays nice w/ open and closing of display connections - AMD has an open/close order bug, i.e. DPY must be closed in the order they were opened, hence X11Util funnels all DPY open and close ops. If using AMD, we never close the DPY, but reuse them (not too bad). At JVM shutdown we close them in the proper order, which avoids leaks and a SEGV at JVM shutdown .. you may have been bothered with it. - XCB backend introduced another hack, it's locking bug forced us to have 1 DPY per EDT and one per 'renderer'. Read: Both make up one NEWT X11 Display, so a NEWT X11 Display opens _two_ DPY. ++++ The new branch x11_no_xinitthreads, test it! - One good unit test here demonstrating the difference of the master branch [b] and the new x11_no_xinitthreads branch [a] is TestInitConcurrentNEWT, which opens multiple NEWT windows each in it's own thread and renders them. With [b] you will notice that it takes a little time until all windows become established, where with [a] realization of all windows happens almost at the same time even though it takes longer until they all show up at once. The total time delta is almost zero, a few milliseconds pro [a]. However, resource costs for [a] explodes a little bit. 20 NEWT Windows w/ DPY: [b] == 2 DPYs, [a] == 40 DPYs Of course, this is not the regular use case .. As mentioned above, it is to discuss which solution is preferable. PRO x11_no_xinitthreads branch: - more clean and less complex implementation - easier to maintain - neither requiring not bothering w/ XInitThreads, nor XLockDisplay/XUnlockDisplay - no questionable locking parts in implementation - may solve issues folks had w/ IDEs like Eclipse ? E.g. Wade and others reported different behavior when calling JOGL's initialization at a different time. - multiple NEWT windows running [each in their thread] on one display run more in a separate fashion. No interference nor blocking from other same 'display' NEWT windows PRO master branch (old style): - it's tested and used today - uses less DPY resources - much harder to hit the DPY limit of 255 connections to one server The educated gut feeling will favor 'x11_no_xinitthreads'. There has been quite some change in the NEWT X11 Display domain: NEWT: Changed Lifecycle of Display/Screen <http://jogamp.org/git/?p=jogl.git&a=search&h=HEAD&st=commit&s=Changed+Lifecycle+of+Display%2FScreen> also searching for 'X11 Display' reveals above mentioned change sets: <http://jogamp.org/git/?p=jogl.git&a=search&h=HEAD&st=commit&s=X11+Display> My conclusion so far - if you could test the new branch and: - it works well - even fixes maybe a bug (Eclipse, IDEs, .. Wade ?) - the DPY limit doesn't bother you - no other concerns .. then we could move over and clean up accordingly. Sorry for my very long writeup, but since the matter is quite complex and important to our X11 port I believe being more verbose and detailed is hurting less as if being too brief.