Created attachment 850 [details]
I have an application that can have an arbitrary number of NewtCanvasAWTs, each with an associated Animator thread and GLWindow context. All of these contexts are shared with a permanent "master" context, which is created as a dummy GLAutoDrawable via createDummyAutoDrawable().
This system works very well in 2.3.2 even with many contexts. However, in 2.4.0 I am getting massive, intermittent hangs when the library tries to make each Animator's context current.
This change in behavior can be seen clearly in the TestSharedContextListNEWT2 JUnit test (at least on my Windows 10 x64 platform).
The "TestSharedContextListNEWT2 - 2.X.X results.log" files show my results from the JUnit test in the corresponding build, as well as my system info. It can be seen that 2.4.0 takes 4.135 sec to complete vs. 2.773 sec for 2.3.2. Visually, the animation of the gears stutters in 2.4.0.
In the "TestSharedContextListNEWT2 - 2.X.X profiling.png" files, I extended the animation time of TestSharedContextListNEWT2 to gather some profiling information. It can be seen that 2.4.0 spends about three times as long in the makeCurrent() logic as 2.3.2.
I have not tested on platforms other than Windows x64. Please let me know if there is other information needed or other testing I can do.
This issue goes away when the number of contexts/Animators is reduced to one. Has something changed with the management of multiple contexts in 2.4.0? Possibly something that is resulting in more calls to makeCurrent(), or more inter-thread blocking while doing so?
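For what it's worth, the scaling symptom (fine with one context, stalls with several) is what one would expect if makeCurrent() calls were serializing on a shared lock. Here is a minimal plain-Java sketch of that pattern — hypothetical and not JOGL's actual locking code; `contextLock` and the ~1 ms "render work" are stand-ins for illustration only:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ContentionSketch {
    /** Runs 'threads' workers that each take the shared lock 'frames' times,
     *  holding it ~1 ms per frame; returns total wall-clock milliseconds. */
    static long run(int threads, int frames) throws InterruptedException {
        final ReentrantLock contextLock = new ReentrantLock(); // stand-in for a shared-context lock
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int f = 0; f < frames; f++) {
                    contextLock.lock();          // analogous to makeCurrent()
                    try {
                        Thread.sleep(1);         // simulated render work while "current"
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    } finally {
                        contextLock.unlock();    // analogous to release()
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        // Work that could run in parallel is forced to serialize on the one lock,
        // so total time grows roughly linearly with the number of animator threads.
        System.out.println("4 animators: " + run(4, 100) + " ms");
        System.out.println("1 animator : " + run(1, 100) + " ms");
    }
}
```

If 2.4.0 introduced extra locking (or lengthened the critical section) on a path shared by all contexts, this would match the roughly linear slowdown I am seeing as contexts are added.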
Created attachment 851 [details]
TestSharedContextListNEWT2 - 2.3.2 results.log
Created attachment 852 [details]
TestSharedContextListNEWT2 - 2.4.0 results.log
Created attachment 853 [details]
TestSharedContextListNEWT2 - 2.3.2 profiling.png
Created attachment 854 [details]
TestSharedContextListNEWT2 - 2.4.0 profiling.png
It would be helpful if you could test several release candidates to help us find approximately when the regression was introduced.
(In reply to Julien Gouesse from comment #5)
The regression occurs in gluegen commit 330dad069dee5a0cc0480cf5cd9052000004223a.
I have jogl built to commit 52cc68870604e406a7225d23f563ae035299dadc when testing this.
Are you really sure this commit causes this bug?
(In reply to Julien Gouesse from comment #8)
Yes. I was very surprised, too. I have been looking into it further.
I can revert all of the changes from that commit except those to build.xml and Platform.java and still reproduce the regression. These two files were responsible for the gluegen-rt.dll -> gluegen_rt.dll rename, which implies the .dll name change is itself causing the issue. I do not understand why.
To reiterate, it doesn't appear to have anything to do with these aspects of the commit:
- Recognize new Java9+ version string as of JEP 223
- Added 'JNI_OnLoad_gluegen_rt' to recognize statically linked JNI code
- Added JNI_VERSION_1_8 to jni/jni.h
I have further confirmed that the .dll name is the issue.
I can build the latest versions of gluegen and jogl with the single change of
private static final String libBaseName = "gluegen-rt";
in Platform.java (instead of "gluegen_rt"), and then manually rename the gluegen .dll produced by the build script from gluegen_rt.dll to gluegen-rt.dll. This fixes the issue.
I do not have enough experience with Java library loading to understand what is happening here. Maybe there are special considerations that occur when the .jar and .dll names match exactly?
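As a data point on how the JVM resolves native library names (this is standard Java behavior, not gluegen-specific): System.loadLibrary(name) searches java.library.path for the platform-specific file name produced by System.mapLibraryName(name), so the hyphen vs. underscore difference simply yields two different file names. As far as I know, nothing in the JVM itself treats a .dll whose name matches the .jar specially, which is why this sketch only illustrates the name mapping, not the root cause:

```java
public class LibNameDemo {
    public static void main(String[] args) {
        // System.loadLibrary("gluegen-rt") looks for the file name returned by
        // mapLibraryName: "gluegen-rt.dll" on Windows, "libgluegen-rt.so" on
        // Linux, "libgluegen-rt.dylib" on macOS.
        System.out.println(System.mapLibraryName("gluegen-rt"));
        System.out.println(System.mapLibraryName("gluegen_rt"));
    }
}
```

So renaming the file back to gluegen-rt.dll only helps if something (cached paths, a stale copy of the old .dll on the search path, or JOGL's own library-loading logic) resolves the two names differently at runtime.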
I do not think it is JEP 223 related since I had the problem even when I removed JVM_JNI8.c, which I believe should prevent the static linking.
Is it still reproducible with Java 17?
I browsed through the GLContext* changes; there is nothing much to see there.
You also claim in comment 7 - comment 10
that it seems to be a 'weird side effect' of some sort?
Glad we have a long test history with our CI;
I do not see a performance regression here.
Can you reproduce this with Java 17?
Please reopen for version 2.5.0 if the issue persists.