Summary: | OpenGL initialization gives fatal error on some Linux distributions | ||
---|---|---|---|
Product: | [JogAmp] Jogl | Reporter: | Mabula Haverkamp <mabula> |
Component: | opengl | Assignee: | Sven Gothel <sgothel> |
Status: | RESOLVED FIXED | ||
Severity: | critical | CC: | gouessej, mabula |
Priority: | P4 | ||
Version: | 2.4.0 | ||
Hardware: | pc_x86_64 | ||
OS: | linux | ||
Type: | DEFECT | SCM Refs: | |
Workaround: | --- | ||
Attachments: |
debug log, ATI FireGL, Mesa 20.0.7
debug_logs_dariusz.krzempek - Turks [Radeon HD 6500/6600 / 6700M Series]
debug_logs_Minusman AMD REDWOOD
debug_logs_dariusz.krzempek-20200602 Turks [Radeon HD 6500/6600 / 6700M Series]
final lines of running application with -Djogl.debug=all |
Description
Mabula Haverkamp
2020-03-02 18:21:14 CET
Which version of JOGL/JogAmp are you using? If you haven't tried, please test with latest, i.e. jogamp-next or 2.4.0-20200202

Hi Sven, I tried both of the 2 latest versions:
- JOGL-2.4-rc-20200115
- JOGL-2.4-rc-20200202
In both cases, I used the fat jar. They both produce the bug identically. Kind regards, Mabula

It looks like the symptom of the bug 1357: https://jogamp.org/bugzilla/show_bug.cgi?id=1357 Maybe this profile is not really compliant; if it's true, this change will have to be modified: https://jogamp.org/bugzilla//show_bug.cgi?id=1385 I'm under Mageia Linux with Mesa 20 and an Intel chip, I can't reproduce this bug (expected behaviour). If you can, just patch GLContextImpl with JByteMod as I did in February 2018.

I see that a special treatment is needed for AMD TURKS and AMD REDWOOD.

(In reply to Mabula Haverkamp from comment #2) Please can you try to call this before calling GLProfile.getMaxProgrammable(true)?
GLContext.getCurrent().getRendererQuirks().addQuirk(GLRendererQuirks.GL3CompatNonCompliant);

(In reply to Julien Gouesse from comment #4) Hi Julien, I guess I should try this code addition and let one of my customers with the issue test if that fixes it on his/her system? Mabula

(In reply to Mabula Haverkamp from comment #5) Yes, exactly. If it "fixes" the problem, I'll suggest to revert the fix of the bug 1385 until Mesa provides a really compliant GL3 implementation. Otherwise, I'll have to find another solution, which will be challenging without any hardware to reproduce your bug.

(In reply to Mabula Haverkamp from comment #5) The debug logs would help me a lot, please use this when running your program:
-Dnewt.debug=all -Dnativewindow.debug=all -Djogl.debug=all
GLContextImpl.createContextARBMapVersionsAvailable() should reject the non compliant GL3bc implementation with my suggestion (if it's called early enough). I just hope that I really understand the symptom.

(In reply to Mabula Haverkamp from comment #5) Please, sorry to insist, but this bug is marked "critical" and I can't reproduce it on my hardware. Post the debug logs as soon as possible to give me a chance to check my assumptions.

(In reply to Julien Gouesse from comment #9) Hi Julien, Apologies for not replying sooner. I will try to get you feedback from my customers as soon as possible. So if I understand correctly, I need to build my application with the following so we can test it and get you the right feedback. When initializing OpenGL I will call:
GLContext.getCurrent().getRendererQuirks().addQuirk(GLRendererQuirks.GL3CompatNonCompliant);
GLContextImpl.createContextARBMapVersionsAvailable()
GLProfile.getMaxProgrammable(true)
Is that correct? And my Java application should be started with -Dnewt.debug=all -Dnativewindow.debug=all -Djogl.debug=all, right?
I have one more question, which might be related or not. Would it help if I use this?
System.setProperty("jogl.disable.openglcore", "true");
In the current version of my application I have removed this setting, and now I am getting reports of strange OpenGL3 behaviour also on Windows 10 machines with Intel HD3000 chipsets (where Intel only provides support until Windows 7, so Microsoft is probably providing buggy drivers there?). I guess disabling OpenGL Core profiles would increase compatibility for more computer configurations, right? My OpenGL implementations don't need it either. I have implementations for GL2, GL3 and GL4, so choosing a profile that uses one of those is possible on any machine. Thanks, Mabula

(In reply to Julien Gouesse from comment #8) Hi Julien, GLContextImpl.createContextARBMapVersionsAvailable() does not seem to be available with JOGL-2.4-rc-20200202? Maybe I am doing something wrong, when should this be called? Also before the OpenGL initialization: GLProfile.getMaxProgrammable(true) Mabula

Hello jogl.disable.openglcore doesn't help because the affected profiles are backward compatible.
Don't call GLContextImpl.createContextARBMapVersionsAvailable(), just call
GLContext.getCurrent().getRendererQuirks().addQuirk(GLRendererQuirks.GL3CompatNonCompliant);
before getting a profile. Yes, use -Dnewt.debug=all -Dnativewindow.debug=all -Djogl.debug=all. Please use a more recent release candidate of JOGL 2.4.0: https://jogamp.org/deployment/v2.4.0-rc-20200307/ It would be nice to get the debug logs with and without my workaround to understand better what was wrong.

(In reply to Julien Gouesse from comment #13) Thank you very much Julien, I will use the more recent JOGL version and will ask my customers to test with and without the proposed
GLContext.getCurrent().getRendererQuirks().addQuirk(GLRendererQuirks.GL3CompatNonCompliant);
before getting a profile, and will ask them to send back all debugging output. Hopefully I can get the results to you within the coming week, thanks. Mabula

(In reply to Julien Gouesse from comment #13) Hi Julien, I have downloaded the new JOGL-2.4-rc-20200307. I have a problem though: I can't call
GLContext.getCurrent().getRendererQuirks().addQuirk(GLRendererQuirks.GL3CompatNonCompliant);
before GLProfile.getMaxProgrammable(true) at application startup (where I try to initialize my application to use either GL2, GL3 or GL4), because it will give a NullPointerException. My guess is that this happens because there is no GLContext yet? Should I somehow first create a context to add the quirk? How to do this? Mabula

(In reply to Mabula Haverkamp from comment #15) Please give me the logs of your application without using my suggestion first, it would be better than nothing. Don't use my suggestion now, I'll provide a better solution after getting your logs.

Hello I've realized that my terribly old laptop has an ATI FireGL GPU, I'm going to try to reproduce this bug.

This bug isn't reproducible on my old laptop. I entered this on the command line:
jdk-14.0.1+7-jre/bin/java -Dnewt.debug=all -Dnativewindow.debug=all -Djogl.debug=all -Djogamp.debug=all -jar jogamp-fat.jar 2> debug_logs_ati_firegl____14.txt
I use Mesa 20.0.7. I'll add the log file as an attachment. I ran T.U.E.R too, it worked like a charm. Mabula, can you ask your customer to run the same command (more or less)? Obviously, you can use another version of Java.

Created attachment 845 [details]
debug log, ATI FireGL, Mesa 20.0.7
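
For anyone who prefers to launch the same diagnostic run from code rather than from a shell, the command quoted above can be sketched with `ProcessBuilder`. This is only an illustration: the class name `JoglDebugRun`, the default log file name and the argument handling are assumptions, while the `-D` flags and the `jogamp-fat.jar` name are taken from the command in this thread. The process is started only if the jar file actually exists.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of JOGL: rebuilds the debug command quoted in the thread. */
final class JoglDebugRun {

    /** Returns: javaBin, the four debug flags from the thread, -jar, fatJar. */
    static List<String> buildCommand(String javaBin, String fatJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Dnewt.debug=all");
        cmd.add("-Dnativewindow.debug=all");
        cmd.add("-Djogl.debug=all");
        cmd.add("-Djogamp.debug=all");
        cmd.add("-jar");
        cmd.add(fatJar);
        return cmd;
    }

    /** Prepares the process; stderr goes to logFile, like "2> logFile" in a shell. */
    static ProcessBuilder prepare(String javaBin, String fatJar, File logFile) {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(javaBin, fatJar));
        pb.redirectError(logFile);
        pb.redirectOutput(ProcessBuilder.Redirect.INHERIT);
        return pb;
    }

    public static void main(String[] args) throws Exception {
        String javaBin = args.length > 0 ? args[0] : "java";
        String fatJar = args.length > 1 ? args[1] : "jogamp-fat.jar";
        File log = new File(args.length > 2 ? args[2] : "debug_logs.txt"); // assumed default name
        ProcessBuilder pb = prepare(javaBin, fatJar, log);
        System.out.println(String.join(" ", pb.command()) + " 2> " + log);
        if (new File(fatJar).isFile()) {
            // Only launch when the jar is really there; otherwise just show the command.
            int exit = pb.start().waitFor();
            System.out.println("exit code: " + exit);
        }
    }
}
```

Redirecting only stderr mirrors the `2>` in the original command; whether a given setup also needs stdout captured is left open here.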

(In reply to Julien Gouesse from comment #18) Dear Julien, I have given my 2 customers with this issue the instructions to perform the debug command on the fat jar. They usually respond fast, so hopefully I can send their debug log to you today or tomorrow ;-)
https://www.astropixelprocessor.com/community/linux/app-1-077-not-run-on-kubuntu-18-04-lts/paged/2/#post-10736
Thanks, Mabula

Created attachment 846 [details]
debug_logs_dariusz.krzempek - Turks [Radeon HD 6500/6600 / 6700M Series]

(In reply to Mabula Haverkamp from comment #21) First debug log has arrived ;-) Julien

(In reply to Mabula Haverkamp from comment #22) Great job, thank you so much. I'll look at your log file carefully and I'll give you some feedback as soon as possible :D

Created attachment 847 [details]
debug_logs_Minusman AMD REDWOOD

Hi Julien, I have attached the debug log of my second customer: debug_logs_Minusman.txt. Hope this is useful? Mabula

(In reply to Mabula Haverkamp from comment #25) Yes, it's useful. This bug still occurs on Mesa 20.0.4. I'm still looking at the logs to understand how to improve the management of non compliant implementations in this case.

I get this exception when running a modified fat JAR made with CafeBabe editor:
Exception in thread "main" java.lang.VerifyError: Bad type on operand stack
I'll find another solution to provide a patched fat JAR.

(In reply to Mabula Haverkamp from comment #25) Please find a patched fat JAR here: http://svn.code.sf.net/p/tuer/code/pre_beta/lib/jogamp/jogamp-fat.jar It's JOGL 2.4 RC without the fix of the bug 1385, i.e. it considers that the GL3 backward compatible profile isn't compliant in your case. Please let me know as soon as possible whether it fixes your bug. If it does, I'll suggest to
- revert the fix of the bug 1385 to fix the bug 1426 in JOGL 2.4
- implement a smarter mechanism to detect non compliant GL implementations earlier in JOGL 2.5
- commit anew the original fix of the bug 1385 in JOGL 2.5
Sven, what's your opinion about that?

(In reply to Julien Gouesse from comment #28) Dear Julien, Thank you very much. Should I ask my 2 customers to again perform the
java -Dnewt.debug=all -Dnativewindow.debug=all -Djogl.debug=all -Djogamp.debug=all -jar jogamp-fat.jar 2> debug_logs_ati_firegl____14.txt
test with this patched jar? Will that be enough to conclude that it is fixed? Mabula

(In reply to Mabula Haverkamp from comment #29) Yes. If my "fix" works, there won't be any internal error.

(In reply to Julien Gouesse from comment #30) Hi Julien, Great, I have instructed my customers to perform the task on the newly patched jar file. Hopefully, I will get their new debugging logs quickly.
I will attach them as soon as I receive them ;-) Mabula

(In reply to Julien Gouesse from comment #30) Dear Julien, Here is the first debug log from Dariusz Krzempek - Turks [Radeon HD 6500/6600 / 6700M Series]. I could not attach this log here, because it is larger than 2000 KB. So you can download it here: https://apastropixelprocessordl.s3.eu-central-1.amazonaws.com/debug_logs_dariusz.krzempek-on-patched-Jar-20200531.txt Mabula

(In reply to Mabula Haverkamp from comment #32) It seems to work :) I'm currently trying to find a better fix but this one will be a good fallback in case I find nothing smarter.

(In reply to Julien Gouesse from comment #33) That is great, hopefully the other debug log will confirm the same ;-) I will share it as soon as I receive it. Mabula

GLProfile.computeProfileImpl(device, "GL3bc", false, false, isHardwareRasterizer) returns null (which is correct) but GLProfile.computeProfileImpl(device, "GL2", false, false, isHardwareRasterizer) returns "GL3bc" (which is incorrect); if the behaviour of this method were consistent, this bug would be fixed. Mabula, I'll give you another patched version soon because I need to add some log messages into GLProfile. Thank you for your precious help.

(In reply to Julien Gouesse from comment #35) Hi Julien, I have the second debug log for the patched jar for you here ;-) : https://apastropixelprocessordl.s3.eu-central-1.amazonaws.com/debug_logs_Minusman-on-patched-Jar-20200531.txt Cheers, Mabula

(In reply to Julien Gouesse from comment #35) "Mabula, I'll give you another patched version soon because I need to add some log messages into GLProfile. Thank you for your precious help." Okay, no problem ;-) You are most welcome Julien, I will be waiting for it...
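
The inconsistency described above (a "GL3bc" request yields null while a "GL2" request yields "GL3bc") can be modelled in a few lines. This is a stand-in sketch, not JOGL code: the class `ProfileSelectionSketch` and its methods are hypothetical, and the two inputs stand for what the thread attributes to GLContext.getAvailableGLProfileName(device, 2, CTX_PROFILE_COMPAT) and GLContext.isGL3bcAvailable(...). The guarded variant shows one way the two answers could be made to agree, under the assumption that falling back to "GL2" is the desired outcome.

```java
/** Stand-in model of the profile lookup discussed in the thread; none of this is JOGL API. */
final class ProfileSelectionSketch {
    static final String GL2 = "GL2";
    static final String GL3BC = "GL3bc";

    /** A "GL3bc" request: answered only when GL3bc is considered available. */
    static String resolveGl3bc(boolean gl3bcAvailable) {
        return gl3bcAvailable ? GL3BC : null;
    }

    /** A "GL2" request as described in the thread: returns whatever name is mapped. */
    static String resolveGl2Unguarded(String mappedCompatName) {
        return mappedCompatName;
    }

    /** Guarded variant: never hand out GL3bc when GL3bc itself is rejected. */
    static String resolveGl2Guarded(String mappedCompatName, boolean gl3bcAvailable) {
        if (GL3BC.equals(mappedCompatName) && !gl3bcAvailable) {
            return GL2;
        }
        return mappedCompatName;
    }

    public static void main(String[] args) {
        // Affected setup from the thread: GL 2 compat is mapped to GL3bc, GL3bc is unavailable.
        String mapped = GL3BC;
        boolean gl3bcAvailable = false;
        System.out.println("GL3bc request          -> " + resolveGl3bc(gl3bcAvailable));
        System.out.println("GL2 request, unguarded -> " + resolveGl2Unguarded(mapped));
        System.out.println("GL2 request, guarded   -> " + resolveGl2Guarded(mapped, gl3bcAvailable));
    }
}
```

On a setup where GL3bc is available, the guarded and unguarded variants return the same name, so the guard only changes behaviour in the inconsistent case.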

(In reply to Mabula Haverkamp from comment #37) Please find a patched fat JAR here: http://svn.code.sf.net/p/tuer/code/pre_beta/lib/jogamp/jogamp-fat.jar It's JOGL 2.4 RC with some new log messages to understand what is wrong with GLProfile.computeProfileImpl(device, "GL2", false, false, isHardwareRasterizer). It should produce the fatal error.

(In reply to Julien Gouesse from comment #38) Hi Julien, Thanks, I have instructed my customers to run the debug log again with this newly patched jar file. I will share the logs as soon as I receive them. Mabula

Created attachment 848 [details]
debug_logs_dariusz.krzempek-20200602 Turks [Radeon HD 6500/6600 / 6700M Series]

(In reply to Julien Gouesse from comment #38) Dear Julien, Please find attached the new debug log from Dariusz using the jar with new log messages: debug_logs_dariusz.krzempek-20200602 Turks [Radeon HD 6500/6600 / 6700M Series] Cheers, Mabula

(In reply to Mabula Haverkamp from comment #41) This line explains why computeProfileImpl returns GL3bc when GL2 is passed:
main-SharedResourceRunner: createContextARB-MapGLVersions MAP EGLGraphicsDevice[type .egl, v1.5.0, connection :0, unitID 0, handle 0x7f24c4001830, owner true, ResourceToolkitLock[obj 0x17baae6e, isOwner true, <69379752, 27fe3806>[count 2, qsz 0, owner <main-SharedResourceRunner>]]]: 2 (Compat profile, compat[], hardware) -> 3.1 (Compat profile, arb, compat[ES2, ES3], FBO, hardware)
GLContext.isGL2Available() is consistent with GLContext.getAvailableGLProfileName() and both use the entry above:
main: getAvailableGLVersion EGLGraphicsDevice[type .egl, v1.5.0, connection :0, unitID 0, handle 0x7f24c4001830, owner true, ResourceToolkitLock[obj 0x17baae6e, isOwner false, <69379752, 27fe3806>[count 0, qsz 0, owner <NULL>]]] 2 2 : 50416643
main: getAvailableGLVersion EGLGraphicsDevice[type .egl, v1.5.0, connection :0, unitID 0, handle 0x7f24c4001830, owner true, ResourceToolkitLock[obj 0x17baae6e, isOwner false, <69379752, 27fe3806>[count 0, qsz 0, owner <NULL>]]] 2 2 : 50416643
main: getAvailableGLProfileName EGLGraphicsDevice[type .egl, v1.5.0, connection :0, unitID 0, handle 0x7f24c4001830, owner true, ResourceToolkitLock[obj 0x17baae6e, isOwner false, <69379752, 27fe3806>[count 0, qsz 0, owner <NULL>]]] 2 2 : GL3bc
I have to return GL2 when GLContext.isGL3bcAvailable() returns false.

(In reply to Mabula Haverkamp from comment #41) Please find a patched fat JAR here: http://svn.code.sf.net/p/tuer/code/pre_beta/lib/jogamp/jogamp-fat.jar It's JOGL 2.4 RC with another (potential) fix; it tries to detect the non conformant profile late, in computeProfileImpl().
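
As a side note on the log lines above, the value 50416643 printed by getAvailableGLVersion can be unpacked by hand. The sketch below assumes a layout of major version in the top byte, minor version in the next byte and context flags in the low 16 bits; that layout is an assumption, although the result (3.1) agrees with the "-> 3.1" mapping in the MAP line. The class name `GLVersionBitsSketch` is hypothetical, and the meaning of the individual flag bits is deliberately not decoded.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical decoder for the packed value seen in the log; the bit layout is an assumption. */
final class GLVersionBitsSketch {

    static int major(int bits) { return (bits >>> 24) & 0xFF; }

    static int minor(int bits) { return (bits >>> 16) & 0xFF; }

    /** Low 16 bits, assumed to carry the context flags. */
    static int flags(int bits) { return bits & 0xFFFF; }

    /** Indices of the bits that are set in the flag word. */
    static List<Integer> setBits(int flags) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < 16; i++) {
            if ((flags & (1 << i)) != 0) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int bits = 50416643; // value from the getAvailableGLVersion log lines
        System.out.printf("major=%d minor=%d flags=0x%04x%n", major(bits), minor(bits), flags(bits));
        System.out.println("set flag bits: " + setBits(flags(bits)));
        // prints: major=3 minor=1 flags=0x4c03
        // prints: set flag bits: [0, 1, 10, 11, 14]
    }
}
```

Five flag bits are set, which matches the five properties listed after "3.1" in the MAP line (Compat profile, arb, ES2, ES3, FBO), but which bit corresponds to which property is not something this sketch establishes.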
If it fails, I'll have to design a more elaborate solution to detect this case earlier.

(In reply to Julien Gouesse from comment #43) Great, I have sent this new jar to my customers and I am waiting again for their new debug logs. Will share them as soon as I receive them ;-) Mabula

(In reply to Julien Gouesse from comment #43) Dear Julien, I have 2 new debug logs for you from both customers using the latest patched jar file:
https://apastropixelprocessordl.s3.eu-central-1.amazonaws.com/debug_logs_dariusz.krzempek-20200603.txt
https://apastropixelprocessordl.s3.eu-central-1.amazonaws.com/debug_logs_minusman_20200603.txt
Will be waiting for your findings ;-) Mabula

(In reply to Mabula Haverkamp from comment #45) Thank you very much. My fix seems to work. I didn't reproduce this bug on my hardware because it supports only OpenGL 3.0 whereas your customers' hardware supports OpenGL 3.1. I post my patch here, I'll wait for Sven's feedback, I don't want to introduce a side effect:
diff --git a/src/jogl/classes/com/jogamp/opengl/GLProfile.java b/src/jogl/classes/com/jogamp/opengl/GLProfile.java
index 8612fc73f..9695ef058 100644
--- a/src/jogl/classes/com/jogamp/opengl/GLProfile.java
+++ b/src/jogl/classes/com/jogamp/opengl/GLProfile.java
@@ -2293,7 +2293,7 @@ public class GLProfile {
 } else if(GL3 == profile && hasAnyGL234Impl && ( desktopCtxUndef || GLContext.isGL3Available(device, isHardwareRasterizer))) {
 return desktopCtxUndef ? GL3 : GLContext.getAvailableGLProfileName(device, 3, GLContext.CTX_PROFILE_CORE);
 } else if(GL2 == profile && hasAnyGL234Impl && ( desktopCtxUndef || GLContext.isGL2Available(device, isHardwareRasterizer))) {
- return desktopCtxUndef ? GL2 : GLContext.getAvailableGLProfileName(device, 2, GLContext.CTX_PROFILE_COMPAT);
+ return desktopCtxUndef || (GL3bc == GLContext.getAvailableGLProfileName(device, 2, GLContext.CTX_PROFILE_COMPAT) && !GLContext.isGL3bcAvailable(device, isHardwareRasterizer)) ? GL2 : GLContext.getAvailableGLProfileName(device, 2, GLContext.CTX_PROFILE_COMPAT);
 } else if(GLES3 == profile && hasGLES3Impl && ( esCtxUndef || GLContext.isGLES3Available(device, isHardwareRasterizer))) {
 return esCtxUndef ? GLES3 : GLContext.getAvailableGLProfileName(device, 3, GLContext.CTX_PROFILE_ES);
 } else if(GLES2 == profile && hasGLES3Impl && ( esCtxUndef || GLContext.isGLES2Available(device, isHardwareRasterizer))) {
Sven, if you'd like me to provide a more robust solution, let me know.

(In reply to Julien Gouesse from comment #46) Excellent Julien, Would it help in the meantime to let my customers try my application with this latest jar file included, to see if the application now starts and if OpenGL works on their system? I guess it would? Mabula

(In reply to Mabula Haverkamp from comment #47) Yes, it would help, but it would be safer if I added a hint somewhere to indicate that it's a modified unofficial build. What's your opinion about that?

(In reply to Julien Gouesse from comment #48) I would only release my application with this latest jar as a Linux Beta version. That would be fine for me. If you feel/think that you need to add a hint in JOGL before I release that beta, then I will wait for a jar with such a hint. Otherwise, I can build the Linux Beta today and send it to my 2 customers for further testing and let you know their findings. Let me know what you want to do, Mabula

(In reply to Mabula Haverkamp from comment #49) I've just updated the JAR with a modified manifest attribute, it's ok now. Thank you for the feedback.

(In reply to Julien Gouesse from comment #50) Thanks Julien, I have re-downloaded the jar, built a beta Linux version of Astro Pixel Processor with it, and sent it to my customers to ask if APP starts now and, if so, whether the OpenGL image viewer is working correctly now.
Will let you know what they report as soon as possible ;-) Mabula

(In reply to Julien Gouesse from comment #50) Hi Julien, My first customer who has tested it is confirming that all is okay now on his system. This is Dariusz with the Turks chipset. My application starts normally without a fatal error and OpenGL 3 is working in my image viewer window: https://www.astropixelprocessor.com/community/linux/app-1-077-not-run-on-kubuntu-18-04-lts/paged/3/#post-10955 This is excellent news :-) ! Mabula

(In reply to Julien Gouesse from comment #50) :-) The second customer with the REDWOOD chipset also confirms that all is okay now :-) So my application is now starting with OpenGL enabled and working perfectly on both systems. The image viewer that I developed is working perfectly according to the 2 users, so all must be okay, I would think! Congratulations Julien and above all, thank you very much! Please let me know when the next 2.4 RC is available with the fix, so I can include it in my next stable release. Mabula

Hello You're welcome. Don't expect a release before at least 3 weeks.

(In reply to Julien Gouesse from comment #54) Hi Julien, Just a small question, but important for me and my customers: Will it be okay to release a new version of my application with the latest jar (with a modified manifest attribute), or should I really wait for the official build? Thanks again, Mabula

(In reply to Mabula Haverkamp from comment #55) You can use the patched fat JAR if you want. It's not the cleanest solution but if you can't wait, just do it.

(In reply to Julien Gouesse from comment #56) Hi Julien, I have run into a problem with the latest patched jar on MacOS. I have released a beta version of my application with that jar for Windows, MacOS and Linux; on Linux and Windows all is working well, no problems have been reported. On my MacBook with MacOS Sierra 10.12.16, the application will not start now with that jar.
It does not give a fatal crash, but the OpenGL initialization seems to get stuck in some loop. The application startup stalls and my application is frozen, and thus won't start. Of course, I then used the old jar of JOGL-2.4-rc-20200307 and then all is okay on MacOS, so I included that jar in the macOS beta release for my customers. Below is a URL to a debug report from my MacBook for you to analyse; perhaps the new fix should not be applied on MacOS? https://apastropixelprocessordl.s3.eu-central-1.amazonaws.com/debug_logs_MacOS-10.12.16.txt Thanks, Mabula

(In reply to Mabula Haverkamp from comment #57) Thank you for the feedback. It was expected to work under OS X too. Sven will have a look at my fix. Do you reproduce this bug by running the usual test or does it occur only in your software?

Hi Julien, I have tried to find the exact point where it goes wrong in my application. I have run my application with -Djogl.debug=all and I have included the final output, which seems to indicate where everything stalls. Everything is waiting for the GUI to pop up, which will not happen:
MaxOSXCGLContext.NSOpenGLImpl.associateDrawable: true, ctx 0x7fc571b27fb0, hasBackingLayerHost false, attachGLLayerCmd null
MaxOSXCGLContext.NSOpenGLImpl.contextMadeCurrent: Cure missing CGLContextLock (Bug1398), surfaceLock <46f76027, 17115770>[count 1, qsz 0, owner <AWT-EventQueue-0>]
MaxOSXCGLContext.NSOpenGLImpl.contextMadeCurrent: Wait for SetNSViewCmd (Bug1398), surfaceLock <46f76027, 17115770>[count 0, qsz 0, owner <NULL>]
I see a reference in there to Bug1398 and "Wait for SetNSViewCmd". Maybe this will explain it? I will include more output in the attachment: Application-no-GUI-start.txt Mabula

Created attachment 849 [details]
final lines of running application with -Djogl.debug=all

Please test the latest release candidate and confirm (or not) it solves your problem.

(In reply to Julien Gouesse from comment #61) Hi Julien, Please accept my apologies for this very late reply on this issue. I have been using the jogl-2.4-RC-20210111 version in my application distribution for quite a while now and the Linux users that were having issues have confirmed all is okay now ;-) and I have had no other similar bug reports since on Linux. I think several hundred Linux users use my application on a daily basis, so all seems to be good! You can set this issue to solved, I think ;-) Mabula

(In reply to Mabula Haverkamp from comment #62) Better late than never :) Thank you for the feedback.

It should be fixed by the commit edf0d33ba913fd37f9e6ce0a771d8dfb6fa962e6

See comment 65

Dear Sven & Julien, Thank you very much. Today, I have added the 2.4.0 build release 20230201 to my project. I was using the build from jzy3d for macOS arm support, but it clearly did not contain this fix, because the bug resurfaced for my Linux users. I will let you know if all is okay now with this build for my Linux users. Mabula |