Bug 1062

Summary: GNU/Linux armhf / i.mx6 (Vivante GC2000 blob) test crash in libGL's glGetString at EGL/ES probing/init
Product: [JogAmp] Jogl
Reporter: Ilya Averyanov <i.averyanov>
Component: embedded
Assignee: Sven Gothel <sgothel>
Status: IN_PROGRESS ---
Severity: major
CC: cedric.koch-hofer, gouessej, i.averyanov, vladanjovanovic, xerxes
Priority: P1
Version: tbd
Hardware: embedded_arm
OS: linux
URL: http://forum.jogamp.org/i-mx6-crash-on-glGetString-td4033064.html
Type: DEFECT
SCM Refs: jogl 1e4bfc26e2f220e046f42f7d26c05e4971bc509d
          gluegen df51fbae035f266869963116bf83f2ab45ae6fec
Workaround: TRUE
Attachments: Output log
crash log
Output log
crash log

Description Ilya Averyanov 2014-09-05 15:38:50 CEST
Created attachment 630 [details]
Output log

Attached are the output log of etc/test_dbg.sh
and the crash dump log.
Comment 1 Ilya Averyanov 2014-09-05 15:39:27 CEST
Created attachment 631 [details]
crash log
Comment 2 Ilya Averyanov 2014-09-05 17:11:51 CEST
Created attachment 632 [details]
Output log
Comment 3 Ilya Averyanov 2014-09-05 17:12:13 CEST
Created attachment 633 [details]
crash log
Comment 4 Ilya Averyanov 2014-09-08 18:40:43 CEST
jogl make init bu using
Comment 5 Ilya Averyanov 2014-09-08 18:43:20 CEST
JOGL performs its init using functions from libEGL.so, but calls glGetString from libGL.so.
The glGetString that belongs to libEGL.so is located in libGLESv2.so.
Comment 6 Sven Gothel 2014-10-09 07:15:26 CEST
(In reply to comment #5)
> JOGL performs its init using functions from libEGL.so, but calls
> glGetString from libGL.so.
> The glGetString that belongs to libEGL.so is located in libGLESv2.so.

AFAIK, we use accurate function pointer handles 
acquired at ProcAddress initialization and pass them 
to the JNI library, which itself invokes the proper native function.

- main: GLContext EGL ProcAddressTable mapping key(EGL-.egl_decon_0) 
        -> 0x1c319b9
- Lookup-Tool: <glGetString> 0x5f3ed438

C  [libGL.so.1+0x36444]  glGetString+0xc

j  jogamp.opengl.GLContextImpl.glGetStringInt(IJ)Ljava/lang/String;+0
j  jogamp.opengl.GLContextImpl.initGLRendererAndGLVersionStrings()Z+46
j  jogamp.opengl.GLContextImpl.setGLFunctionAvailability(ZIIIZZ)Z+170
j  jogamp.opengl.egl.EGLContext.createImpl(J)Z+547
j  jogamp.opengl.GLContextImpl.makeCurrentWithinLock(I)I+200
j  jogamp.opengl.GLContextImpl.makeCurrent(Z)I+457
j  jogamp.opengl.GLContextImpl.makeCurrent()I+2
j  jogamp.opengl.egl.EGLDrawableFactory.mapAvailableEGLESConfig(Ljavax/media..

(^^^ from your logs)

Above shows that the native glGetString(..) call
indeed ends up in the libGL.so native library.
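
The check above (reading the library name out of the native frame) can be sketched in code. This is only an illustration, not part of JOGL or GlueGen: the class `NativeFrame` is hypothetical, and the line format is assumed from the single frame quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of JOGL): parses one native frame line of a crash log. */
final class NativeFrame {
    // Assumed format, taken from the frame quoted above:
    //   C  [libGL.so.1+0x36444]  glGetString+0xc
    private static final Pattern FRAME = Pattern.compile(
            "^C\\s+\\[(\\S+)\\+0x([0-9a-fA-F]+)\\]\\s+(\\S+)\\+0x([0-9a-fA-F]+)$");

    final String library;     // library that contains the crashing instruction
    final long libraryOffset; // offset of that instruction within the library
    final String symbol;      // nearest symbol reported for the frame

    private NativeFrame(String library, long libraryOffset, String symbol) {
        this.library = library;
        this.libraryOffset = libraryOffset;
        this.symbol = symbol;
    }

    /** Returns the parsed frame, or null if the line is not a native frame with a symbol. */
    static NativeFrame parse(String line) {
        Matcher m = FRAME.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new NativeFrame(m.group(1), Long.parseLong(m.group(2), 16), m.group(3));
    }

    /** True if the library file name starts with the given prefix, e.g. "libGL.so". */
    boolean isIn(String libraryPrefix) {
        return library.startsWith(libraryPrefix);
    }
}
```

With the frame from the log, `NativeFrame.parse(line).isIn("libGL.so")` distinguishes the desktop library from libGLESv2.so, since the latter does not start with that prefix.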

GLContext.initGLRendererAndGLVersionStrings(..) is called from EGL:
    final GLDynamicLookupHelper glDynLookupHelper =
                  getDrawableImpl().getGLDynamicLookupHelper();
    final long _glGetString = 
                  glDynLookupHelper.dynamicLookupFunction("glGetString");

The GLDynamicLookupHelper comes from an EGLDrawable,
i.e. EGLES2DynamicLibraryBundleInfo or EGLES1DynamicLibraryBundleInfo.

Both provide the toolkit-lookup function eglGetProcAddress 
which is used.

Hence I assume that the glGetString function pointer 
as returned by eglGetProcAddress is erroneous.

Maybe you can add instrumentation (printf)
showing where the actual pointer comes from,
even though I guess the logs show that clearly.
But you may want to prove that eglGetProcAddress
returns the wrong address -> propagate the issue to the driver vendor.
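
One way to see which library a logged function pointer belongs to is to match the address against the memory map of the process. The following is a minimal sketch under assumptions, not JOGL or GlueGen code: the class `MappedLibraryLookup` is hypothetical, and it expects lines in the Linux /proc/<pid>/maps format that were captured from the affected process.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper (not part of JOGL or GlueGen): maps an address to its backing file. */
final class MappedLibraryLookup {

    /**
     * Returns the path column of the maps line whose address range contains
     * the given address, or null if no file-backed line matches.
     */
    static String findMapping(List<String> mapsLines, long address) {
        for (String line : mapsLines) {
            // Expected layout: start-end perms offset dev inode path
            String[] fields = line.trim().split("\\s+", 6);
            if (fields.length < 6) {
                continue; // anonymous mapping or unrelated line
            }
            String[] range = fields[0].split("-");
            if (range.length != 2) {
                continue;
            }
            try {
                long start = Long.parseUnsignedLong(range[0], 16);
                long end = Long.parseUnsignedLong(range[1], 16);
                // Unsigned comparison keeps this correct for 64-bit addresses as well.
                if (Long.compareUnsigned(address, start) >= 0
                        && Long.compareUnsigned(address, end) < 0) {
                    return fields[5];
                }
            } catch (NumberFormatException e) {
                // Not a maps line, skip it.
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java MappedLibraryLookup <maps file> <hex address>");
            return;
        }
        long address = Long.parseUnsignedLong(args[1].replaceFirst("^0[xX]", ""), 16);
        System.out.println(findMapping(Files.readAllLines(Paths.get(args[0])), address));
    }
}
```

For example, the address printed by the lookup tool (0x5f3ed438 in the log above) could be passed together with a saved copy of the maps of that process; the returned path then names the library that actually holds the symbol.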

Can you please add information about the driver vendor
and the actual hardware in use?

Also, please note: Desktop GL + EGL/[ES1 + ES2 + ES3] 
are working perfectly together when using correct drivers.
Examples:
  - Mesa GL2, GL3, ES2/ES3 (w/ Software, AMD, Intel)
  - NVidia GL2, GL3, ES2/ES3
  - ..

Thank you for your help - we need to prove that this is 
a vendor issue (upstream) or fix the bug in JOGL ofc.
Comment 7 Vladan 2015-01-20 14:06:22 CET
Reproduced the error on i.MX 6 SABRE SDB development board. Using Debian jessie armhf and Linux kernel from Yocto Daisy release for i.MX 6 SABRE SDB. jogl2 and gluegen2 come from Debian. Using openjdk-7-jre and openjdk-7-jdk.

Verified with a small C program that eglGetProcAddress returns the correct address for glGetString and that there's no issue with calling glGetString via this returned pointer.

I'll try compiling JOGL and GlueGen from Git next. Any pointers on how to debug the problem are very welcome.
Comment 8 Vladan 2015-01-27 09:02:12 CET
It looks like eglCreateContext() is not called prior to calling glGetString(), which is an API requirement for OpenGL ES.

How could this be fixed?
Comment 9 Sven Gothel 2015-01-29 01:01:41 CET
(In reply to comment #8)
> It looks like eglCreateContext() is not called prior to calling
> glGetString(), which is an API requirement for OpenGL ES.
> 
> How could this be fixed?

I strongly doubt that ..

GLContextImpl.setGLFunctionAvailability(..) is only 
called after context creation _and_ making it current, of course.
Yes, eglCreateContext(..) in case of EGL/ES.

If - for some sloppy buggy reason - this is not the case,
pls show me the slippery hole. But I strongly doubt it.
Comment 10 Sven Gothel 2015-01-29 01:02:15 CET
(In reply to comment #7)
> Reproduced the error on i.MX 6 SABRE SDB development board. Using Debian
> jessie armhf and Linux kernel from Yocto Daisy release for i.MX 6 SABRE SDB.
> jogl2 and gluegen2 come from Debian. Using openjdk-7-jre and openjdk-7-jdk.
> 
> Verified by small C program that eglGetProcAddress returns correct address
> for glGetString and that there's no issue with calling glGetString via this
> returned pointer.
> 

great, thank you!
Comment 11 Vladan 2015-02-23 14:27:45 CET
(In reply to comment #9)

> If - for some sloppy buggy reason - this is not the case,
> pls show me the slippery hole. But I strongly doubt it.

Is there a way to force using OpenGL ES API at startup, even though OpenGL libraries are available? Or does JOGL need to be rebuilt from scratch and force usage of OpenGL ES?
Comment 12 Sven Gothel 2015-08-04 15:20:22 CEST
(In reply to comment #11)
> (In reply to comment #9)
> 
> > If - for some sloppy buggy reason - this is not the case,
> > pls show me the slippery hole. But I strongly doubt it.
> 
> Is there a way to force using OpenGL ES API at startup, even though OpenGL
> libraries are available? Or does JOGL need to be rebuilt from scratch and
> force usage of OpenGL ES?

for the time being (debugging) you could simply move 
the desktop libGL out of the way .. (rename)

Well, I'll give it another try as well (reproducing the issue).
Comment 13 Sven Gothel 2015-08-04 15:27:04 CEST
Fix earmarked for 2.3.2 release <https://jogamp.org/wiki/index.php/SW_Tracking_Report_Objectives_for_the_release_2.3.2>
Comment 14 Sven Gothel 2015-08-05 15:15:22 CEST
Another user: <http://forum.jogamp.org/OneTriangleAWT-crashes-with-quot-glGetString-quot-message-td4035015.html>

OpenGL vendor string: Vivante Corporation
OpenGL renderer string: Vivante GC2000
OpenGL version string: 2.1 2.0.1

GC2000 is used in i.MX6 Dual and Quad
according to <https://en.wikipedia.org/wiki/Vivante_Corporation>

"There are no plans on writing a new DRM/KMS driver kernel driver for the Vivante hardware, since Vivante previously put out their Linux kernel component under the GNU General Public License (GPL), instead of maintaining it as a proprietary blob. The free Gallium3D-style device driver etna_viv has surpassed Vivante's own proprietary user-space driver in some benchmarks.[citation needed] It supports Vivante's product line of GC400"

(thx Xerxes)

Hence I assume a Vivante driver blob for 
  - libGL ?
  - libGLES* ?
is being used

and somehow has issues w/ the counterpart, i.e.
  - Vivante libGLES* <-> Mesa libGL

etc ..

+++

Hence, pls try using 'etna_viv' open source driver!
Comment 15 Sven Gothel 2015-08-05 16:39:22 CEST
jogl 1e4bfc26e2f220e046f42f7d26c05e4971bc509d
    Utilize 'GLProfile.disableOpenGLDesktop' 
    for EGLDrawableFactory desktop mapping as well.
    
    Commit 35622a7cef4a28ce7e32bf008ef331d9a0d9e3e2 introduced
    GLProfile.disableOpenGLDesktop,
    as enabled by system property 'jogl.disable.opengldesktop'.
    
    Desktop OpenGL shall also be disabled within EGLDrawableFactory.
    
    Provide verbose DEBUG info for all disabled desktop OpenGL cases.

Hence one shall apply a workaround, i.e. either disable desktop 
or embedded OpenGL via defining system property:
  - jogl.disable.opengldesktop
  - jogl.disable.opengles
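
As an illustration of the workaround above, the property can be defined programmatically. This is a minimal sketch under assumptions: the class `JoglDesktopGLWorkaround` is hypothetical, and it assumes JOGL evaluates the property once during its own initialization, so the call has to happen before any JOGL class is used.

```java
/** Hypothetical helper: defines the workaround property before JOGL starts up. */
final class JoglDesktopGLWorkaround {

    /** Property name as given in the commit message above. */
    static final String DISABLE_DESKTOP_GL = "jogl.disable.opengldesktop";

    /**
     * Defines the property unless it was already defined, e.g. on the command line.
     * Assumption: JOGL reads the property once at initialization, so this must
     * run before the first use of any JOGL class.
     */
    static void disableDesktopGL() {
        if (System.getProperty(DISABLE_DESKTOP_GL) == null) {
            System.setProperty(DISABLE_DESKTOP_GL, "true");
        }
    }

    public static void main(String[] args) {
        disableDesktopGL();
        // The JOGL start-up of a real application would follow here.
        System.out.println(DISABLE_DESKTOP_GL + "=" + System.getProperty(DISABLE_DESKTOP_GL));
    }
}
```

The same effect should be reachable without code changes by passing `-Djogl.disable.opengldesktop=true` to the `java` launcher.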

+++
  
One can also disable desktop OpenGL by just using the 
'jogl.all-mobile.jar'.

+++

Another field to investigate is 
  - DynamicLibraryBundleInfo's shallLinkGlobal() / shallLookupGlobal() 

Note shallLinkGlobal() is 'true' for OpenGL ..

+++

I have revalidated that the glGetString symbol comes from the 
eglGetProcAddress tool-lookup function, which itself comes
from libEGL - which is correct.

+++

Closing this bug 'WORKSFORME', since we have no such system to reproduce.
In case this changes - or a remedy is known - please reopen this bug!
Comment 16 Sven Gothel 2015-08-05 16:44:21 CEST
gluegen df51fbae035f266869963116bf83f2ab45ae6fec
    DynamicLibraryBundle.toolDynamicLookupFunction(..): DEBUG: 
    Show 'toolGetProcAddressHandle'  (Bug 1062)
    
    Show 'toolGetProcAddressHandle' in DEBUG mode in
    DynamicLibraryBundle.toolDynamicLookupFunction(..),
    allowing to validate source of symbols.

^^ used to manually validate source of glGetString
Comment 17 Xerxes Rånby 2015-09-27 08:36:39 CEST
*** Bug 1121 has been marked as a duplicate of this bug. ***
Comment 18 Xerxes Rånby 2015-09-27 10:36:58 CEST
Moved investigation of this bug to the 2.4.0 release.

We must know whether this bug is caused by the Vivante binary GPU driver getting mixed up with the Mesa3D shared libraries when both GPU drivers are installed on the same system.

We need to know whether the bug can be fixed by using only one of the two GPU drivers at any given moment.

We need to know whether using the etnaviv free software GPU driver is an acceptable solution.