Bug 715

Summary: glGetIntegerv Can Possibly Pass A Bad Pointer To ReleasePrimitiveArrayCritical
Product: [JogAmp] Jogl Reporter: Rob Hatcherson <rob.hatcherson>
Component: coreAssignee: Sven Gothel <sgothel>
Status: RESOLVED FIXED    
Severity: normal CC: sgothel
Priority: ---    
Version: 2   
Hardware: All   
OS: all   
Type: --- SCM Refs:
47333929fd4e563d61996654d2a435b52b904ef0 86f5e7eac7544d2511b70c2142634c89c69d0594
Workaround: ---

Description Rob Hatcherson 2013-04-09 17:28:20 CEST
This is more like a "potential bug", so many apologies if this ends up being nothing to worry about.  I wanted to get a reading from the gurus before I start changing stuff, and this seemed deeper than forum discussion.

This discussion is based on RC11 built on 02/06/2013.  I can dig up better source identification if we need it.

I've run into a situation where if I call glGetIntegerv it will cause the GC to crash.  I'm still not sure why yet, but I can reliably get a GC crash with that call in there and the app is solid without it, so that call seems to be the smoking gun.

While investigating I ended up looking at the JNI impls behind the glGetIntegerv call.  For example, this one from .../build/jogl/gensrc/native/jogl/gl4/GL4bcImpl_JNI.c:

 8183 JNIEXPORT void JNICALL
 8184 Java_jogamp_opengl_gl4_GL4bcImpl_dispatch_1glGetIntegerv1__ILjava_lang_Object_2IZJ(JNIEnv *env, jobject _unused, jint pname, jobject params, jint params_byte_offset, jboolean params_is_nio, jlong procAddress) {
 8185   typedef void (APIENTRY*_local_PFNGLGETINTEGERVPROC)(GLenum pname, GLint *  params);
 8186   _local_PFNGLGETINTEGERVPROC ptr_glGetIntegerv;
 8187   GLint * _params_ptr = NULL;
 8188   if ( NULL != params ) {
 8189     _params_ptr = (GLint *) (((char*) ( JNI_TRUE == params_is_nio ?  (*env)->GetDirectBufferAddress(env, params) :  (*env)->GetPrimitiveArrayCritical(env, params, NULL) ) ) + params_byte_offset);
 8190   }
 8191   ptr_glGetIntegerv = (_local_PFNGLGETINTEGERVPROC) (intptr_t) procAddress;
 8192   assert(ptr_glGetIntegerv != NULL);
 8193   (* ptr_glGetIntegerv) ((GLenum) pname, (GLint *) _params_ptr);
 8194   if ( JNI_FALSE == params_is_nio && NULL != params ) {
 8195     (*env)->ReleasePrimitiveArrayCritical(env, params, _params_ptr, 0);  }
 8196 }


I don't think this has anything to do with my GC crash because I'm using a zero offset, but it looks like the _params_ptr passed to ReleasePrimitiveArrayCritical will be incorrect if params_byte_offset is not zero.

?

The byte offset is added to _params_ptr unconditionally at line 8189, so _params_ptr won't be the same pointer returned by GetPrimitiveArrayCritical.

Might it be better to either keep a different pointer to send in the call at line 8193, or... just add in the offset in the call at line 8193 rather than adding it in at line 8189?  Then the pointer sent to ReleasePrimitiveArrayCritical will be the same as that returned by GetPrimitiveArrayCritical.

If this is a problem it may affect many areas in addition to the one example cited above.
Comment 1 Rob Hatcherson 2013-04-09 19:49:13 CEST
FWIW the GC crash ended up being a too-short array being passed to glGetIntegerv for the parameter of interest.
Comment 2 Sven Gothel 2013-04-09 21:21:19 CEST
(In reply to comment #0)
>  8189     _params_ptr = (GLint *) (((char*) ( JNI_TRUE == params_is_nio ? 
> (*env)->GetDirectBufferAddress(env, params) : 
> (*env)->GetPrimitiveArrayCritical(env, params, NULL) ) ) +
> params_byte_offset);
>  8195     (*env)->ReleasePrimitiveArrayCritical(env, params, _params_ptr,
> 
> The byte offset is added to _params_ptr unconditionally at line 8189, so
> _params_ptr won't be the same pointer returned by GetPrimitiveArrayCritical.
> 
> Might it be better to either keep a different pointer to send in the call at
> line 8193, or... just add in the offset in the call at line 8193 rather than
> adding it in at line 8189?  Then the pointer sent to
> ReleasePrimitiveArrayCritical will be the same as that returned by
> GetPrimitiveArrayCritical.
> 
> If this is a problem it may affect many areas in addition to the one example
> cited above.

.. indeed fishy, I will double check the spec and add a unit test for this.
As you pointed out, since this is our gluegen produced code,
any similar glue would fail if this is would be wrong,
or the JVM we used to test simple was nice and we lucky :)

Thank you for your very thorough analysis!
Comment 3 Sven Gothel 2013-04-09 21:22:02 CEST
(In reply to comment #1)
> FWIW the GC crash ended up being a too-short array being passed to
> glGetIntegerv for the parameter of interest.

Great - thx! Will analyze the suspicious glue code later, hence keeping this bug open.
Comment 4 Rob Hatcherson 2013-04-11 20:08:35 CEST
You're always welcome for the input.  The more eyes on the product the better.

To your point, today we encountered another crash somewhere downstream of glMultiDrawElements in a memcpy call.  No clue why yet - today was the first occurrence.  While investigating I noticed a similar issue in the glMultiDrawElements JNI impl.

If it ends up being a problem it obviously would only manifest in cases where buffers with non-zero offsets/positions are being used.  This is not the case in the stuff I'm working on currently, and maybe existing unit tests use zero offsets too and that's why they're not failing.

Probably need to make a comprehensive sweep of all the ReleasePrimitiveArray... JNI calls and make sure the pointers they're getting are good under all conditions.
Comment 5 Sven Gothel 2013-04-13 20:59:54 CEST
http://docs.oracle.com/javase/1.5.0/docs/guide/jni/spec/functions.html#ReleasePrimitiveArrayCritical

Spec doesn't explicitly allow 'moving' the 'carray' pinned pointer within the mapped range,
hence I agree and consider this a bug, even though it has not triggered any erroneous 
behavior [yet].

I add a unit test in GlueGen to test this behavior a bit more in detail, then fix the issue
in our generator.
Comment 6 Sven Gothel 2013-04-13 21:25:09 CEST
(In reply to comment #5)
> Spec doesn't explicitly allow 'moving' the 'carray' pinned pointer within
> the mapped range,
> hence I agree and consider this a bug, even though it has not triggered any
> erroneous 
> behavior [yet].

Turns out Hotspot's ReleasePrimitiveArrayCritical is a NOP :)
Comment 7 Sven Gothel 2013-04-13 23:05:29 CEST
Fixed & unit test added as well.
Comment 8 Sven Gothel 2013-04-13 23:06:15 CEST
(In reply to comment #7)
> Fixed & unit test added as well.

.. see SCM Refs .. (GlueGen)
Comment 9 Rob Hatcherson 2013-04-14 19:51:08 CEST
(In reply to comment #6)
> Turns out Hotspot's ReleasePrimitiveArrayCritical is a NOP :)

And a very critical one, no doubt.  Heh...
Comment 10 Sven Gothel 2013-04-14 23:19:01 CEST
(In reply to comment #9)
> (In reply to comment #6)
> > Turns out Hotspot's ReleasePrimitiveArrayCritical is a NOP :)
> 
> And a very critical one, no doubt.  Heh...

:)

Yes, it does remove the critical thread state etc, but does nothing w/ the 'carray' argument etc.

However, it's fixed now - so we shall have no problems w/ other JNI implementations
which act different. Indeed, a very important fix - serious.