Bug 917

Summary:	Improved Buffer Object Extension Checks Can Break Custom GL Subclasses
Product:	[JogAmp] Jogl	Reporter:	Rob Hatcherson <rob.hatcherson>
Component:	opengl	Assignee:	Sven Gothel <sgothel>
Status:	RESOLVED INVALID
Severity:	normal	CC:	gouessej
Priority:	---
Version:	2
Hardware:	All
OS:	all
Type:	---	SCM Refs:
Workaround:	---

Description Rob Hatcherson 2013-12-02 19:25:30 CET

Background: Right or wrong, we use a custom subclass of GL4bcImpl to intercept calls that would affect the modelview matrix so we can do the computations in double-precision. This eliminates "jiggles" when dealing with e.g. earth-centered coordinate systems, and as an added bonus highlights the power of being able to "slide in" a special GL to do custom stuff. Put another way, we don't consider this to be an abuse of the capability, but rather consider it taking advantage of this level of indirection to add functionality much like DebugGL2 takes advantage of it to add additional error checking. (Note: as it turns out this worked remarkably well and allowed us to solve this double-precision problem with minimal intrustion in existing code. The only limitation was that we couldn't do any modelview manipulation in display lists because it skirted around the model call intercepts, but that was no big loss).

Observed: when we switched from RC12 to v2.1.2 our VBOs stopped working. I investigated and discovered that in RC12 there was a method in GL4bcImpl named initBufferObjectExtensionChecks that checked for buffer object feature availability among other things. This method was called every pass and consulted a boolean flag to see if it had ever run through once. If the flag was true then it just returned.

This probably was observed to be an inefficiency (and it is - though possibly not a large one ; however as the sayings go "small strokes fell great oaks", "death by 1000 cuts", etc), so somewhere between RC12 and 2.1.2 a method named "finalizeInit" was introduced in GL4bcImpl (and probably in the other GLs too) and GLContextImpl was modified to hunt for this method using ReflectionUtil and invoke it via reflection if present. The advantage was that the number of calls to determine buffer object availbility was greatly reduced.

Unfortunately ReflectionUtil uses getDeclaredMethod to see if the finalizeInit method is implemented, and getDeclaredMethod only finds methods implemented immediately within the class given to it as an argument. Our custom subclass of GL4bcImpl does not implement finalizeInit - and cannot, because finalizeInit is declared final - so ReflectionUtil fails to find it and therefore the method is never called. As a result all the buffer object capability flags remain false and things start failing downstream because capabilities are believed to be missing.

There is not a clean and convenient workaround for this. However, we managed to hack it by calling finalizeInit ourselves through reflection using a hunt-up-the-class-chain approach to find the first implementation in the argument class or any superclass of it then call that one. We do this one time on the first call to "display". This is not an all-purpose hack, but it appears to work OK for the affected project.

I'd like to suggest one of two fixes for this.

1) Preferred: modify ReflectionUtil to provide an additional method for hunting down methods not only in the class it's handed, but if the method is not found there then visit the superclasses in order up the chain until it finds one. Then have GLContextImpl use that method instead to find finalizeInit. With this approach our original code would Just Work.

2) Less desirable: remove the "final" declaration from finalizeInit so a subclass can override it and call super's implementation. This caters to ReflectionUtil's usage of getDeclaredMethod, but IMO is less desirable because now GL subclasses need to know they have to implement the override and if they don't then the reasons things are failing is not obvious.

Discussion?

Comment 1 Sven Gothel 2013-12-04 04:09:37 CET

(In reply to comment #0)
> Background: Right or wrong, we use a custom subclass of GL4bcImpl to
> intercept calls that would affect the modelview matrix so we can do the
> computations in double-precision.  

This is very _wrong_ !

- GL*Impl are not public (hence private package)

- You shall use GL pipelining like the DebugGL* and TraceGL* pipeline.
  You can create such via the 'BuildComposablePipeline'.
  Look at ant target 'java.generate.composable.pipeline.custom.glfixfunc'
  within 'jogl/make/build-jogl.xml'
  This produces FixedFuncImpl, which is used for our fixed function emulation.

> This eliminates "jiggles" when dealing
> with e.g. earth-centered coordinate systems, and as an added bonus
> highlights the power of being able to "slide in" a special GL to do custom
> stuff.  Put another way, we don't consider this to be an abuse of the
> capability, but rather consider it taking advantage of this level of
> indirection to add functionality much like DebugGL2 takes advantage of it to
> add additional error checking.  (Note: as it turns out this worked
> remarkably well and allowed us to solve this double-precision problem with
> minimal intrustion in existing code.  The only limitation was that we
> couldn't do any modelview manipulation in display lists because it skirted
> around the model call intercepts, but that was no big loss).
> 
> Observed: when we switched from RC12 to v2.1.2 our VBOs stopped working.  I
> investigated and discovered that in RC12 there was a method in GL4bcImpl
> named initBufferObjectExtensionChecks that checked for buffer object feature
> availability among other things.  This method was called every pass and
> consulted a boolean flag to see if it had ever run through once.  If the
> flag was true then it just returned.
> 
> This probably was observed to be an inefficiency (and it is - though
> possibly not a large one ; however as the sayings go "small strokes fell
> great oaks", "death by 1000 cuts", etc), so somewhere between RC12 and 2.1.2
> a method named "finalizeInit" was introduced in GL4bcImpl (and probably in
> the other GLs too) and GLContextImpl was modified to hunt for this method
> using ReflectionUtil and invoke it via reflection if present.  The advantage
> was that the number of calls to determine buffer object availbility was
> greatly reduced.

Yes, and JOGL doesn't specify a public interface / protocol for the private *Impl
classes - hence we don't care about the oaks here :)

> 
> Unfortunately ReflectionUtil uses getDeclaredMethod to see if the
> finalizeInit method is implemented, and getDeclaredMethod only finds methods
> implemented immediately within the class given to it as an argument.  Our
> custom subclass of GL4bcImpl does not implement finalizeInit - and cannot,
> because finalizeInit is declared final - so ReflectionUtil fails to find it
> and therefore the method is never called.  As a result all the buffer object
> capability flags remain false and things start failing downstream because
> capabilities are believed to be missing.
> 
> There is not a clean and convenient workaround for this.  However, we
> managed to hack it by calling finalizeInit ourselves through reflection
> using a hunt-up-the-class-chain approach to find the first implementation in
> the argument class or any superclass of it then call that one.  We do this
> one time on the first call to "display".  This is not an all-purpose hack,
> but it appears to work OK for the affected project.
> 
> 
> I'd like to suggest one of two fixes for this.
> 
> 1) Preferred: modify ReflectionUtil to provide an additional method for
> hunting down methods not only in the class it's handed, but if the method is
> not found there then visit the superclasses in order up the chain until it
> finds one.  Then have GLContextImpl use that method instead to find
> finalizeInit.  With this approach our original code would Just Work.
> 
> 2) Less desirable: remove the "final" declaration from finalizeInit so a
> subclass can override it and call super's implementation.  This caters to
> ReflectionUtil's usage of getDeclaredMethod, but IMO is less desirable
> because now GL subclasses need to know they have to implement the override
> and if they don't then the reasons things are failing is not obvious.
> 
> 
> Discussion?

- Change your implementation in a way to use your own 
  matrix implementation, see public interface: GLMatrixFunc (fixed function matrix GL subset)

  BTW .. we have our PMVMatrix implementing GLMatrixFunc
  providing old style matrix ops to ES2, GL3-core applications.

  Yes, this will introduce some sort of handling in your user application,
  i.e. using a GLMatrixFunc for matrix ops .. however - it's a fast and clean cut.

- Use the pipeline method as described above....
  You may use your own impl. of GLMatrixFunc as described above,
  instead of using the default GL fixed function matrix impl.
  See our fixed function emulation around 'FixedFuncImpl'.

At some point, we have to use 'private space' where we change implementation 
specifics. The private package namespace is exactly where no user shall assume 
things to stay the same.

I close this report for now.
If you think it is not possible to live w/ such restrictions and alternatives,
pls reopen and we may continue this discussion about other alternatives.

~Sven

Comment 2 Rob Hatcherson 2013-12-04 13:57:43 CET

Heh heh - OK; I will slither off and reconsider the approach :-).

I'll admit that I felt kind of dirty all over after making that change, but it was seductive because it fit so *perfectly* without disturbing the rest of the code base.  To the point really that if the approved alternative is far more disruptive (maybe it won't be) then perhaps we ought to consider an official way to do what I was attempting here.

OpenGL is kind of lame in the area of double-precision anyway, advertising API that exhibits awful performance and/or artifacts such as the previously described "jiggling".  It would be hard to get rid of all of those problems, but it was at least nice to able to intercept the relatively small set of calls that can affect the modelview matrix just by sliding in a different GL and the rest of the code base didn't even know it was happening.  That was very smooth, and something C-based OpenGL apps can't do easily.

Anyways, I will investigate and I probably will at least tag my observations onto the end of this bug and/or the forums just in case somebody has the same problem and bumps into this discussion.

As usual, thanks for the comments and all the excellent work on JogAmp!

Comment 3 Rob Hatcherson 2013-12-04 15:43:04 CET

Let me revise a previous statement: I said OpenGL is lame with regard to double precision, but I should have said OpenGL as implemented by nvidia is lame with regard to double precision.  Other vendors' implementations may have similar limitations.  At least the OpenGL spec writers had the foresight to anticipate double arguments.

Comment 4 Rob Hatcherson 2013-12-04 19:38:39 CET

Sven, please forgive me if I'm missing something here:

After looking at this briefly it looks like anything related to GLMatrixFunc won't be sufficient because it only covers the float method variants (?).

What I need is for the glTranslated/glRotated/glScaled/and all the other "d" variants of functions that can affect the modelview matrix to not only accept double arguments but to also do the math in double precision.

The composable pipeline is headed in the right direction.  Our heinous hack is doing the same thing in spirit i.e. providing a custom implementation of the double variants of the float methods currently wrapped up by GLMatrixFunc and hiding them behind a GL* interface.

I can get around this using public API if I expose the double matrix impl that currently is hidden from view by the hack GL subclass and change all our graphics code to make the modelview calls through that object instead.   But... that's going to affect pretty much every graphics-oriented class in the system.  The heinous hack at least had the virtue of being transparent.

Comment 5 Julien Gouesse 2013-12-23 16:01:38 CET

The problem is that GLMatrixFunc is a subset of functions from the fixed function pipeline supported both in desktop and mobile environments.

Have you found another (clean) solution in the meantime?

Comment 6 Rob Hatcherson 2013-12-23 17:30:33 CET

Hi Julien,

> The problem is that GLMatrixFunc is a subset of functions from the
> fixed function pipeline supported both in desktop and mobile
> environments.

Ah... ok, I see.

> Have you found another (clean) solution in the meantime?

No luck so far.

After the discussion with Sven I looked at the GLMatrixFunc interface but had to back away from it because it only covers the float API for the reason you have mentioned.  So... in the interests of trying make things "legal" I went in and redirected all the calls that could affect the modelview with identical API in the class I have that implements those functions in double precision.  This eventually worked, but only after changing every file that performed GL operations and also locally reimplementing functions from central utility libs we have that weren't aware I was needing to do this.

After I got it all propped back up I took a step back from that and thought to myself "what a mess", so I discarded those changes and went back to my hack for the time being.

This is not the right way to do it for sure, and I agree 100% that the composable pipeline is conceptually on target.  At the moment I'm banking on a cleaner answer somewhere in JOGL's future that will cover the double-precision API too.  The reason for sticking with the evil hack for now is that if such a fix were to appear it would be far easier to reverse the single hack than reverse changes in all the files that were affected by handling this problem in an API-legal way.  Plus we only have one project at the moment that is affected by this, so the problem is isolated.

I considered diving into the JOGL code base and seeing if I could do something about it, but was initially stopped by the fact that my fork on github is hopelessly messed up right now due to my git noob-ness combined with having other problems to deal with.  In retrospect it looks like it's a good thing that I didn't pursue this since I was unaware of the motivation for GLMatrixFunc only covering the float functions - thanks for clearing that up.

Would it be possible to invent another interface that combines GLMatrixFuncs as it stands today with the remaining API?  E.g. (and notionally) GLAllMatrixFuncs extends GLMatrixFuncs and add the extra methods?  Not sure though how that would (or wouldn't) fit into the composable pipeline stuff.  If one were to do that though it might make more sense naming-wise if GLMatrixFuncs included everything, and the subset was called something else that hinted at why it is a subset (e.g. GLMatrixFuncsCommonToAllPlatforms except come up with a shorter name :-) ).

Perhaps we should at least morph this flawed bug report into an enhancement "bug" that suggests that the additional API should be covered.  Granted the need is not as common as with the float API, but if you're in a situation that requires doubles it sure would be nice to have a way to insert an alternate implementation.

Comment 7 Sven Gothel 2013-12-28 01:14:58 CET

(In reply to comment #6)
> Hi Julien,
> 
> > The problem is that GLMatrixFunc is a subset of functions from the
> > fixed function pipeline supported both in desktop and mobile
> > environments.
> 
> Ah... ok, I see.
> 
> > Have you found another (clean) solution in the meantime?
> 
> No luck so far.
> 
> After the discussion with Sven I looked at the GLMatrixFunc interface but
> had to back away from it because it only covers the float API for the reason
> you have mentioned.  So... in the interests of trying make things "legal" I
> went in and redirected all the calls that could affect the modelview with
> identical API in the class I have that implements those functions in double
> precision.  This eventually worked, but only after changing every file that
> performed GL operations and also locally reimplementing functions from
> central utility libs we have that weren't aware I was needing to do this.
> 
> After I got it all propped back up I took a step back from that and thought
> to myself "what a mess", so I discarded those changes and went back to my
> hack for the time being.
> 
> This is not the right way to do it for sure, and I agree 100% that the
> composable pipeline is conceptually on target.  At the moment I'm banking on
> a cleaner answer somewhere in JOGL's future that will cover the
> double-precision API too.  The reason for sticking with the evil hack for
> now is that if such a fix were to appear it would be far easier to reverse
> the single hack than reverse changes in all the files that were affected by
> handling this problem in an API-legal way.  Plus we only have one project at
> the moment that is affected by this, so the problem is isolated.
> 
> I considered diving into the JOGL code base and seeing if I could do
> something about it, but was initially stopped by the fact that my fork on
> github is hopelessly messed up right now due to my git noob-ness combined
> with having other problems to deal with.  In retrospect it looks like it's a
> good thing that I didn't pursue this since I was unaware of the motivation
> for GLMatrixFunc only covering the float functions - thanks for clearing
> that up.
> 
> Would it be possible to invent another interface that combines GLMatrixFuncs
> as it stands today with the remaining API?  E.g. (and notionally)
> GLAllMatrixFuncs extends GLMatrixFuncs and add the extra methods?  Not sure
> though how that would (or wouldn't) fit into the composable pipeline stuff. 
> If one were to do that though it might make more sense naming-wise if
> GLMatrixFuncs included everything, and the subset was called something else
> that hinted at why it is a subset (e.g. GLMatrixFuncsCommonToAllPlatforms
> except come up with a shorter name :-) ).
> 
> Perhaps we should at least morph this flawed bug report into an enhancement
> "bug" that suggests that the additional API should be covered.  Granted the
> need is not as common as with the float API, but if you're in a situation
> that requires doubles it sure would be nice to have a way to insert an
> alternate implementation.

So in the end, either we change the GL interfaces to abstract the double precision 
fixed function pipeline - or it doesn't matter whether you maintain the 'pipeline' 
or the 'unsupported' *Impl override - I may agree.

The 'unsupported' *Impl override, is ofc faster for your case and while you have to 
maintain it for a specific JOGL version it may not differ that much to maintaining the pipeline.
This is surely your choice.

As you pointed out, only if we do have a good reason for a change in the *Impl class,
we apply such optimizations, otherwise we surely refrain from masochism :)

Changing the GL interfaces is quite a hard choice to do, if it breaks compatibility.
So this may only happen for a minor release update, i.e. 2.2.*

Not even sure whether adding an interface will cause compatibility issues,
i.e. adding a GLMatrixFuncsDouble in parallel to GLMatrixFuncs.

Please consider the fixed function pipeline (FFP) as 'not so important' or a big priority 
since most attention is given to the core profiles today.
Hence changing things in this regard maybe a bit 'over the top'.

A pipeline enhancement on the other hand is something I surely would consider useful,
i.e. adding a NopGL* pipeline, which you could override - or something similar.