Bug 623

Summary: Sporadic XCB assertion failures w/ ATI proprietary driver and w/o native X11 locking
Product: [JogAmp] Jogl Reporter: Sven Gothel <sgothel>
Component: x11 Assignee: Sven Gothel <sgothel>
Status: RESOLVED WONTFIX    
Severity: critical CC: alexander.lex, andres.colubri, bforrester, dp, gouessej, wwalker3, xerxes
Priority: ---    
Version: 2   
Hardware: pc_all   
OS: linux   
Type: --- SCM Refs:
jogl 43891be36e8485353ac74f329fd2f7438303a846 jogl 92398025abdabb2fdef0d78edd41e730991a6f94 jogl 2d35fc546097818aba5db51885f796cb0b442734 jogl 0790108ca9c2a6a6d494e5017589fe083c518e23 jogl 7c333e3e2574879465719ed968112b27255368d4 jogl 9c96be7c0a1a19365ae983908260c6ff44f045c4
Workaround: TRUE
Bug Depends on:    
Bug Blocks: 603, 616    

Description Sven Gothel 2012-09-29 13:03:00 CEST
W/o native X11 locking (i.e. w/o XInitThreads(), XLockDisplay()/XUnlockDisplay())
XCB assertions fail sporadically w/ ATI proprietary driver.

The assertions appear sometimes at 'glXCreateContextAttribsARB(..)'
and at 'X11Lib.XineramaIsEnabled()' if commit 43891be36e8485353ac74f329fd2f7438303a846 is disabled.
  <http://jogamp.org/git/?p=jogl.git;a=commit;h=43891be36e8485353ac74f329fd2f7438303a846>

The latter is triggered fairly regularly w/ unit test
  com.jogamp.opengl.test.junit.jogl.acore.TestGLContextDrawableSwitchNEWT

Blocks Bug 616 to become stable.
Comment 1 Sven Gothel 2012-09-29 13:05:10 CEST
JOGL commit 43891be36e8485353ac74f329fd2f7438303a846
relaxes usage of X11Lib.XineramaIsEnabled() and hence the chance to trigger the XCB assertion failure.

<http://jogamp.org/git/?p=jogl.git;a=commit;h=43891be36e8485353ac74f329fd2f7438303a846>
Comment 2 Sven Gothel 2012-09-30 20:56:25 CEST
Fix seems to be impossible, since root cause probably is within ATI proprietary driver.

Workaround is to use a global lock as fallback in case ATI proprietary driver is being used.

The proprietary ATI X11 driver does not handle multi-threaded [GL] clients well,
i.e. triggers an XCB assertion 'from time to time'.
    
It almost seems that the driver either:
  - aliases all display connections to its connection name, i.e. server; or
  - utilizes a built-in display connection w/o locking, used for some reason

See details here:
  <http://jogamp.org/git/?p=jogl.git;a=commit;h=92398025abdabb2fdef0d78edd41e730991a6f94>
  <http://jogamp.org/git/?p=jogl.git;a=commit;h=2d35fc546097818aba5db51885f796cb0b442734>
  <http://jogamp.org/git/?p=jogl.git;a=commit;h=0790108ca9c2a6a6d494e5017589fe083c518e23>
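
A minimal sketch of the fallback idea described above, not the code from the referenced commits: every display connection normally uses its own lock, but all connections share one global lock when the ATI proprietary driver is detected. The names ToolkitLockSketch, DisplayConnection, needsGlobalLock and lockFor are hypothetical, and the vendor-string check is an assumption made for illustration only.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class ToolkitLockSketch {
    /** Hypothetical stand-in for an X11 display connection with its own lock. */
    static final class DisplayConnection {
        final String name;
        final ReentrantLock ownLock = new ReentrantLock();
        DisplayConnection(String name) { this.name = name; }
    }

    /** Single process-wide lock, shared by all connections in fallback mode. */
    static final ReentrantLock GLOBAL_LOCK = new ReentrantLock();

    /** Assumption: the driver is recognized by its vendor string prefix. */
    static boolean needsGlobalLock(String glVendor) {
        return glVendor != null && glVendor.startsWith("ATI");
    }

    /** Picks the global lock for the problematic driver, else the per-connection lock. */
    static ReentrantLock lockFor(DisplayConnection dpy, String glVendor) {
        return needsGlobalLock(glVendor) ? GLOBAL_LOCK : dpy.ownLock;
    }

    /** Runs an action while holding the given lock and always releases it. */
    static <T> T withLock(ReentrantLock lock, Supplier<T> action) {
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        DisplayConnection a = new DisplayConnection(":0.0");
        DisplayConnection b = new DisplayConnection(":0.0");
        String vendor = "ATI Technologies Inc."; // assumed vendor string
        // Both connections resolve to the same lock, so their calls serialize.
        System.out.println(lockFor(a, vendor) == lockFor(b, vendor));
    }
}
```

With any other vendor string, lockFor returns each connection's own lock, so independent connections do not block each other.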
Comment 3 Sven Gothel 2012-10-02 07:42:39 CEST
Reopened .. workaround is not feasible:

Relax Bug 613 workaround of commit 92398025abdabb2fdef0d78edd41e730991a6f94
    
Utilizing a GlobalToolkitLock in general to lock the display connection results in deadlock
situations where locked surfaces signal other [offscreen] surfaces to render.
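
The deadlock pattern can be illustrated with a small sketch. It is not JOGL code: GlobalLockDeadlockSketch and signalRenderWhileLocked are hypothetical names, and a timed tryLock stands in for the blocking lock so that the example terminates instead of hanging. A thread that holds the global lock asks an offscreen worker to render and waits for it, but the worker needs the same lock.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class GlobalLockDeadlockSketch {
    /** The single global lock every surface must take before touching the display. */
    static final ReentrantLock GLOBAL = new ReentrantLock();

    /**
     * Signals an offscreen worker to "render" and waits for it.
     * Returns true if the worker could take the global lock within timeoutMs.
     */
    static boolean signalRenderWhileLocked(boolean holdGlobalLock, long timeoutMs) {
        ExecutorService offscreen = Executors.newSingleThreadExecutor();
        if (holdGlobalLock) {
            GLOBAL.lock(); // the "locked surface" case
        }
        try {
            Future<Boolean> rendered = offscreen.submit(() -> {
                // A blocking lock() here would never return while the caller waits.
                if (!GLOBAL.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    return false;
                }
                try {
                    return true; // rendering would happen here
                } finally {
                    GLOBAL.unlock();
                }
            });
            return rendered.get(); // caller waits while possibly still holding GLOBAL
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            if (holdGlobalLock) {
                GLOBAL.unlock();
            }
            offscreen.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(signalRenderWhileLocked(true, 100));  // worker cannot lock
        System.out.println(signalRenderWhileLocked(false, 100)); // worker can lock
    }
}
```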
    
We have to see whether we find a better solution; for now sporadic XCB assertions still happen.
But it is preferable to point to the root cause than to jump through hoops to complicate locking
or even to deadlock.
Comment 4 Sven Gothel 2012-10-02 17:32:26 CEST
NativeWindowFactory: Remove 'remedy' of Bug 613 
Commit 92398025abdabb2fdef0d78edd41e730991a6f94 GlobalToolkitLock for create/destroy
    
Turns out it has no effect and ATI prop. driver still has XCB failures at this point.

+++

Well, I give this bug a rest for now (WONTFIX) until further notice and knowledge.

Summary is that even w/ rolled-back changes (XInitThreads()) the prop. ATI driver
is not intrinsically thread safe regarding the XCB assertion failures.

The new non-blocking test TestInitConcurrent02NEWT triggers it w/ or w/o
native X11 locking. It is non-blocking since it utilizes 2 separate exclusive X11 display connections
for each NEWT window, one for the X11 event pump and one for rendering.
This is done 16 times in parallel, which creates 32 threads:
  - Each window-pump and window-render has its own thread
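
The thread layout can be sketched as follows. This is not the actual TestInitConcurrent02NEWT; ConcurrentInitSketch and runPairs are hypothetical names, and plain Runnables stand in for the X11 event pump and the rendering work. It makes no JOGL or X11 calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentInitSketch {
    /**
     * Starts one pump thread and one render thread per window, waits for all
     * of them, and returns how many threads finished their work normally.
     */
    static int runPairs(int windows, Runnable pumpWork, Runnable renderWork) {
        List<Thread> threads = new ArrayList<>();
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < windows; i++) {
            // Placeholder for the thread driving the window's event connection.
            threads.add(new Thread(() -> {
                pumpWork.run();
                completed.incrementAndGet();
            }, "pump-" + i));
            // Placeholder for the thread driving the window's render connection.
            threads.add(new Thread(() -> {
                renderWork.run();
                completed.incrementAndGet();
            }, "render-" + i));
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // 16 windows, 2 threads each
        System.out.println(runPairs(16, () -> { }, () -> { }));
    }
}
```

A thread whose work throws is not counted, so a result below 2 * windows indicates that at least one pump or render thread failed.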

Tested w/ latest Ubuntu 12.04.1 LTS w/ backports enabled w/ KDE
and amd-driver-installer-12-8-x86.x86_64 and amd-driver-installer-12-9-beta-x86.x86_64.

Also validated w/ Mesa R600 Gallium git tip from 2012-10-02,
git-sha1 523c01524638b3d1bb363f4c0a647b0777840b7a.
No XCB threading issues exist here - however, other issues exist like
  - Shared Context Tests: "radeon: The kernel rejected CS"
  - No FBO MSAA (all Mesa hw-accel drivers, i.e. intel gallium)

+++

Best 'stable' ATI proprietary config so far:
  Using KDE w/ compositing (desktop effects) enabled.