Bug 989

Summary: CLProgramTest.programBinariesTest and LowLevelBindingTest.lowLevelVectorAddTest cause SIGSEGV or EXCEPTION_ACCESS_VIOLATION on AMD cards
Product: [JogAmp] Jocl Reporter: Wade Walker <wwalker3>
Component: opencl Assignee: Wade Walker <wwalker3>
Status: RESOLVED FIXED    
Severity: normal CC: sgothel
Priority: ---    
Version: 1   
Hardware: All   
OS: all   
Type: --- SCM Refs:
Workaround: ---

Description Wade Walker 2014-03-02 16:50:39 CET
Here are sample stacks from Linux and Windows:

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libamdocl64.so+0x42664e]  clCreateKernelsInProgram+0xae

[error occurred during error reporting (printing native stack), id 0xb]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.jogamp.opencl.llb.impl.CLAbstractImpl.dispatch_clCreateKernelsInProgram0(JILjava/lang/Object;ILjava/lang/Object;IJ)I+0
j  com.jogamp.opencl.llb.impl.CLAbstractImpl.clCreateKernelsInProgram(JILcom/jogamp/common/nio/PointerBuffer;Ljava/nio/IntBuffer;)I+104
j  com.jogamp.opencl.CLProgram.createCLKernels()Ljava/util/Map;+40
j  com.jogamp.opencl.CLProgramTest.programBinariesTest()V+334


Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [amdocl64.dll+0x16f454]
j  com.jogamp.opencl.llb.impl.CLAbstractImpl.clCreateKernelsInProgram(JILcom/jogamp/common/nio/PointerBuffer;Ljava/nio/IntBuffer;)I+104
j  com.jogamp.opencl.CLProgram.createCLKernels()Ljava/util/Map;+40
j  com.jogamp.opencl.CLProgramTest.programBinariesTest()V+327
v  ~StubRoutines::call_stub
V  [jvm.dll+0x1cb0c3]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.jogamp.opencl.llb.impl.CLAbstractImpl.dispatch_clCreateKernelsInProgram0(JILjava/lang/Object;ILjava/lang/Object;IJ)I+0
j  com.jogamp.opencl.llb.impl.CLAbstractImpl.clCreateKernelsInProgram(JILcom/jogamp/common/nio/PointerBuffer;Ljava/nio/IntBuffer;)I+104
j  com.jogamp.opencl.CLProgram.createCLKernels()Ljava/util/Map;+40
j  com.jogamp.opencl.CLProgramTest.programBinariesTest()V+327
Comment 1 Wade Walker 2014-03-02 16:51:43 CET
Bought a used Radeon HD 5450 at a thrift shop yesterday, will try to duplicate these at home and debug.
Comment 2 Wade Walker 2014-03-08 17:17:49 CET
Duplicated these errors on Win 8.1 with Radeon HD 5450, debugging now.
Comment 3 Wade Walker 2014-03-08 23:13:41 CET
programBinariesTest() failed because the AMD drivers crash in clCreateKernelsInProgram() when the program has not been built yet, instead of returning the error code CL_INVALID_PROGRAM_EXECUTABLE as they should.

lowLevelVectorAddTest() apparently failed because the AMD drivers write past the end of a direct byte buffer in a way that made System.gc() crash when it was called during teardown (this crash didn't even dump a stack trace). Making the buffer larger solved the problem.
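
For illustration only, a caller-side guard against the first failure could look roughly like the sketch below. It is not the code from the actual fix. The class name, the NativeKernelCreator interface, and the programBuilt flag are hypothetical stand-ins for the real JOCL binding and program state; only the value of CL_INVALID_PROGRAM_EXECUTABLE (-45) comes from the OpenCL headers. The idea is to return the error code the driver should have produced, without ever entering the native call for an unbuilt program.

```java
final class KernelCreationGuardSketch {
    // Error codes as defined in the OpenCL header cl.h.
    static final int CL_SUCCESS = 0;
    static final int CL_INVALID_PROGRAM_EXECUTABLE = -45;

    // Hypothetical stand-in for the native clCreateKernelsInProgram binding.
    interface NativeKernelCreator {
        int createKernelsInProgram(long programId);
    }

    // Returns CL_INVALID_PROGRAM_EXECUTABLE for an unbuilt program instead of
    // handing it to the driver; otherwise forwards to the native call.
    static int createKernelsGuarded(boolean programBuilt, long programId,
                                    NativeKernelCreator creator) {
        if (!programBuilt) {
            // Assumption: the caller tracks whether the build has succeeded.
            return CL_INVALID_PROGRAM_EXECUTABLE;
        }
        return creator.createKernelsInProgram(programId);
    }
}
```

With a guard like this, a test that expects an error for an unbuilt program sees the error code rather than a SIGSEGV in the driver.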
Comment 4 Wade Walker 2014-03-08 23:42:31 CET
Fixes are at https://github.com/WadeWalker/jocl/tree/bug_989_fix_AMD_driver_crashes; waiting to merge until the build server is back up.
Comment 5 Wade Walker 2014-04-13 23:10:14 CEST
lowLevelVectorAddTest() still fails on linux-x86_64-amd and win7-x86_32-amd; will try increasing buffer size further and rewinding it between all reuses.
Comment 6 Wade Walker 2014-04-13 23:29:53 CEST
That fixed it.
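
For reference, the buffer workaround described in comments 3 and 5 (oversizing the direct buffer and rewinding it between reuses) might be sketched with plain java.nio as below. This is an assumption-laden illustration, not the committed test code: the class name, the helper names, and the GUARD_BYTES value of 64 are made up here, and the thread does not say how much extra space was actually needed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PaddedBufferSketch {
    // Hypothetical amount of slack after the payload; the real value used
    // in the fix is not stated in this report.
    static final int GUARD_BYTES = 64;

    // Allocates a direct buffer larger than the payload so that a driver
    // writing slightly past the payload stays inside the allocation.
    static ByteBuffer allocatePadded(int payloadBytes) {
        return ByteBuffer.allocateDirect(payloadBytes + GUARD_BYTES)
                         .order(ByteOrder.nativeOrder());
    }

    // Writes 'count' copies of 'value', rewinding before and after so every
    // reuse of the buffer starts from position 0.
    static void fillInts(ByteBuffer buf, int count, int value) {
        buf.rewind();
        for (int i = 0; i < count; i++) {
            buf.putInt(value);
        }
        buf.rewind();
    }
}
```

Padding only hides an out-of-bounds write by the driver; it does not make the write go away, so the slack size remains a guess.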