Bug 979

Summary: CLBuffer.createSubBuffer(..) does not consider CL_DEVICE_MEM_BASE_ADDR_ALIGN
Product: [JogAmp] Jocl
Reporter: Sven Gothel <sgothel>
Component: opencl
Assignee: Sven Gothel <sgothel>
Status: RESOLVED FIXED
Severity: major
CC: wwalker3
Priority: ---
Version: 1
Hardware: All
OS: all
Type: ---
SCM Refs: d4f04ddd3ef3b65b7c31d3504cf55489153c60c1
Workaround: ---

Description Sven Gothel 2014-02-19 01:54:17 CET
CLBufferTest's createSubBuffer test fails on AMD devices, and correctly so,
because the sub-buffer offset does not consider CL_DEVICE_MEM_BASE_ADDR_ALIGN.

"CL_MISALIGNED_SUB_BUFFER_OFFSET is
returned in errcode_ret if there are no devices in
context associated with buffer for which the origin
value is aligned to the
CL_DEVICE_MEM_BASE_ADDR_ALIGN value."

Each device has its own CL_DEVICE_MEM_BASE_ADDR_ALIGN attribute,
and buffers used with a specific device must be aligned to it.

If the OpenCL implementation does not reject the misaligned offset when
the sub-buffer is created, subsequent buffer operations (enqueue etc.) may fail later.

+++
typedef unsigned __int32        cl_uint;

cl_uint CL_DEVICE_MEM_BASE_ADDR_ALIGN:

"The minimum value is the size (in bits) of the largest OpenCL built-in
data type supported by the device (long16 in FULL profile, 
long16 or int16 in EMBEDDED profile) for devices that are not of type
CL_DEVICE_TYPE_CUSTOM ."

+++

I experimented w/ reading said value and aligning the sub-buffer
offset as specified, which works well.

However, a major problem is that the offset and size values passed
to CLBuffer.createSubBuffer(..) are element counts,
not byte sizes.

The implementation must correct the resulting byte offset so that it is properly aligned.
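
A minimal sketch of that correction, assuming the alignment value is reported in bits (as the spec quote above states) and that an element offset becomes a byte offset by multiplying with the element size. The class and method names are hypothetical and not part of the JOCL API.

```java
/** Hypothetical helper for illustration only; not part of JOCL. */
final class SubBufferAlignment {

    /** Converts a CL_DEVICE_MEM_BASE_ADDR_ALIGN value (in bits) to bytes. */
    static long alignmentInBytes(long alignBits) {
        if (alignBits <= 0 || alignBits % 8 != 0) {
            throw new IllegalArgumentException("alignment must be a positive multiple of 8 bits: " + alignBits);
        }
        return alignBits / 8;
    }

    /** Rounds byteOffset up to the next multiple of alignBytes. */
    static long roundUp(long byteOffset, long alignBytes) {
        long remainder = byteOffset % alignBytes;
        return remainder == 0 ? byteOffset : byteOffset + (alignBytes - remainder);
    }

    /** Converts an element offset to a byte offset and aligns it upwards. */
    static long alignedByteOffset(long elementOffset, int elementSize, long alignBits) {
        long byteOffset = elementOffset * elementSize;
        return roundUp(byteOffset, alignmentInBytes(alignBits));
    }
}
```

For example, with 4-byte elements and an alignment of 1024 bits (128 bytes), an element offset of 10 (40 bytes) would be moved up to byte offset 128. Note that rounding up changes which element the sub-buffer starts at, so a caller would need to account for the shifted origin.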

Furthermore, another complication is that we do not know
which device the buffer shall be aligned for.
Hence a device attribute may be given here as a hint.
If the latter does not exist, we would need to align the buffer for all devices.
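
One way to "align for all devices" is to take the largest alignment among the context's devices. The sketch below assumes every device reports a power-of-two alignment, in which case the maximum is a multiple of each individual value; that assumption is mine, not something the spec quote guarantees. The class and method names are hypothetical.

```java
/** Hypothetical helper for illustration only; not part of JOCL. */
final class ContextAlignment {

    /**
     * Returns the largest CL_DEVICE_MEM_BASE_ADDR_ALIGN value (in bits)
     * among the given per-device values.
     * Assumption: each value is a power of two, so the maximum
     * satisfies the alignment requirement of every device.
     */
    static long overallAlignBits(long[] deviceAlignBits) {
        if (deviceAlignBits == null || deviceAlignBits.length == 0) {
            throw new IllegalArgumentException("at least one device alignment is required");
        }
        long max = 0;
        for (long bits : deviceAlignBits) {
            if (bits <= 0 || Long.bitCount(bits) != 1) {
                throw new IllegalArgumentException("alignment is not a power of two: " + bits);
            }
            max = Math.max(max, bits);
        }
        return max;
    }
}
```

If a device hint is available, its own value could be used directly instead, which may waste less space than the context-wide maximum.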

+++

I am currently re-implementing a general-purpose bitstream reader
that allows values like cl_uint etc. to be fetched properly.
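
As a rough illustration of why fetching a cl_uint needs care in Java (which has no unsigned 32-bit type), the sketch below reads four bytes from a ByteBuffer and widens them to a long. This is not the bitstream reader mentioned above, only a minimal stand-in; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper for illustration only; not part of JOCL or GlueGen. */
final class ClUintReader {

    /**
     * Reads a 32-bit unsigned value at the given absolute byte index,
     * honoring the buffer's configured byte order, and returns it as a
     * non-negative long.
     */
    static long readClUint(ByteBuffer buffer, int byteIndex) {
        // getInt yields a signed int; masking keeps the low 32 bits unsigned.
        return buffer.getInt(byteIndex) & 0xFFFFFFFFL;
    }
}
```

The caller would be expected to set the buffer's byte order to match the native layout, e.g. via `buffer.order(ByteOrder.nativeOrder())`, before reading.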

+++
Comment 1 Sven Gothel 2014-02-21 08:39:46 CET
d4f04ddd3ef3b65b7c31d3504cf55489153c60c1
    Add 'CL_DEVICE_MEM_BASE_ADDR_ALIGN' to CLDevice and overall maximum to CLContext
    Split CLBufferTest and use alignment.
Comment 2 Wade Walker 2014-02-22 19:12:53 CET
Which tests does this affect? It looks like all the remaining failures except those on Mac OS X 10.6 cause the VM to crash, but I can't see the stack traces of the crash from the build server.
Comment 3 Sven Gothel 2014-02-23 00:47:12 CET
(In reply to comment #2)
> Which tests does this affect? It looks like all the remaining failures
> except those on Mac OS X 10.6 cause the VM to crash, but I can't see the
> stack traces of the crash from the build server.

Well, the crash is fixed now w/ the patch d4f04ddd3ef3b65b7c31d3504cf55489153c60c1,
i.e. the CLBufferTest works now <http://jogamp.org/git/?p=jocl.git;a=blobdiff;f=test/com/jogamp/opencl/CLBufferTest.java;h=5b5d0e38d19454f89eab920a69c87fb04a8536f9;hp=0638844dfee19ababb19540ddc14b9835c285481;hb=d4f04ddd3ef3b65b7c31d3504cf55489153c60c1;hpb=84e5e16a4aaa206c39b04b980d8d63ffacb97dbb>

(I always add a git-sha1 reference ..)

However .. maybe we should add an 'auto-alignment' option to the
'createSubBuffer' methods?
Comment 4 Sven Gothel 2014-02-26 17:07:27 CET
Test CLBufferTest is stable now.

We may reconsider the auto-alignment at a later time
in its own bug report.