Bug 959 - CLCommandQueueTest.concurrencyTest() doesn't set the local work size correctly for put1DRangeKernel()
Alias: None
Product: Jocl
Classification: JogAmp
Component: opencl
Version: 1
Hardware: All all
Importance: --- normal
Assignee: Wade Walker
Depends on:
Reported: 2014-02-05 20:46 CET by Wade Walker
Modified: 2014-02-12 03:02 CET

See Also:
Type: ---
SCM Refs:
Workaround: ---


Description Wade Walker 2014-02-05 20:46:32 CET
The local work size should be the minimum of what the hardware allows in general and what is possible for the specific kernel being run. Currently it's set to the hardware maximum, which fails on CPU OpenCL devices on Mac.

To fix, replace this:

int groupSize = queue2.getDevice().getMaxWorkItemSizes()[0];

with this:

int maxWorkItemSize = queue2.getDevice().getMaxWorkItemSizes()[0];
int kernelWorkGroupSize = (int)vectorAddKernel2.getWorkGroupSize( queue2.getDevice() );
int groupSize = Math.min( maxWorkItemSize, kernelWorkGroupSize );
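
For illustration only, the same selection can be sketched without any OpenCL dependency. The class WorkSizeSketch and both of its methods are hypothetical helpers, not part of JOCL, and the limits passed in main() are made-up values rather than numbers measured on a real device. The sketch also rounds the global work size up to a multiple of the chosen local size, on the assumption that the test enqueues the kernel with an explicit local work size, in which case OpenCL 1.x requires the global size to be evenly divisible by it.

```java
// Sketch only: WorkSizeSketch and its methods are hypothetical, not JOCL API.
final class WorkSizeSketch {

    // Picks a local work size that respects both the device limit for
    // dimension 0 and the limit reported for the specific kernel.
    static int chooseLocalWorkSize(int deviceMaxWorkItemSize, long kernelWorkGroupSize) {
        // Compare as long first so a large kernel limit cannot overflow the int cast.
        return (int) Math.min((long) deviceMaxWorkItemSize, kernelWorkGroupSize);
    }

    // Rounds the global work size up to the next multiple of the local work size.
    static int roundUp(int localWorkSize, int globalWorkSize) {
        int remainder = globalWorkSize % localWorkSize;
        return remainder == 0 ? globalWorkSize : globalWorkSize + localWorkSize - remainder;
    }

    public static void main(String[] args) {
        // Made-up limits: the device allows 1024 work items, the kernel only 128.
        int local = chooseLocalWorkSize(1024, 128L);
        int global = roundUp(local, 1000);
        System.out.println("local=" + local + " global=" + global);
    }
}
```

In the test itself, the two arguments of chooseLocalWorkSize() would come from getMaxWorkItemSizes()[0] and getWorkGroupSize(), as in the replacement lines above.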

I'll do some more testing and submit a patch for this.
Comment 1 Wade Walker 2014-02-08 21:10:50 CET
Fixes are now available at https://github.com/WadeWalker/jocl/compare/fix_jocl_bugs_959_960_963_964. I still need to test them on Windows and Linux, though, so they're not ready to merge yet.