Bug 959 - CLCommandQueueTest.concurrencyTest() doesn't set the local work size correctly for put1DRangeKernel()
Status: RESOLVED FIXED
Alias: None
Product: Jocl
Classification: JogAmp
Component: opencl
Version: 1
Hardware: All all
Importance: --- normal
Assignee: Wade Walker
Reported: 2014-02-05 20:46 CET by Wade Walker
Modified: 2014-02-12 03:02 CET
Type: ---
SCM Refs: 0874fa955c0401dba9f54816a9654bb4380abed8
Workaround: ---


Description Wade Walker 2014-02-05 20:46:32 CET
The local work size should be the minimum of two limits: what the hardware allows in general, and what is possible for the specific kernel being run. Currently it's set to the hardware maximum alone, which fails on CPU OpenCL devices on Mac.

To fix, replace this:

int groupSize = queue2.getDevice().getMaxWorkItemSizes()[0];

with this:

int maxWorkItemSize = queue2.getDevice().getMaxWorkItemSizes()[0];
int kernelWorkGroupSize = (int)vectorAddKernel2.getWorkGroupSize( queue2.getDevice() );
int groupSize = Math.min( maxWorkItemSize, kernelWorkGroupSize );

I'll do some more testing and submit a patch for this.
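
The selection logic in the fix above can be tried without an OpenCL runtime. The following is a minimal sketch, not the actual patch: the class LocalWorkSizeSketch and the method chooseGroupSize are hypothetical names, and the numbers in main are illustrative values rather than measurements from a real device.

```java
// Sketch only: the class and method names are hypothetical, not JOCL API.
final class LocalWorkSizeSketch {

    // Returns the smaller of the device-wide limit and the per-kernel limit.
    // The kernel limit is modelled as a long, on the assumption that this is
    // why the fix casts the result of getWorkGroupSize() to int.
    static int chooseGroupSize(int maxWorkItemSize, long kernelWorkGroupSize) {
        return (int) Math.min((long) maxWorkItemSize, kernelWorkGroupSize);
    }

    public static void main(String[] args) {
        // Illustrative values, not measurements from a real device.
        System.out.println(chooseGroupSize(1024, 1L));    // kernel limit is smaller
        System.out.println(chooseGroupSize(1024, 256L));  // kernel limit is smaller
        System.out.println(chooseGroupSize(256, 1024L));  // device limit is smaller
    }
}
```

With the first pair of values, using the device limit alone would request a work group of 1024 items for a kernel that can only run with 1, which is the kind of mismatch described above.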
Comment 1 Wade Walker 2014-02-08 21:10:50 CET
Fixes are available now in https://github.com/WadeWalker/jocl/compare/fix_jocl_bugs_959_960_963_964. I still need to test them on Windows and Linux though, so they're not ready to merge yet.