Bug 959

Summary: CLCommandQueueTest.concurrencyTest() doesn't set the local work size correctly for put1DRangeKernel()
Product: [JogAmp] Jocl
Component: opencl
Status: RESOLVED FIXED
Severity: normal
Priority: ---
Version: 1
Hardware: All
OS: all
Type: ---
Reporter: Wade Walker <wwalker3>
Assignee: Wade Walker <wwalker3>
CC: sgothel
SCM Refs: 0874fa955c0401dba9f54816a9654bb4380abed8
Workaround: ---

Description Wade Walker 2014-02-05 20:46:32 CET
The local work size should be the minimum of the device's general limit and the work-group size supported for the specific kernel being run. Currently it's set to the hardware maximum, which fails on CPU OpenCL devices on Mac.

To fix, replace this:

int groupSize = queue2.getDevice().getMaxWorkItemSizes()[0];

with this:

int maxWorkItemSize = queue2.getDevice().getMaxWorkItemSizes()[0];
int kernelWorkGroupSize = (int)vectorAddKernel2.getWorkGroupSize( queue2.getDevice() );
int groupSize = Math.min( maxWorkItemSize, kernelWorkGroupSize );
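The same clamping can be sketched without any OpenCL dependency, which makes the arithmetic easy to check in isolation. This is only an illustration: WorkSizes, chooseLocalSize and roundUpToMultiple are hypothetical names, not part of JOCL, and the numbers in main are made up. The rounding step is included on the assumption that the group size is then used as the local size of a 1D launch, since OpenCL 1.x requires the global work size to be a multiple of the local work size.

```java
/** Hypothetical helper, not part of JOCL: picks sizes for a 1D kernel launch. */
final class WorkSizes {

    private WorkSizes() {}

    /**
     * Clamps the local work size to both limits.
     * deviceMaxWorkItemSize: per-dimension device limit, e.g. getMaxWorkItemSizes()[0].
     * kernelWorkGroupSize: per-kernel limit, e.g. the result of getWorkGroupSize(device).
     */
    static int chooseLocalSize(int deviceMaxWorkItemSize, long kernelWorkGroupSize) {
        if (deviceMaxWorkItemSize < 1 || kernelWorkGroupSize < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        // Compare as long so a large kernel limit is not truncated before the min.
        return (int) Math.min((long) deviceMaxWorkItemSize, kernelWorkGroupSize);
    }

    /** Rounds elementCount up to the next multiple of localSize (localSize must be positive). */
    static int roundUpToMultiple(int localSize, int elementCount) {
        int remainder = elementCount % localSize;
        return remainder == 0 ? elementCount : elementCount + localSize - remainder;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: the device allows 1024 work items in dimension 0,
        // but this kernel can run with at most 128 work items per group.
        int local = chooseLocalSize(1024, 128L);
        int global = roundUpToMultiple(local, 1000);
        System.out.println("local=" + local + " global=" + global);
    }
}
```

With these example limits the kernel's limit is the smaller one, which is the situation the hardware-max-only code gets wrong.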

I'll do some more testing and submit a patch for this.

Comment 1 Wade Walker 2014-02-08 21:10:50 CET
Fixes are now available here: https://github.com/WadeWalker/jocl/compare/fix_jocl_bugs_959_960_963_964
I still need to test them on Windows and Linux, though, so they're not ready to merge yet.