Bug 1075

Summary: Add support for OpenCL 1.2 and 2.0
Product: [JogAmp] Jocl Reporter: Wade Walker <wwalker3>
Component: opencl Assignee: Wade Walker <wwalker3>
Status: RESOLVED FIXED    
Severity: enhancement CC: gouessej, sgothel, shipman.william
Priority: ---    
Version: 2.4.0   
Hardware: All   
OS: all   
Type: FEATURE SCM Refs:
edd9720fbb570e0fe177cc41d3612084ea8a7b17 ad95421312871694ec8009d9f92089c79b9f8607 215901444d8fd98a14f190e6aa10c17b84ee4b7d 2739f7be10bd7fbd11d4a85d2e7636793f9c815e ba691aeb833671c57f59d1bcfa1ffc842f09d6df 010a5205cc79351816b13ef504127152c44d36e3 e899c162c2dfefd58857b0dcc07ac56db6d6cd0d 2f3c7cd46f0faee9e973ce09fe9135dc62af80a1 050279e6ccd7a3fcc4031730f7e157a3d10e28cb 0135df38d6ccdf17283a26c9c56adb08a0e6d30e 37c656e3290ff855e1752f9b8a4b830da3000b85 4f45fe8db1b5c3790ef2659995c2065807c3fe74 7c7146428a0584ad38f036d24a70c15f700f9a8e 4638f4b3fdf4c946bda0b290a83652e4db00edea 9a90181ed1fb596275fee9ebca0f3d1093722ca9 e56a17d6d7780b8597c78ce50808c8da68d094b5 85dc2b1357470d2a51a69080d33aac442eae291f 2cef8bb275deb1e003c9c4b0e911cdc0195970d7 baef9bd212f6f7c24185254ba1755d3d83c057f9 394af6f8b0c648ff9d8005157dd47cb7968a0b12
Workaround: ---

Description Wade Walker 2014-09-21 19:48:40 CEST
Now that OpenCL 2.0 drivers are getting closer, it's time to start thinking about this. First needs a bit of research to figure out how to support co-existing multiple versions of OpenCL in a way similar to how JOGL does it for OpenGL (with GLProfile). May be complicated by the fact that OpenCL already uses the word "profile" to describe "full vs. embedded", and uses "platform version" to describe "1.1 vs 1.2 vs 2.0".
Comment 1 Sven Gothel 2014-09-21 20:57:41 CEST
(In reply to comment #0)
> First needs a bit of research to figure out how to support
> co-existing multiple versions of OpenCL in a way similar to how JOGL does it
> for OpenGL (with GLProfile). 

I like this strategy very much, ofc!

> May be complicated by the fact that OpenCL
> already uses the word "profile" to describe "full vs. embedded", and uses
> "platform version" to describe "1.1 vs 1.2 vs 2.0".

IMHO we can keep 'embedded' for a later day,
since e.g. OpenCL 1.1 (at least) is already supported on some mobile
devices .. like some ARM chips (working on latest chipsets w/ Android).

CL, CL11, CL12, CL20 interfaces mapping their respective profiles 
and being implemented by one common CLImpl class might be possible
and useful.

I will ofc assist / help as usual.
Comment 2 Wade Walker 2015-04-21 03:23:45 CEST
Making progress on this. I've got JOCL working with the official OpenCL 1.1 headers now (previously the headers were a hacked pre-release version). Now I'm starting on adding the ability to have multiple versions of the headers co-existing.
Comment 3 Wade Walker 2015-09-06 19:36:47 CEST
Got the OpenCL 1.2 and 2.0 headers passing through GlueGen, and compiling the resulting Java/C code successfully. Had to comment out an anonymous union in OpenCL 2.0's cl.h file, since GlueGen doesn't yet support this sort of thing:

typedef struct _cl_image_desc {
    cl_mem_object_type      image_type;
    size_t                  image_width;
    size_t                  image_height;
    size_t                  image_depth;
    size_t                  image_array_size;
    size_t                  image_row_pitch;
    size_t                  image_slice_pitch;
    cl_uint                 num_mip_levels;
    cl_uint                 num_samples;
    union {
      cl_mem                  buffer;
      cl_mem                  mem_object;
    };
} cl_image_desc;

Next step is to figure out at runtime which version of OpenCL is present and load the correct implementation class.
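One way the runtime check could look, as a rough sketch only: the OpenCL specification defines the CL_PLATFORM_VERSION string as "OpenCL <major>.<minor> <platform-specific information>", so the version can be parsed from it and mapped to an implementation class name. The class CLVersionSelector, the method implementationFor, and the sample vendor strings are hypothetical and not part of JOCL; treating anything below 1.2 as the 1.1 baseline is also an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of JOCL: maps a platform version string to an implementation name. */
final class CLVersionSelector {
    // CL_PLATFORM_VERSION has the form "OpenCL <major>.<minor> <platform-specific info>".
    private static final Pattern VERSION =
            Pattern.compile("^OpenCL (\\d+)\\.(\\d+)(?: .*)?$");

    /** Returns the name of the newest implementation class usable with the given version string. */
    static String implementationFor(String platformVersion) {
        Matcher m = VERSION.matcher(platformVersion);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized version string: " + platformVersion);
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        if (major >= 2) {
            return "CLImpl20";
        }
        if (major == 1 && minor >= 2) {
            return "CLImpl12";
        }
        // Assumption: everything older is handled by the 1.1 baseline.
        return "CLImpl11";
    }

    public static void main(String[] args) {
        // Made-up sample strings in the documented format.
        System.out.println(implementationFor("OpenCL 2.0 ExampleVendor build 1"));
        System.out.println(implementationFor("OpenCL 1.2 ExampleVendor"));
        System.out.println(implementationFor("OpenCL 1.1"));
    }
}
```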
Comment 4 Wade Walker 2015-09-20 22:25:28 CEST
I've now got a version working that supports three different OpenCL implementations: CLImpl11, CLImpl12, and CLImpl20. At runtime, you can request the latest implementation your device supports, and it will return one of these types to you as a CL reference. Then you can cast it to the correct implementation type and call new OpenCL 1.2/2.0 methods on it.

The high-level JOCL interface (CLBuffer, CLSampler, etc.) still uses a CLImpl11 object internally. This is because I didn't want to create three separate versions of all the high-level classes. You can still use the high-level interface with CL 1.2/2.0, but you'll have to use the low-level interface to call specific functions that didn't exist in CL 1.1.
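To illustrate the request-then-cast pattern described above, here is a minimal sketch. All types are simplified stand-ins that only share their names with the classes mentioned in this comment; the methods createBuffer and fillBuffer, the factory latest, and the string-based "buffers" are assumptions made for the example, not the real JOCL API.

```java
/** Demo of requesting a CL reference and casting it to reach version-specific methods. */
final class LowLevelDemo {
    /** Hypothetical factory: returns the newest implementation the device supports. */
    static CL latest(boolean deviceSupports12) {
        return deviceSupports12 ? new CLImpl12() : new CLImpl11();
    }

    /** Uses the 1.2-only call when the returned object is a CLImpl12. */
    static String fillIfPossible(CL cl) {
        String buffer = cl.createBuffer(16);
        if (cl instanceof CLImpl12) {
            return ((CLImpl12) cl).fillBuffer(buffer, 0);
        }
        return buffer + " (1.2 fill unavailable)";
    }

    public static void main(String[] args) {
        System.out.println(fillIfPossible(latest(true)));
        System.out.println(fillIfPossible(latest(false)));
    }
}

/** Simplified stand-in for the common low-level interface. */
interface CL {
    String createBuffer(int size);
}

/** Stand-in for the OpenCL 1.1 implementation class. */
class CLImpl11 implements CL {
    @Override
    public String createBuffer(int size) {
        return "buffer[" + size + "]";
    }
}

/** Stand-in for the OpenCL 1.2 implementation class. */
class CLImpl12 implements CL {
    @Override
    public String createBuffer(int size) {
        return "buffer[" + size + "]";
    }

    /** Hypothetical method modelled on a 1.2-only function such as clEnqueueFillBuffer. */
    public String fillBuffer(String buffer, int value) {
        return "filled " + buffer + " with " + value;
    }
}
```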

I haven't yet thought of an elegant way for the JOCL high-level interface to fully utilize CL 1.2/2.0/futureCL without exploding the number of classes and interfaces in JOCL. There are 13 separate interfaces for CLBuffer/CLSampler/etc, so to create 3 separate versions of each is pretty ugly (and will be worse when CL 2.1 and any future versions come out). I'm open to suggestions in this area :)
Comment 5 Sven Gothel 2015-09-22 00:30:36 CEST
(In reply to Wade Walker from comment #3)
> Got the OpenCL 1.2 and 2.0 headers passing through GlueGen, and compiling
> the resulting Java/C code successfully. Had to comment out an anonymous
> union in OpenCL 2.0's cl.h file, since GlueGen doesn't yet support this sort
> of thing:
> 
> typedef struct _cl_image_desc {
..
>     union {
>       cl_mem                  buffer;
>       cl_mem                  mem_object;
>     };
> } cl_image_desc;

We have to add support here ofc.

In the meantime, we could simply edit the structure
and keep only the union member w/ the biggest size!
This allows dropping the union while preserving
the struct's size and layout.

> 
> Next step is to figure out at runtime which version of OpenCL is present and
> load the correct implementation class.

Excellent news!
Comment 6 Sven Gothel 2015-09-22 00:33:35 CEST
(In reply to Wade Walker from comment #4)
> I've now got a version working that supports three different OpenCL
> implementations: CLImpl11, CLImpl12, and CLImpl20. At runtime, you can
> request the latest implementation your device supports, and it will return
> one of these types to you as a CL reference. Then you can cast it to the
> correct implementation type and call new OpenCL 1.2/2.0 methods on it.

Excellent!

Will we have a common interface like JOGL's GL2ES2,
representing the common subset to the user?
This in turn would allow writing CL-profile-agnostic code,
at least partially.

> 
> The high-level JOCL interface (CLBuffer, CLSampler, etc.) still uses a
> CLImpl11 object internally. This is because I didn't want to create three
> separate versions of all the high-level classes. You can still use the
> high-level interface with CL 1.2/2.0, but you'll have to use the low-level
> interface to call specific functions that didn't exist in CL 1.1.
> 
> I haven't yet thought of an elegant way for the JOCL high-level interface to
> fully utilize CL 1.2/2.0/futureCL without exploding the number of classes
> and interfaces in JOCL. There are 13 separate interfaces for
> CLBuffer/CLSampler/etc, so to create 3 separate versions of each is pretty
> ugly (and will be worse when CL 2.1 and any future versions come out). I'm
> open to suggestions in this area :)

Let's think about this later on, after we have figured
out the common CL profile interfaces.
Maybe .. those can help resolve / simplify the issue here.
Comment 7 Sven Gothel 2015-09-22 00:43:44 CEST
[1] Will the implementing classes CLImpl11, CLImpl12, CLImpl20
    be directly exposed to the user,
    or via public interfaces, i.e. CL11, CL12, CL20?
    I prefer the latter, since it will be analogous to [2].

[2] Common CL profiles, like JOGL's GL2ES2,
    implemented by CLImpl11, CLImpl12, CLImpl20.
    For example:
     - CL11_20
     - CL20_21
    This will be analogous to [1].

[3] How to solve/maintain the higher level API
    using the common CL profiles? -> [1] [2]
Comment 8 Wade Walker 2015-09-24 02:35:00 CEST
(In reply to Sven Gothel from comment #7)
> [1] Will the implementing classes CLImpl11, CLImpl12, CLImpl20
>     be directly exposed to the user,
>     or via public interfaces, i.e. CL11, CL12, CL20?
>     I prefer the latter, since it will be analogous to [2].

Let me think about this for a little. The problem is, JOCL is written as a class-based wrapper around OpenCL, so we have this kind of structure, from the parent to child:

- CL*Binding: 11 interfaces, each with a .cfg file; each contains a subset of the CL functions that is selected manually in the .cfg
- CL: interface that inherits all 11 CL*Binding interfaces
- CLGL: interface that adds in GL-related functions
- CLImpl11/CLImpl12/CLImpl20: classes that implement all functions. These are what the user uses for the low-level interface.

Then there are also:

- CL*: classes to match the CL*Binding; each of them contains a CL*Binding object. These are what the end user actually uses for the high-level interface.

So the problem with versioning the interfaces/classes is, which ones do you version? There are three main choices:

- Version only CLImpl. This requires only one new class per OpenCL version, but doesn't let you use new OpenCL in the high-level interface, only in the low-level. This is what I have done up to now. Downside is the high-level interface stays at OpenCL 1.1 level.

- Version CL*Binding/CL/CLGL/CLImpl: This requires 14 new interfaces/classes per OpenCL version, and 14 new *.cfg files. It also requires the insides of the CL* classes to be flexible in what version of the bindings they contain; the classes would have to have OpenCL-version-dependent code inside them. I thought about doing this, but it would have required 42 new interfaces/classes + 42 new config files + touching all the CL* classes to get us to OpenCL 2.1.

- Version everything: This requires ~30 new interfaces/classes per OpenCL version, and gives you an explicitly versioned high-level interface. This is "filemageddon" -- we'd have like 100 new files :)

This is why I'm trying to think of a simpler way :) Ideally, there'd be a way where we create only a few new files per OpenCL version, and where we don't have a lot of hand-written, version-dependent code.
Comment 9 Wade Walker 2015-10-03 20:20:13 CEST
I thought of an idea to try: if I remove the 11 CL*Binding interfaces, this will greatly cut down on the number of interfaces we have to version. On the bad side, it means that high-level objects like CLKernel will internally use a CL interface directly, instead of using a CLKernelBinding interface that limits which CL functions can be used. But on the good side, there will be far fewer GlueGen invocations, and the inheritance structure of the generated interfaces/classes will be much simpler.

I'll try this speculatively and see if it blows up in my face :) If so, I can always revert it. If it works, then I can try creating:

- interfaces: CL11, CL12, CL20
- implementations: CLImpl11, CLImpl12, CLImpl20
Comment 10 Sven Gothel 2015-10-03 21:43:59 CEST
(In reply to Wade Walker from comment #9)
> I thought of an idea to try: if I remove the 11 CL*Binding interfaces, this
> will greatly cut down on the number of interfaces we have to version. On the
> bad side, it means that high-level objects like CLKernel will internally use
> a CL interface directly, instead of using a CLKernelBinding interface that
> limits what functions of CL can be used. But on the good side, the number of
> gluegen invocations will be far less, and the inheritance structure of the
> generated interfaces/classes will be much simpler.
> 
> I'll try this speculatively and see if it blows up in my face :) If so, I
> can always revert it. If it works, then I can try creating:
> 
> - interfaces: CL11, CL12, CL20
> - implementations: CLImpl11, CLImpl12, CLImpl20

Yes, that's what I meant!

You could then even reduce it all to a single CLImpl20
implementing all of them, if CLImpl20 contains all the other subsets!
(Like JOGL's GL4bcImpl)

Note: This is for 2.4.0, i.e. API changes are fine.

So the higher level stuff can all use interfaces
and query the profile in use for optimizations.
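A rough sketch of that idea, under stated assumptions: one class implements every versioned interface, and a high-level wrapper holds only a CL reference and checks for CL12 before taking a faster path. The interface bodies, the int[] "buffers", and the names HighLevelBuffer, fillBuffer, and svmAlloc are hypothetical stand-ins chosen for the example; only the type names CL, CL11, CL12, CL20, CLImpl11, and CLImpl20 come from this thread.

```java
import java.util.Arrays;

/** Hypothetical high-level wrapper that queries the interface in use to pick an optimization. */
final class HighLevelBuffer {
    private final CL cl;
    private final int[] data;

    HighLevelBuffer(CL cl, int size) {
        this.cl = cl;
        this.data = cl.createBuffer(size);
    }

    /** Uses the 1.2 fill when available, otherwise falls back to a host-side loop. */
    String fill(int value) {
        if (cl instanceof CL12) {
            ((CL12) cl).fillBuffer(data, value);
            return "CL12 fill";
        }
        for (int i = 0; i < data.length; i++) {
            data[i] = value;
        }
        return "host fallback";
    }

    int[] data() {
        return data;
    }

    public static void main(String[] args) {
        System.out.println(new HighLevelBuffer(new CLImpl20(), 4).fill(7));
        System.out.println(new HighLevelBuffer(new CLImpl11(), 4).fill(7));
    }
}

// Simplified stand-ins; a "buffer" is modelled as a plain int[].
interface CL { int[] createBuffer(int size); }
interface CL11 extends CL { }
interface CL12 extends CL { void fillBuffer(int[] buffer, int value); }
interface CL20 extends CL { long svmAlloc(int size); }

/** Stand-in for an implementation limited to OpenCL 1.1. */
class CLImpl11 implements CL11 {
    @Override public int[] createBuffer(int size) { return new int[size]; }
}

/** Single class implementing all versioned interfaces, in the spirit of JOGL's GL4bcImpl. */
class CLImpl20 implements CL11, CL12, CL20 {
    @Override public int[] createBuffer(int size) { return new int[size]; }
    @Override public void fillBuffer(int[] buffer, int value) { Arrays.fill(buffer, value); }
    @Override public long svmAlloc(int size) { return size; } // dummy handle for the sketch
}
```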
Comment 11 Wade Walker 2015-10-04 23:05:00 CEST
OK, now I have the following hierarchy working:

CL --> CL11 --> CLImpl11
   \-> CL12 --> CLImpl12
   \-> CL20 --> CLImpl20

Where CL/CL11/CL12/CL20 are interfaces, and CLImpl* are classes. GlueGen is called only 4 times now, once to create CL.java, and once each for the three interface/implementation pairs. Future OpenCL versions should only require the addition of one *.cfg file.
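The shape of that hierarchy, reduced to a minimal sketch: CL is the common parent, CL11/CL12/CL20 are sibling interfaces, and each has one implementing class. The method versionString and the class HierarchyDemo are hypothetical and exist only to show that code written against CL works with any of the three implementations; the generated interfaces themselves are far larger.

```java
/** Demo: code written against the common CL interface is version-agnostic. */
final class HierarchyDemo {
    static String describe(CL cl) {
        return "running on OpenCL " + cl.versionString();
    }

    public static void main(String[] args) {
        CL[] implementations = { new CLImpl11(), new CLImpl12(), new CLImpl20() };
        for (CL cl : implementations) {
            System.out.println(describe(cl));
        }
    }
}

/** Common parent interface (stand-in; versionString is a hypothetical method). */
interface CL { String versionString(); }

// Sibling interfaces, one per OpenCL version, each extending CL.
interface CL11 extends CL { }
interface CL12 extends CL { }
interface CL20 extends CL { }

// One implementing class per versioned interface.
class CLImpl11 implements CL11 {
    @Override public String versionString() { return "1.1"; }
}

class CLImpl12 implements CL12 {
    @Override public String versionString() { return "1.2"; }
}

class CLImpl20 implements CL20 {
    @Override public String versionString() { return "2.0"; }
}
```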

Next step is cleanup, moving the interfaces to different packages, et cetera.
Comment 12 Wade Walker 2015-11-01 21:17:38 CET
I've got a candidate version of these changes ready at https://github.com/WadeWalker/jocl/commits/bug_1075_add_new_opencl_versions. It compiles and passes all tests on Windows 10, Ubuntu 14.04, and OS X El Capitan. Please let me know if anyone has any final inputs before I check it in next weekend.
Comment 13 Wade Walker 2015-11-08 22:05:31 CET
Code merged into master on my GitHub, JogAmp GitHub, and jogamp.org Git.