Cannot invoke "com.aparapi.internal.model.ClassModel$AttributePool$CodeEntry.getExceptionPoolEntries()" because the return value of "com.aparapi.internal.model.ClassModel$ClassModelMethod.getCodeEntry()" is null
kernel.getKernelMaxWorkGroupSize(device);
to adjust the kernel's allowed maximum work group size. So there is nothing you can do with the new NVIDIA drivers to run with higher work group sizes. The kernel may run to completion with a previous Aparapi version and the new NVIDIA driver at a work group size of 1024, but the results are not guaranteed to be consistent/correct, so it is not recommended.
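A small sketch of that adjustment in plain Java. Only `kernel.getKernelMaxWorkGroupSize(device)` is the actual Aparapi call discussed above; the `WorkGroupClamp` class, the `clampLocalSize` helper, and the stand-in value `256` are illustrative, not from Aparapi itself:

```java
// Illustrative sketch: clamp a requested local (work group) size to the
// maximum the device reports for a kernel. In Aparapi that maximum comes
// from kernel.getKernelMaxWorkGroupSize(device); the 256 below is only a
// stand-in for what a driver might report.
public class WorkGroupClamp {

    static int clampLocalSize(int requested, int deviceMax) {
        return Math.min(requested, deviceMax);
    }

    public static void main(String[] args) {
        int deviceMax = 256; // stand-in for kernel.getKernelMaxWorkGroupSize(device)
        System.out.println(clampLocalSize(1024, deviceMax)); // prints 256
        System.out.println(clampLocalSize(64, deviceMax));   // prints 64
    }
}
```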
@plusgithub We can't use Java objects directly on the GPU because of the way Java allocates them from non-contiguous memory on the heap; we can only rely on arrays of primitives being laid out appropriately. So alas, no. If your 'objects' are just holding primitives, you can use a trick whereby you represent the objects as parallel arrays of primitives.
So given, say:
class Record {
    int x, y;
    float value;
}
And you had an array of Records to process... You can allocate an array of ints for the (x and y) pairs and a float array for the values:
int[] recordXYs = new int[records.length * 2];
float[] recordValues = new float[records.length];
Then copy data from your records array into your parallel int and float arrays prior to kernel dispatch, and back again after the kernel completes.
Hopefully you have enough work to do in the kernel, to warrant this extra map step.
@BlackHat0001 Sorry I was away. Did you sort this out?
To be able to use Aparapi/OpenCL on your CPU (as well as GPU) you will need a CPU based OpenCL runtime.
Intel has one (https://www.intel.com/content/www/us/en/developer/articles/tool/opencl-drivers.html), I think Apple also has one. NVidia does not, AMD used to, but I don't think they do anymore.
What platform are you on? Mac/Windows/Linux? What processor x64/aarch64?
I can recommend the Intel one (as an ex-AMDer, that hurts me to type ;) ); it maps to AVX vector instructions pretty well.
There is an Open Source project called Pocl, which works OK. So if you really wanted to you could build it. http://portablecl.org/
@BlackHat0001 There are several methods to select the specific device, although I recommend this one:
public static List<OpenCLDevice> listDevices(OpenCLDevice.TYPE type) {
    final ArrayList<OpenCLDevice> results = new ArrayList<>();
    for (final OpenCLPlatform p : OpenCLPlatform.getUncachedOpenCLPlatforms()) {
        for (final OpenCLDevice device : p.getOpenCLDevices()) {
            if (type == null || device.getType() == type) {
                results.add(device);
            }
        }
    }
    return results;
}
and from there (as an example only):
final Range range = Range.create2D(device, sideGlobal, sideGlobal, sideLocal, sideLocal);
kernel.execute(range);