Matt Groth
@mgroth0
I try to stay on the java side, which is why this lib is so appealing!
CoreRasurae
@CoreRasurae
@mgroth0 I've heard about Rosetta2 which can deal with x86_64 binaries and translate them to arm64
@mgroth0 maybe you can try an x86_64 JVM with Aparapi
Jeffrey Phillips Freeman
@freemo
@CoreRasurae am now (in Egypt so limited connectivity)
CoreRasurae
@CoreRasurae
Hi
@freemo long time no see
Jeffrey Phillips Freeman
@freemo
@CoreRasurae been in Egypt for 3 months
KrystilizeNevaDies
@KrystilizeNevaDies
Heya, I am getting an NPE at MethodModel#1633:
Cannot invoke "com.aparapi.internal.model.ClassModel$AttributePool$CodeEntry.getExceptionPoolEntries()" because the return value of "com.aparapi.internal.model.ClassModel$ClassModelMethod.getCodeEntry()" is null
Any ideas why/how to fix?
I can give a full stacktrace or my code if that helps
Using 3.0.1-SNAPSHOT
Same thing on 3.0.0
grfrost
@grfrost
Does your kernel have a non-abstract 'run' method?
@KrystilizeNevaDies the exception above seems to imply that there is no bytecode associated with the 'run' method. This can happen if you don't have a run method on your kernel, or if you have an abstract run method (expecting a subclass to implement it).
grfrost
@grfrost
@mgroth0 as @CoreRasurae suggested you may have to be the M1 'leader' here ;) Unless you want to send one of us a shiny new laptop ;) Do you have xcode on your mac? If so I can probably walk you through the steps to build, but sadly it would require some basic knowledge of C++ compilation.
@mgroth0 so I build on Mac using cmake (not using maven - have I mentioned today how much I hate maven ;) ). So if you can set up xcode + clang + cmake then I could talk you through the build process.
epiovesam
@epiovesam
Hello! I'm using Ubuntu 20.04 + latest Aparapi (3.0.0) + Oracle JDK 1.8 + NVIDIA RTX 2060 (Driver 470), and Aparapi says that I'm running an unsupported OpenCL 3.0 version... I've installed OpenCL 2.1 on Ubuntu with success, but Aparapi still says that I'm running OpenCL 3.0. I guess it's picking up the NVIDIA OpenCL driver and not the OpenCL 2.1 version which I've installed - is there a way to force Aparapi to use a specific libopencl.so file? Or any other clue? Thanks in advance
CoreRasurae
@CoreRasurae
@epiovesam There is no need to go back to an older OpenCL version; you can keep the latest drivers, even with OpenCL 3.0. We have marked it as unsupported because we haven't done any special validation, but it seems to work. So I would suggest that you try it and only revert to a prior version if any issue arises. A special note must be made regarding the new Aparapi API for dealing with the workgroup size: OpenCL reports a default maximum workgroup size, but the actual maximum workgroup size allowed may be smaller, depending on the actual compiled kernel and the device's OpenCL driver. To find the actual workgroup size allowed for a given kernel, please use kernel.getKernelMaxWorkGroupSize(device);
epiovesam
@epiovesam
@CoreRasurae thank you very much for the clarifications, we'll keep and try with OCL 3.0. Regards
epiovesam
@epiovesam
Hi again. I'm having some issues but, first of all, I want to make sure that I'm not misunderstanding some Aparapi concepts.
Some time ago I was using OpenCL 1.2 + Ubuntu 16 + Aparapi 2.0 + an old Nvidia GTX GPU, and I was able to call Kernel.execute with a global range in the millions and 268 passes, like Kernel.execute(16000000, 268).
But today, with OpenCL 3.0 + Ubuntu 20 + Aparapi 3.0 + Nvidia RTX 2060, it says: "!!!!!!! Kernel overall local size: 1024 exceeds maximum kernel allowed local size of: 256 failed". If I follow the getKernelMaxWorkGroupSize(device) info = 256 and pass a range of only 256, it works... but then it is limited to 256 work items per "pass" and I am obliged to iterate (16000000/256) calls, which delays the processing a lot compared to the OCL 1.2 + Aparapi 2.0 setup...
Could you please help me understand what I'm doing wrong in this new config (or if I was doing something wrong in the old config)? Thanks again
CoreRasurae
@CoreRasurae
@epiovesam There is no problem with the old or the new config. The reality is that NVIDIA changed the behavior of their drivers, and it has nothing to do with OpenCL 3.0. Due to that NVIDIA driver behavior change, we had to make Aparapi 3.0.0, which also required a small API change. So what you need is to use the call kernel.getKernelMaxWorkGroupSize(device); to adjust the kernel's allowed max work group size. There is nothing you can do to make the new NVIDIA drivers work with higher workgroup sizes. The kernel may run to the end with a previous Aparapi version and the new NVIDIA driver while using a 1024 workgroup size, but the results are not guaranteed to be consistent/correct, so it is not recommended.
You may have to find other ways of optimizing the kernel execution time.
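A minimal sketch of what that adjustment can look like (plain Java; the class and method names are illustrative, the 256 limit is the value reported in this thread, and the Aparapi wiring is shown only in comments). The key point is that only the local (workgroup) size has to respect the queried limit; the global range can stay in the millions:

```java
public class WorkGroupFit {
    // OpenCL requires the global size to be a multiple of the local size,
    // so round the requested global size up to the nearest multiple.
    static int roundUpToMultiple(int globalSize, int localSize) {
        return ((globalSize + localSize - 1) / localSize) * localSize;
    }

    public static void main(String[] args) {
        int requested = 16_000_000; // desired global range
        int maxLocal = 256;         // e.g. from kernel.getKernelMaxWorkGroupSize(device)
        int global = roundUpToMultiple(requested, maxLocal);
        System.out.println(global + " work items, local size " + maxLocal);
        // Hypothetical Aparapi wiring (requires the aparapi jar):
        // Range range = Range.create(device, global, maxLocal);
        // kernel.execute(range, 268); // global range stays at ~16M, not 256
    }
}
```

So there should be no need to chop the work into (16000000/256) separate 256-item calls; one execute with an explicit 256 local size should do.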
Freemo
@freemo:qoto.org
[m]
Hi, I'm back online again if anyone needs anything
epiovesam
@epiovesam
@CoreRasurae thanks for the info! Understood, and we'll try to find other ways. Regards
Scuffi
@plusgithub
Hey all, just wondering if there's any way to fetch an object based on the global id in a kernel so I can use it? Right now I've tried a few methods but get errors, as only primitive datatypes are supported in arrays. If anyone has any idea whether there would/could be a way to fetch and use objects, that would be great, thanks :)
grfrost
@grfrost

@plusgithub We can't use Java objects directly on the GPU due to the way that Java allocates them from non-contiguous memory on the heap. We can only rely on arrays of primitives being laid out appropriately. So alas, no. But if your 'objects' are just holding primitives, you can use a trick whereby you represent the objects as parallel arrays of primitives.

So given say

class Record {
    int x, y;
    float value;
}

And you had an array of Records to process.....

You can allocate an array of ints for the (x and y)'s and another array for the float values

int recordXYs[] = new int[records.length * 2];
float recordValues[] = new float[records.length];

Then copy data from your records array to your parallel int and float array prior to kernel dispatch, then back again after the kernel dispatches.

Hopefully you have enough work to do in the kernel, to warrant this extra map step.
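A small, self-contained sketch of that pack/dispatch/unpack pattern (plain Java; the kernel dispatch itself is elided, and all names are illustrative):

```java
import java.util.Arrays;

public class ParallelArrays {
    static class Record {
        int x, y;
        float value;
        Record(int x, int y, float value) { this.x = x; this.y = y; this.value = value; }
    }

    // Pack the (x, y) pairs of all records into one interleaved int array.
    static int[] packXYs(Record[] records) {
        int[] xys = new int[records.length * 2];
        for (int i = 0; i < records.length; i++) {
            xys[i * 2] = records[i].x;
            xys[i * 2 + 1] = records[i].y;
        }
        return xys;
    }

    // Pack the float values of all records into one float array.
    static float[] packValues(Record[] records) {
        float[] values = new float[records.length];
        for (int i = 0; i < records.length; i++) {
            values[i] = records[i].value;
        }
        return values;
    }

    // Copy results back into the objects after the kernel has run.
    static void unpack(Record[] records, int[] xys, float[] values) {
        for (int i = 0; i < records.length; i++) {
            records[i].x = xys[i * 2];
            records[i].y = xys[i * 2 + 1];
            records[i].value = values[i];
        }
    }

    public static void main(String[] args) {
        Record[] records = { new Record(1, 2, 0.5f), new Record(3, 4, 1.5f) };
        int[] recordXYs = packXYs(records);
        float[] recordValues = packValues(records);
        // ... kernel.execute(records.length) would read/write these arrays here ...
        unpack(records, recordXYs, recordValues);
        System.out.println(Arrays.toString(recordXYs) + " " + Arrays.toString(recordValues));
        // prints "[1, 2, 3, 4] [0.5, 1.5]"
    }
}
```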

Dan Marcovecchio
@BlackHat0001
Does aparapi have any plans to support opencl 3.0?
grfrost
@grfrost
@BlackHat0001 When you say support, do you mean support all features of the OpenCL 3.0 kernel language (which is likely never going to happen), or allow one to run Aparapi against an OpenCL 3.0 compatible runtime? Do you have a particular vendor in mind (I assume NVidia/Intel)? As Aparapi maps bytecode to OpenCL kernels, we need to ensure that we create code that works all the way back to OpenCL 1.0 (well, 1.2 probably). We could snoop the device and possibly include patches for specific new features (pipes/lane-aware instructions), but that is a lot of work and leads to testing issues. We can only test on devices we have access to.
Do you have a specific OpenCL 3.0 feature in mind? Just curious.
Dan Marcovecchio
@BlackHat0001
@grfrost Apologies, it appears my problem led me to blaming some ridiculous idea about OpenCL compatibility. I do not actually require full OpenCL 3.0 support.
Although I am having trouble selecting specific devices. I can only seem to use my GPU and I can't figure out how to multithread over the CPU.
grfrost
@grfrost

@BlackHat0001 Sorry I was away. Did you sort this out?

To be able to use Aparapi/OpenCL on your CPU (as well as GPU) you will need a CPU based OpenCL runtime.

Intel has one (https://www.intel.com/content/www/us/en/developer/articles/tool/opencl-drivers.html), I think Apple also has one. NVidia does not, AMD used to, but I don't think they do anymore.

What platform are you on? Mac/Windows/Linux? What processor x64/aarch64?
I can recommend the Intel ones (as an ex AMDer that hurts me to type ;) ) it maps to AVX vector instructions pretty well.

There is an Open Source project called Pocl, which works OK. So if you really wanted to you could build it. http://portablecl.org/

CoreRasurae
@CoreRasurae
@BlackHat0001 Or do you mean to use CPU multi-threading to share the GPU across multiple GPU jobs? If the latter is the case, then you can do so, yes. Just ensure that you have an independent Kernel instance for each CPU thread and that your GPU has enough memory to accommodate all the kernel calls simultaneously.
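That one-instance-per-thread pattern looks roughly like this (plain Java sketch; DummyKernel is a stand-in for a real com.aparapi.Kernel subclass, and the counter exists only to make the example observable):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadKernels {
    static final AtomicInteger executions = new AtomicInteger();

    // Stand-in for a real com.aparapi.Kernel subclass.
    static class DummyKernel {
        void execute(int range) {
            // The GPU dispatch would happen here.
            executions.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            threads.add(new Thread(() -> {
                // Each CPU thread gets its OWN kernel instance;
                // a Kernel instance must not be shared across threads.
                DummyKernel kernel = new DummyKernel();
                kernel.execute(1024);
            }));
        }
        for (Thread t : threads) t.start();
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        System.out.println("all " + executions.get() + " jobs dispatched");
    }
}
```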
CoreRasurae
@CoreRasurae

@BlackHat0001 There are several methods to select the specific device, although I recommend this one:

    public static List<OpenCLDevice> listDevices(OpenCLDevice.TYPE type) {
        final ArrayList<OpenCLDevice> results = new ArrayList<>();

        for (final OpenCLPlatform p : OpenCLPlatform.getUncachedOpenCLPlatforms()) {
            for (final OpenCLDevice device : p.getOpenCLDevices()) {
                if (type == null || device.getType() == type) {
                    results.add(device);
                }
            }
        }

        return results;
    }

and from there (as an example only):

    final Range range = Range.create2D(device, sideGlobal, sideGlobal, sideLocal, sideLocal);
    kernel.execute(range);
By using getUncachedOpenCLPlatforms(), you are safe to reuse the device for another kernel in case a given kernel execution fails on the device; such a failure would otherwise cause the device to be evicted from Aparapi and no longer be available for use. Although that eviction behavior may be useful in production for well-tested kernels, it makes no sense for development in my opinion.
freemo
@freemo:qoto.org
[m]
@CoreRasurae: and @grfrost either of you two know any good software engineers with DSP experience who are looking for work?
grfrost
@grfrost
@freemo:qoto.org I know folk working at Google, but probably not looking for new roles. Let me reach out to them just in case.
CoreRasurae
@CoreRasurae
@freemo:qoto.org I have some DSP experience; the problem is that I am still busy with the PhD. I can't recommend others, since right now I can't think of anyone I know with that experience.
Jeffrey Phillips Freeman
@freemo
@grfrost @CoreRasurae thanks, I'm looking to hire on behalf of a company in Israel, good pay, and I'll be the one who makes the call on hiring. It's DSP, Python, and some ML and chemistry on the periphery of it all. Let me know if you hear of anyone; it can be remote.
Matt Groth
@mgroth0
@grfrost responding to you from many months ago about compiling Aparapi for M1 Macs. I'm very sorry for taking so long to respond, but I would gladly take you up on your offer to compile the library in xcode under your guidance if that is still a possibility.
The main reason I haven't responded is that I forgot to check Gitter. Do you mind chatting over email? My email is mgroth49@gmail.com
I will install xcode on my computer and if you could email me a set of instructions, I'll get started and let you know how it goes...
Matt Groth
@mgroth0
Btw my level of experience with C++ or xcode is basically zero. But on the other hand I feel very much like an expert with Java and IntelliJ and know my way around a terminal, so hopefully some of that knowledge is transferable!
grfrost
@grfrost
@mgroth0 Actually we may have another option, depending on your urgency. I am waiting for my M1 mac to be delivered. Once I am in possession, I will try to get an Aparapi build. Looking at the delivery dates, sadly this looks like mid June at best.
Freemo
@freemo:qoto.org
[m]
@mgroth0: @grfrost is M1 mac not included under our current mac support?
grfrost
@grfrost
M1 is ARM aarch64. I have certainly not built on ARM. I have a machine on order... Planned to build once I get my hands on one.
Maybe there is a docker image that we could use to build?
Freemo
@freemo:qoto.org
[m]
@grfrost: ohhh, arm computers are like $40, shouldn't be hard to get one. The problem is arm + mac... hmmm
CoreRasurae
@CoreRasurae
@freemo:qoto.org @grfrost Yes, the M1 Mac is not even fully supported in Linux; it is currently being reverse engineered. It isn't officially supported.
grfrost
@grfrost
@freemo:qoto.org $40 ;) So a Raspberry Pi then.
Matt Groth
@mgroth0
I can wait until June if that's when you expect to have it. Looking forward to it!