epiovesam
@epiovesam
@CoreRasurae thanks for the info! Understood, and we'll try to find other ways. Regards
Scuffi
@plusgithub
Hey all, just wondering if there's any way to fetch an object based on the global id in a kernel so I can use it? Right now I've tried a few methods but get errors, as only primitive datatypes are supported in arrays. If anyone has any idea whether there would/could be a way to fetch and use objects, that would be great, thanks :)
grfrost
@grfrost

@plusgithub We can't use Java objects directly on the GPU because of the way Java allocates them from non-contiguous memory on the heap. We can only rely on arrays of primitives being laid out appropriately. So alas, no. If your 'objects' just hold primitives, you can use a trick whereby you represent the objects as parallel arrays of primitives.

So given say

class Record{
int x,y;
float value;
}

And you had an array of Records to process.....

You can allocate an array of ints for the (x and y)'s and another array for the float values

int Recordxys[] = new int[records.length*2];
float Recordvalues[] = new float[records.length];

Then copy data from your records array to your parallel int and float array prior to kernel dispatch, then back again after the kernel dispatches.

Hopefully you have enough work to do in the kernel, to warrant this extra map step.
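The map step described above could look roughly like the following. This is a minimal plain-Java sketch, not Aparapi code: the `ParallelRecords` class, its field names, and the interleaved layout (x at index 2*i, y at 2*i+1) are assumptions made for illustration. A kernel would read and write `xys` and `values` between the constructor call and `copyBack`.

```java
// Sketch only: ParallelRecords and its layout are illustrative assumptions.
final class ParallelRecords {
    // Same shape as the Record class in the message above.
    static final class Record {
        int x, y;
        float value;

        Record(int x, int y, float value) {
            this.x = x;
            this.y = y;
            this.value = value;
        }
    }

    final int[] xys;      // x of record i at 2*i, y of record i at 2*i + 1
    final float[] values; // value of record i at i

    // Copy the fields of every record into the primitive arrays (before dispatch).
    ParallelRecords(Record[] records) {
        xys = new int[records.length * 2];
        values = new float[records.length];
        for (int i = 0; i < records.length; i++) {
            xys[2 * i] = records[i].x;
            xys[2 * i + 1] = records[i].y;
            values[i] = records[i].value;
        }
    }

    // Copy the (possibly modified) primitives back into the records (after dispatch).
    void copyBack(Record[] records) {
        for (int i = 0; i < records.length; i++) {
            records[i].x = xys[2 * i];
            records[i].y = xys[2 * i + 1];
            records[i].value = values[i];
        }
    }
}
```

Both copies are linear in the number of records, which is why the kernel needs to do enough work per element to pay for them.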

Dan Marcovecchio
@BlackHat0001
Does aparapi have any plans to support opencl 3.0?
grfrost
@grfrost
@BlackHat0001 When you say support, do you mean support all features of the OpenCL 3.0 kernel language (which is likely never going to happen), or allow one to run Aparapi against an OpenCL 3.0 compatible runtime? Do you have a particular vendor in mind (I assume NVidia/Intel)? As Aparapi maps bytecode to OpenCL kernels, we need to ensure that we create code that works all the way back to OpenCL 1.0 (well, 1.2 probably). We could snoop the device and possibly include patches for specific new features (pipes/lane aware instructions), but that is a lot of work, and leads to testing issues. We can only test on devices we have access to.
Do you have a specific OpenCL 3.0 feature in mind? Just curious.
Dan Marcovecchio
@BlackHat0001
@grfrost Apologies, it appears my problem led me to blame some ridiculous idea about OpenCL compatibility. I do not actually require full OpenCL 3.0 support.
I am, however, having trouble selecting specific devices. I can only seem to use my GPU, and I can't figure out how to multithread over the CPU.
grfrost
@grfrost

@BlackHat0001 Sorry I was away. Did you sort this out?

To be able to use Aparapi/OpenCL on your CPU (as well as GPU) you will need a CPU based OpenCL runtime.

Intel has one (https://www.intel.com/content/www/us/en/developer/articles/tool/opencl-drivers.html), I think Apple also has one. NVidia does not, AMD used to, but I don't think they do anymore.

What platform are you on? Mac/Windows/Linux? What processor x64/aarch64?
I can recommend the Intel ones (as an ex AMDer that hurts me to type ;) ) it maps to AVX vector instructions pretty well.

There is an Open Source project called Pocl, which works OK. So if you really wanted to you could build it. http://portablecl.org/

CoreRasurae
@CoreRasurae
@BlackHat0001 Or do you mean to use CPU multi-threading to share the GPU across multiple GPU jobs? If the latter is the case, then you can do so, yes. Just ensure that you have an independent Kernel instance for each CPU thread and that your GPU has enough memory to accommodate all the kernel calls simultaneously.
CoreRasurae
@CoreRasurae

@BlackHat0001 There are several methods to select the specific device, although I recommend this one:

    public static List<OpenCLDevice> listDevices(OpenCLDevice.TYPE type) {
        final ArrayList<OpenCLDevice> results = new ArrayList<>();

        for (final OpenCLPlatform p : OpenCLPlatform.getUncachedOpenCLPlatforms()) {
            for (final OpenCLDevice device : p.getOpenCLDevices()) {
                if (type == null || device.getType() == type) {
                    results.add(device);
                }
            }
        }

        return results;
    }

and from there (as an example only):

    final Range range = Range.create2D(device, sideGlobal, sideGlobal, sideLocal, sideLocal);
    kernel.execute(range);
By using getUncachedOpenCLPlatforms(), you can safely reuse the device for another kernel if a given kernel execution fails on it. Such a failure would otherwise cause the device to be evicted from Aparapi and no longer be available for use. Although that behavior may be useful in production for well tested kernels, it makes no sense for development in my opinion.
freemo
@freemo:qoto.org
@CoreRasurae: and @grfrost either of you two know any good software engineers with DSP experience who are looking for work?
grfrost
@grfrost
@freemo:qoto.org I know folk working at Google, but probably not looking for new roles. Let me reach out to them just in case.
CoreRasurae
@CoreRasurae
@freemo:qoto.org I have some DSP experience, the problem is that I am still busy with the PhD. I can't recommend others, since at the moment I can't think of anyone I know who has that kind of experience.
Jeffrey Phillips Freeman
@freemo
@grfrost @CoreRasurae thanks, looking to hire on behalf of a company in Israel, good pay, I'll be the one who makes the call on hiring. It's DSP, Python, and some ML and chemistry on the periphery of it all. Let me know if you hear of anyone, can be remote.
Matt Groth
@mgroth0
@grfrost responding to you from many months ago about compiling Aparapi for M1 Macs. I'm very sorry for taking so long to respond, but I would gladly take you up on your offer to compile the library in Xcode under your guidance if that is still a possibility.
The main reason I haven't responded is that I forgot to check Gitter. Do you mind chatting over email? My email is mgroth49@gmail.com
I will install xcode on my computer and if you could email me a set of instructions, I'll get started and let you know how it goes...
Matt Groth
@mgroth0
Btw my level of experience with C++ or xcode is basically zero. But on the other hand I feel very much like an expert with Java and IntelliJ and know my way around a terminal, so hopefully some of that knowledge is transferable!
grfrost
@grfrost
@mgroth0 Actually we may have another option, depending on your urgency. I am waiting for my M1 mac to be delivered. Once I am in possession, I will try to get an Aparapi build. Looking at the delivery dates, sadly this looks like mid June at best.
Freemo
@freemo:qoto.org
@mgroth0: @grfrost is M1 mac not included under our current mac support?
grfrost
@grfrost
M1 is ARM aarch64. I have certainly not built on ARM. I have a machine on order... Planned to build once I get my hands on one.
Maybe there is a docker image that we could use to build?
Freemo
@freemo:qoto.org
@grfrost: ohhh, ARM computers are like $40, shouldn't be hard to get one. The problem is ARM + Mac... hmmm
CoreRasurae
@CoreRasurae
@freemo:qoto.org @grfrost Yes, the M1 Mac is not even fully supported in Linux, it is currently being reverse engineered. It isn't officially supported.
grfrost
@grfrost
@freemo:qoto.org $40 ;) So a Raspberry Pi then.
Matt Groth
@mgroth0
I can wait until June if that's when you expect to have it. Looking forward to it!
Mandeep Singh
@mandeepsingh-private
Please help with this issue Syncleus/aparapi#168
the code is unable to detect the GPU
GPU: Nvidia RTX 3080
OpenCL version: 3.0
Operating System: Pop OS (Ubuntu)
grfrost
@grfrost
@freemo do we have instructions on how to build from maven repos for @mandeepsingh-private . He has built java and C++ code before, and we may need for him to build his own aparapi.....
grfrost
@grfrost

@mandeepsingh-private try cloning and building from this

https://github.com/grfrost/javacltest

See if it builds and runs.

Matt Groth
@mgroth0
Hi how is the M1 Mac release coming along? It would be great to try this library out with my work soon. And please let me know if you'd like me to try building from source. I just will need a little guidance
grfrost
@grfrost
@mgroth0 good timing. My M1 laptop is arriving 'by the weekend', allegedly. As soon as I get it set up I will try an Aparapi build and let you know my findings.
@mgroth0 although the repo I made for @mandeepsingh-private above, may also be interesting for you to try.
You will need to make sure you have Java, C++ dev and OpenCL. The test project at
https://github.com/grfrost/javacltest
Will ensure you have these set up....
grfrost
@grfrost

OK I got my Apple Mac Pro with M1 and have instructions for building/patching aparapi

@freemo:qoto.org let me know if you want me to send you the libaparapi_aarch64.dylib

@mgroth0 pull this repo from git and follow the README.md.

https://github.com/grfrost/aparapi-m1
Should take 10 minutes tops

@freemo the README.md also contains the patch for NativeLoader to load the aarch64 dylib

grfrost
@grfrost

@mandeepsingh-private you might be able to rebuild and test using

https://github.com/grfrost/aparapi-m1

Build and run works for me on Apple M1 and Linux x64 platforms. So hopefully it will find your NVidia CUDA OpenCL lib....

grfrost
@grfrost

@freemo I had a crash on linux. Tracked it to this.

The extra %s in the fprintf was causing grief ;)

diff --git a/src/cpp/JNIExceptions.h b/src/cpp/JNIExceptions.h
index bcb44a0..3ff72f3 100644
--- a/src/cpp/JNIExceptions.h
+++ b/src/cpp/JNIExceptions.h
@@ -51,7 +51,7 @@ public:

    void printError() {
       if(_message != "") {
-         fprintf(stderr, "!!!!!!! %s failed %s\n", message());
+         fprintf(stderr, "!!!!!!! failed %s\n", _message.c_str());
       }
    }

Actually we may want to check the handling of code that uses the pattern return std::string("....").c_str(). This is dangerous, as we are returning a pointer into a temporary object that is destroyed when the function returns.

Matt Groth
@mgroth0
@grfrost thanks so much. I could not build it yet, but here are my results.
First, javac was not found. So I changed the line cmake --build build --target javac --target jar to cmake --build build --target /usr/bin/javac --target jar
This helped. But then I ran into another error:
cmake --build build --target /usr/bin/javac --target jar
[  5%] Building CXX object CMakeFiles/aparapi_aarch64.dir/syncleus/aparapi-native/src/cpp/CLHelper.cpp.o
In file included from /Users/matthewgroth/registered/ide/aparapi-builder-cmake/syncleus/aparapi-native/src/cpp/CLHelper.cpp:55:
In file included from /Users/matthewgroth/registered/ide/aparapi-builder-cmake/syncleus/aparapi-native/src/cpp/CLHelper.h:57:
/Users/matthewgroth/registered/ide/aparapi-builder-cmake/syncleus/aparapi-native/src/cpp/Common.h:77:10: fatal error: 'jni.h' file not found
#include <jni.h>
         ^~~~~~~
1 error generated.
make[3]: *** [CMakeFiles/aparapi_aarch64.dir/syncleus/aparapi-native/src/cpp/CLHelper.cpp.o] Error 1
make[2]: *** [CMakeFiles/aparapi_aarch64.dir/all] Error 2
make[1]: *** [CMakeFiles/jar.dir/rule] Error 2
make: *** [jar] Error 2
grfrost
@grfrost
cmake --build build --target javac --target jar

is correct. You may need to edit the CMakeLists.txt to point JAVA_HOME (in the CMakeLists.txt) to your JDK.

We need JAVA_HOME to point to a JDK so we can find javac, java and jar. Also, ${JAVA_HOME}/include is where we find jni.h, hence the error.

Once you change JAVA_HOME (or anything in CMakeLists.txt)

cmake build 
cmake --build build --target javac --target jar

The first line rescans the CMakeLists.txt to create the make/ninja files.
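Before editing CMakeLists.txt, it can help to confirm that a candidate JAVA_HOME really is a JDK with the two things the build needs: a javac launcher under bin and jni.h under include. The following is a small plain-Java sketch of that check; the `JdkCheck` class and its method names are hypothetical and not part of Aparapi or the aparapi-m1 repo.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: JdkCheck is a hypothetical helper, not part of any Aparapi build.
final class JdkCheck {
    // True if <javaHome>/include/jni.h exists, which the native build includes.
    static boolean hasJniHeader(Path javaHome) {
        return Files.isRegularFile(javaHome.resolve("include").resolve("jni.h"));
    }

    // True if <javaHome>/bin holds a javac launcher (javac, or javac.exe on Windows).
    static boolean hasJavac(Path javaHome) {
        Path bin = javaHome.resolve("bin");
        return Files.isRegularFile(bin.resolve("javac"))
            || Files.isRegularFile(bin.resolve("javac.exe"));
    }

    // True if the directory has both pieces the build looks for.
    static boolean looksLikeJdk(Path javaHome) {
        return hasJniHeader(javaHome) && hasJavac(javaHome);
    }
}
```

Calling `JdkCheck.looksLikeJdk(Path.of(System.getenv("JAVA_HOME")))` would be one way to use it, assuming the environment variable is set. A directory that holds only a JRE would fail the jni.h check, which matches the 'jni.h' file not found error shown earlier.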

nibblyn
@nibblyn:matrix.org
I have made a simple app for fractal music.
https://github.com/betaiotazeta/FractalMusicGenerator
It is packaged with jpackage and works well for me with Nvidia 1080. Some users are complaining that the app launches, then crashes with the following message: "child process exited with code 1" when running on AMD 6900XT or Nvidia 3080. They can run the app in a virtual machine. My understanding is that Aparapi switches to JTP automatically when something is wrong with the GPU, but for those high-end cards it doesn't seem to happen and an error occurs instead. I can't debug this issue because I don't have the hardware. How should I deal with this, by testing for something when the app starts? Should I always start in JTP and then give the option to switch to GPU? Thanks for your efforts with Aparapi, have a nice day!
CoreRasurae
@CoreRasurae
@freemo @grfrost Guys, I've found a couple of critical issues in Aparapi. In the new version 3.0.0, the mapped methods and NoCL methods wrongly trigger an assert condition in the Entrypoint.getCallTarget(...) method. Secondly, and for much longer, private memory space arrays have been completely broken and unusable. I'll try to provide fixes for them in the next few days, if I can. I would also like to add support for mad24, the integer counterpart of mad, as a mapped method, provided there are no objections from either of you. For me it is quite useful when computing indices of 2D arrays mapped as a single array. The 24-bit limitation can also be enforced on the Java side through a bit mask. What do you say?
This is an urgent fix that I am trying to bring in, as my work depends on it. Besides that, I don't have a minute of free time... working almost round the clock to get a piece of software ready.
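To make the mad24 idea concrete, here is a hedged plain-Java sketch of what a Java-side fallback with the 24-bit restriction might look like. It is not Aparapi's implementation: the `Mad24` class, `to24`, and `index2D` are names invented for this example, and it uses a shift pair to sign-extend the low 24 bits rather than a plain mask (for non-negative values, `v & 0xFFFFFF` gives the same result).

```java
// Sketch only: an assumed Java-side emulation, not Aparapi's mapped method.
final class Mad24 {
    // Keep the low 24 bits of v and sign-extend them back to a 32-bit int.
    static int to24(int v) {
        return (v << 8) >> 8;
    }

    // Multiply the 24-bit views of a and b, then add the 32-bit value c.
    static int mad24(int a, int b, int c) {
        return to24(a) * to24(b) + c;
    }

    // Row-major index of (x, y) in a 2D array of the given width stored as 1D.
    static int index2D(int x, int y, int width) {
        return mad24(y, width, x);
    }
}
```

For example, `Mad24.index2D(5, 3, 640)` computes 3 * 640 + 5. Any bits above bit 23 in the two multiplied operands are discarded, so operands outside the signed 24-bit range give different results from a plain `y * width + x`.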
CoreRasurae
@CoreRasurae
boolean fields are also broken...
CoreRasurae
@CoreRasurae
I will need to publish this new software sometime in September/October, and I would like it to ship with a released version of Aparapi.
Private memory arrays cannot be directly initialized from the host, and currently Aparapi sets up Kernel arguments for those, which is completely wrong. In the MRs that I will provide they will be filtered out. Plus Aparapi is currently marking private memory arrays as belonging to the constant memory space. All completely wrong. I am testing fixes for that at this moment.
grfrost
@grfrost
@CoreRasurae sounds like some good fixes. I certainly have no objections.
I am interested in how to implement the initialization of private memory. Maybe you plan on generating init code in the generated OpenCL?
Would the mad24 code fall back to Java OK?
I have a private copy of the Aparapi native code which compiles sans warnings on modern C++/Clang/Mac + Linux. It is in a set of patches applied to mainline which I have not got around to making pull requests for. If you want them before you start, let me know.
CoreRasurae
@CoreRasurae
@grfrost Yes, the private memory arrays can only be initialized in the OpenCL code itself, however those initializations are naturally translated from Java code by Aparapi. I have implemented and tested kernels running with private memory arrays and they didn't show any issues. Yes, mad24 does have a Java implementation. I won't touch Aparapi-native for these changes, my MRs only touch the Java side.