Prasun Anand
lgsl seems to be at fault here
Nicholas Wilson
ldmd2 -of.dub/build/application-debug-linux.posix-x86_64-ldc_2071-1FDACC5F6D6CDE6055498809E19FDA1C/gemm .dub/build-.../gemm.o 
 -L--no-as-needed -L-L/home/prasun/.dub/packages/mir-glas-0.1.1/mir-glas/ -L-L/home/prasun/.dub/packages/mir-cpuid-0.4.2/mir-cpuid/ -L-L/usr/lib/openblas-base -L-llapacke -L-llapack -L-lblas -L-lgsl -L-lgslcblas -L-lm -L-lopenblas -L-lmir-cpuid -L-lmir-glas -g
As above with line breaks and path contraction.
Can you please verify the presence of the missing symbols in the gist, i.e. check the output of nm path/to/libmir-cpuid.a | grep cpuid_init and the same for cpuid_dCache and cpuid_uCache?
Ilya Yaroshenko
@prasunanand please file an issue at mir-random with your dub.json and gists
Prasun Anand
@thewilsonator Output:
prasun@devUbuntu:~/dev/temp/gemm$ nm ~/.dub/packages/mir-cpuid-0.4.2/mir-cpuid/libmir-cpuid.a | grep cpuid_init
0000000000000000 T cpuid_init
prasun@devUbuntu:~/dev/temp/gemm$ nm ~/.dub/packages/mir-cpuid-0.4.2/mir-cpuid/libmir-cpuid.a | grep cpuid_dCache
0000000000000000 T cpuid_dCache
prasun@devUbuntu:~/dev/temp/gemm$ nm ~/.dub/packages/mir-cpuid-0.4.2/mir-cpuid/libmir-cpuid.a | grep cpuid_uCache
0000000000000000 T cpuid_uCache
Prasun Anand
@9il libmir/mir-random#22
Prasun Anand
-L-lmir-glas should be followed by -L-lmir-cpuid, not preceded by it. I manually built the executable after running dub build --compiler=ldmd2 --parallel --force -v
Ilya Yaroshenko
@prasunanand Please file this bug with DUB: https://github.com/dlang/dub
Prasun Anand
Yeah :)
Sebastian Wilzbach
@prasunanand don't get too frustrated with the D ecosystem. It has its downsides, but it's still moving quite fast and improving :)
Prasun Anand
@wilzbach, D is a lot better than other languages out there. With great performance, there may be some downsides. I am hooked on D for the speed and the syntactic sugar (similar to Ruby) it offers :)
Prasun Anand
@9il: Is the --build=release-nobounds parameter necessary for improved performance of the mir-glas gemm routine?
I am multiplying two rectangular matrices of shape [1217, 8000] and
[8000, 1217] and benchmarked it for OpenBLAS and mir-glas.
For mir-glas
Time taken for gemm =>1 sec, 267 ms, 309 μs, and 3 hnsecs
For OpenBLAS
Time taken for gemm =>522 ms, 456 μs, and 7 hnsecs
Prasun Anand
Currently, I can't compile with --build=release-nobounds because of a dub error.
Ilya Yaroshenko
@prasunanand Please open an issue for mir-glas :-)
Ilya Yaroshenko
@prasunanand GLAS is 2x slower with LLVM 4.0. You probably need to use an LDC based on LLVM 3.9
Ilya Yaroshenko
Mir random v0.2.x was released. Random ndslice generation was added.
import mir.ndslice: slicedField, slice;
import mir.random;
import mir.random.variable: NormalVariable;
import mir.random.algorithm: field;

auto var = NormalVariable!double(0, 1);
auto rng = Random(unpredictableSeed);
auto sample = rng      // passed by reference
    .field(var)        // construct random field from standard normal distribution
    .slicedField(5, 3) // construct random matrix 5 row x 3 col (lazy, without allocation)
    .slice;            // allocates data of random matrix

import std.stdio;
Prasun Anand
Thank you @9il. I will switch to LLVM 3.9 :)
Mathias L. Baumann
I was wondering if there is a way to get the shape of an ndslice type
or of an ndslice variable, but at compile time
basically I want to construct a new ndslice that is a combination of the dimensions of two other ndslices
but I am unable to access N or _lengths or anything that would help me
Ilya Yaroshenko
Hey @Marenz:
  1. If you are using the new ndslice, *._lengths is public and accessible. Please file an issue if it is not. _lengths.length can be used instead of N. *._lengths are mutable. http://docs.algorithm.dlang.io/latest/mir_ndslice_slice.html#.Slice._lengths
  2. *.shape and *.shape.length: http://docs.algorithm.dlang.io/latest/mir_ndslice_slice.html#.Slice.shape
  3. isSlice!T[0] returns the same value as *.shape.length: http://docs.algorithm.dlang.io/latest/mir_ndslice_slice.html#.isSlice
Does it work for you?
Ilya Yaroshenko
Mir Algorithm v0.5.8: Interpolation, Timeseries and 17 new functions http://forum.dlang.org/post/pheyabivuumvqbessaok@forum.dlang.org
Not sure if this is the right place to ask this, but as part of looking at mir.ndslice, I was going to port a simple lattice Boltzmann fluid dynamics simulation for learning purposes, starting with a collision kernel:
which is currently a literal, non-idiomatic port of a C++ example:
Ignoring the non-idiomatic loop syntax and similar details, the D version is over 40x slower (LDC v.1.2.0, release build with -O3 and no bounds checks, compared vs. clang v4.0.0 -O3 on a Haswell CPU), which means I'm doing something horribly wrong. Having gone through the docs (and part of the vision library) and checked that the results are correct, I'm somewhat at a loss.
Does anyone see a glaring error that would lead to this level of performance degradation?
Ilya Yaroshenko
Hello @dextorious
Yes, the C++ code has single indexing for vectors while the D code has double indexing for matrices
You may want to declare the vectors at the beginning of the outer loop
like auto uxv = ux[i];
and operate on these vectors in the inner loop
D (and probably C/C++) cannot vectorise double indexing the way Fortran can
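A minimal sketch of the suggestion above, assuming plain D arrays (double[][]) rather than ndslice, and a summation standing in for the real collision kernel. The function names sumDoubleIndexed and sumRowHoisted are made up for illustration; only the auto uxv = ux[i]; pattern comes from the message. Whether the compiler actually vectorises the second loop depends on the compiler and flags, so this shows the restructuring, not a measured speedup.

```d
import std.stdio : writeln;

/// Double indexing: the inner loop goes through ux[i][j] on every access.
double sumDoubleIndexed(const double[][] ux)
{
    double total = 0;
    foreach (i; 0 .. ux.length)
        foreach (j; 0 .. ux[i].length)
            total += ux[i][j];
    return total;
}

/// Row hoisted out of the inner loop, as suggested above.
double sumRowHoisted(const double[][] ux)
{
    double total = 0;
    foreach (i; 0 .. ux.length)
    {
        auto uxv = ux[i];            // one row lookup per outer iteration
        foreach (j; 0 .. uxv.length)
            total += uxv[j];         // single indexing in the inner loop
    }
    return total;
}

void main()
{
    // Hypothetical 2 x 3 test data, not taken from the simulation.
    auto ux = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]];
    writeln(sumDoubleIndexed(ux));
    writeln(sumRowHoisted(ux));
}
```

Both functions return the same value; the second just gives the optimizer a one-dimensional view to work with in the inner loop.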
Ilya Yaroshenko
In the end, the performance should be the same
Keep us posted. I think it is a good example of porting, and you could write a short blog post afterwards (this would be very helpful for others)
Also, you may want to use https://github.com/libmir/mir-random . It implements the C++ standard RNGs and more
Johan Engelen
@9il It would help a lot if you can extract a minimal example that shows that things are not vectorized/optimized well. There is so much going on in the current example that it's hard to analyze why things don't optimize well. Part of the problem could be that slices are used which don't optimize so well yet (it's a work-in-progress).