Alexey Kuleshevich
@lehins:matrix.org
[m]
I am glad you like it 👍️
Alexey Kuleshevich
@lehins:matrix.org
[m]
Man of Letters: Sorry for a bit of misinformation. I decided to re-check the performance comparison of hmatrix and massiv for matrix multiplication, and massiv is about 2-3 times slower, which is expected due to the lack of SIMD parallelism. However, both hmatrix and massiv do utilize all cores:
benchmarking HMatrix/MxM Double - (500x800 X 800x500)/Par
time                 3.764 ms   (1.154 ms .. 7.469 ms)
                     0.271 R²   (0.191 R² .. 0.665 R²)
mean                 2.365 ms   (1.725 ms .. 4.238 ms)
std dev              2.942 ms   (1.606 ms .. 5.442 ms)
variance introduced by outliers: 98% (severely inflated)

benchmarking HMatrix/MxM Float - (500x800 X 800x500)/Par
time                 3.394 ms   (2.047 ms .. 5.479 ms)
                     0.364 R²   (0.212 R² .. 0.653 R²)
mean                 4.380 ms   (3.424 ms .. 5.179 ms)
std dev              2.355 ms   (2.036 ms .. 2.683 ms)
variance introduced by outliers: 98% (severely inflated)

benchmarking Massiv/MxM P Double - (500x800 X 800x500)/Par
time                 6.874 ms   (6.762 ms .. 7.026 ms)
                     0.997 R²   (0.995 R² .. 0.999 R²)
mean                 7.031 ms   (6.956 ms .. 7.131 ms)
std dev              252.0 μs   (199.2 μs .. 337.1 μs)
variance introduced by outliers: 16% (moderately inflated)

benchmarking Massiv/MxM P Float - (500x800 X 800x500)/Par
time                 6.748 ms   (6.678 ms .. 6.825 ms)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 6.770 ms   (6.730 ms .. 6.828 ms)
std dev              159.1 μs   (121.9 μs .. 227.2 μs)
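
The "2-3 times slower" figure can be checked against the numbers quoted above. The following is a minimal sketch in Python (chosen only for the arithmetic; it is not part of either library), and the names `HMATRIX`, `MASSIV`, `slowdown` and `slowdown_table` are made up for this illustration. It divides the massiv estimates by the hmatrix estimates for both the `time` and the `mean` rows.

```python
# Point estimates in milliseconds, copied from the criterion output above.
# Each entry is (OLS "time" estimate, "mean" estimate).
HMATRIX = {"Double": (3.764, 2.365), "Float": (3.394, 4.380)}
MASSIV = {"Double": (6.874, 7.031), "Float": (6.748, 6.770)}


def slowdown(massiv_ms: float, hmatrix_ms: float) -> float:
    """Return how many times longer massiv took than hmatrix."""
    return massiv_ms / hmatrix_ms


def slowdown_table() -> dict:
    """Compute the massiv/hmatrix ratio per element type and per estimate."""
    table = {}
    for elem in ("Double", "Float"):
        h_time, h_mean = HMATRIX[elem]
        m_time, m_mean = MASSIV[elem]
        table[elem] = {
            "time": slowdown(m_time, h_time),
            "mean": slowdown(m_mean, h_mean),
        }
    return table


if __name__ == "__main__":
    for elem, ratios in slowdown_table().items():
        print(f"{elem}: time x{ratios['time']:.2f}, mean x{ratios['mean']:.2f}")
```

On these figures the ratio is roughly 1.8x to 2.0x by the `time` estimate and roughly 1.5x to 3.0x by the `mean`, so the quoted range depends on which row is read, and the hmatrix rows carry wide intervals.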
Man of Letters
@man_of_letters:mozilla.org
[m]
hmatrix is using many cores? is it compiled with openblas? I thought standard blas/lapack is single core only
Alexey Kuleshevich
@lehins:matrix.org
[m]
Visual depiction:
Man of Letters
@man_of_letters:mozilla.org
[m]
(standard as in the usual Ubuntu package, at least in the old Ubuntu I have)
Alexey Kuleshevich
@lehins:matrix.org
[m]

hmatrix is using many cores?

Yes it does

is it compiled with openblas?

Yes it is.

I thought standard blas/lapack is single core only

It is definitely multicore. It can be turned off of course: https://github.com/NixOS/nixpkgs/blob/b3d6fd4a09265b6777f02ee06ed1763d67a970bb/pkgs/development/libraries/science/math/openblas/default.nix#L13

Man of Letters
@man_of_letters:mozilla.org
[m]
(I mean these: libgsl0-dev, liblapack-dev and libatlas-base-dev)
oh, ok
openblas is not the standard one
Alexey Kuleshevich
@lehins:matrix.org
[m]
I've tried it on Ubuntu, and compiling against openblas did use multiple cores there as well
Man of Letters
@man_of_letters:mozilla.org
[m]
you need a flag to hmatrix to use it, too
are we talking past each other? I meant that libgsl0-dev, liblapack-dev and libatlas-base-dev with no hmatrix flag is "standard"
openblas with the openblas flag is "special" :)
Man of Letters
@man_of_letters:mozilla.org
[m]
AFAIK, openblas is multicore, standard is not
thank you for the data, though
Alexey Kuleshevich
@lehins:matrix.org
[m]
All I can say is my htop is maxed out :D
Man of Letters
@man_of_letters:mozilla.org
[m]
yes
so, without this flag, it would probably be as you described earlier: massiv faster
Alexey Kuleshevich
@lehins:matrix.org
[m]
Oh without that flag hmatrix is doooog slow
Man of Letters
@man_of_letters:mozilla.org
[m]
oh, I didn't know the difference is so huge; much slower than just from lack of multicore? I mean single-core performance much slower, too?
Alexey Kuleshevich
@lehins:matrix.org
[m]
The first time I tried it I could not believe that I am that good
Man of Letters
@man_of_letters:mozilla.org
[m]
haha
Alexey Kuleshevich
@lehins:matrix.org
[m]
Orders of magnitude
Man of Letters
@man_of_letters:mozilla.org
[m]
TIL
Alexey Kuleshevich
@lehins:matrix.org
[m]
I take it back. Compiling without openblas makes hmatrix just a little bit slower. I don't remember how I got the "orders of magnitude" difference, but in my defence it has been a while since I looked at it:
benchmarking HMatrix/MxM Double - (500x800 X 800x500)/Par
time                 9.140 ms   (4.980 ms .. 12.18 ms)
                     0.527 R²   (0.349 R² .. 0.651 R²)
mean                 4.794 ms   (2.775 ms .. 6.977 ms)
std dev              5.082 ms   (3.646 ms .. 5.969 ms)
variance introduced by outliers: 98% (severely inflated)

benchmarking HMatrix/MxM Float - (500x800 X 800x500)/Par
time                 6.678 ms   (4.953 ms .. 8.027 ms)
                     0.743 R²   (0.519 R² .. 0.907 R²)
mean                 6.208 ms   (5.417 ms .. 6.890 ms)
std dev              1.838 ms   (1.391 ms .. 2.718 ms)
variance introduced by outliers: 94% (severely inflated)

benchmarking Massiv/MxM P Double - (500x800 X 800x500)/Par
time                 6.874 ms   (6.559 ms .. 7.147 ms)
                     0.991 R²   (0.987 R² .. 0.995 R²)
mean                 7.485 ms   (7.157 ms .. 8.063 ms)
std dev              1.246 ms   (789.7 μs .. 1.867 ms)
variance introduced by outliers: 79% (severely inflated)

benchmarking Massiv/MxM P Float - (500x800 X 800x500)/Par
time                 6.783 ms   (6.671 ms .. 6.896 ms)
                     0.997 R²   (0.993 R² .. 0.999 R²)
mean                 6.832 ms   (6.766 ms .. 6.933 ms)
std dev              238.9 μs   (168.5 μs .. 365.5 μs)
variance introduced by outliers: 16% (moderately inflated)
Man of Letters
@man_of_letters:mozilla.org
[m]
almost 3 times slower, but you said previously it was on how many cores? 16? and this one is on a single core, I presume? a bit strange...
Alexey Kuleshevich
@lehins:matrix.org
[m]
No, during benchmark it used all cores too.
Man of Letters
@man_of_letters:mozilla.org
[m]
OTOH, "variance introduced by outliers: 98% (severely inflated)" makes it very suspect
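
One rough way to see why the wide intervals matter: check whether the reported ranges for hmatrix with and without openblas overlap at all. This is a minimal sketch in Python, not a statistical test, and `Estimate` and `intervals_overlap` are names invented for the illustration. The numbers are the HMatrix `time` rows from the two runs quoted above.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """A criterion 'time' estimate with its reported interval, in ms."""
    point: float
    low: float
    high: float


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """True when the two reported intervals share at least one value."""
    return max(a.low, b.low) <= min(a.high, b.high)


# HMatrix 'time' rows copied from the two benchmark runs quoted above.
WITH_OPENBLAS = {
    "Double": Estimate(3.764, 1.154, 7.469),
    "Float": Estimate(3.394, 2.047, 5.479),
}
WITHOUT_OPENBLAS = {
    "Double": Estimate(9.140, 4.980, 12.18),
    "Float": Estimate(6.678, 4.953, 8.027),
}

if __name__ == "__main__":
    for elem in ("Double", "Float"):
        fast, slow = WITH_OPENBLAS[elem], WITHOUT_OPENBLAS[elem]
        print(
            f"{elem}: point ratio x{slow.point / fast.point:.2f}, "
            f"intervals overlap: {intervals_overlap(fast, slow)}"
        )
```

For both element types the intervals overlap, so the point-estimate ratio on its own is weak evidence for how much slower the build without openblas really is.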
Alexey Kuleshevich
@lehins:matrix.org
[m]

98% (severely inflated)

that's normal for multi core benchmarks

Man of Letters
@man_of_letters:mozilla.org
[m]
oh, ok
Alexey Kuleshevich
@lehins:matrix.org
[m]
There is always a lot of noise
Man of Letters
@man_of_letters:mozilla.org
[m]
thanks again for the measurements
I think I was confused about multicore when not running with openblas, probably because I'm running with -N1 (actually, even without -threaded) --- that may be why it's single core for me
Alexey Kuleshevich
@lehins:matrix.org
[m]
My pleasure. I just got this computer so it is really fun for me to see how much faster all of the benchmarks have gotten 😀
Man of Letters
@man_of_letters:mozilla.org
[m]
while, presumably, openblas can parallelize even with -N1 (not tested)
Alexey Kuleshevich
@lehins:matrix.org
[m]
As a matter of fact, hmatrix doesn't care about RTS flags like -N since it uses parallelization on the C side
Man of Letters
@man_of_letters:mozilla.org
[m]
well, that's strange then, because I swear my hmatrix doesn't use many cores
(no openblas flag set)
though it probably uses SIMD, etc.
Alexey Kuleshevich
@lehins:matrix.org
[m]
From what I've seen online it is controlled either by an env variable or some runtime setting
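
For an OpenBLAS-linked build specifically, the environment variable is `OPENBLAS_NUM_THREADS`. The chat does not name it, so treat this as an assumption about the setup, and note that other BLAS implementations use different controls. Below is a minimal sketch in Python of a driver that launches a child process with the thread count capped. `blas_env` and `run_with_threads` are hypothetical helper names, and the child here is only a stand-in that echoes the variable; a real benchmark executable would take its place.

```python
import os
import subprocess
import sys


def blas_env(threads: int, base=None) -> dict:
    """Copy an environment and cap the OpenBLAS thread count in the copy."""
    env = dict(os.environ if base is None else base)
    env["OPENBLAS_NUM_THREADS"] = str(threads)
    return env


def run_with_threads(cmd: list, threads: int) -> subprocess.CompletedProcess:
    """Run cmd as a child process that sees the capped thread count."""
    return subprocess.run(
        cmd, env=blas_env(threads), capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Stand-in child that only prints the variable it received.
    # Replace it with the path to an actual benchmark binary.
    probe = [
        sys.executable,
        "-c",
        "import os; print(os.environ['OPENBLAS_NUM_THREADS'])",
    ]
    print(run_with_threads(probe, 1).stdout.strip())
```

The variable is set for the child before it starts, rather than inside the running program, so the library sees it when it is first loaded. This is independent of GHC RTS flags such as -N.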
Man of Letters
@man_of_letters:mozilla.org
[m]
oh, ok, again good to know
Alexey Kuleshevich
@lehins:matrix.org
[m]
Man of Letters
@man_of_letters:mozilla.org
[m]
I have an ancient Ubuntu, so probably the default flags are different
^^^ that link is openblas, though
we're talking about results without openblas now, right?
Alexey Kuleshevich
@lehins:matrix.org
[m]
Yeah, I don't really use any of that stuff aside from benchmarks for massiv. So I am no expert on openblas