These are chat archives for elemental/chat

Nov 2016
Jack Poulson
Nov 02 2016 01:08
no worries; in my tests (after ensuring that OMP_NUM_THREADS=1), of the three stages (reduction to bidiagonal form, bidiagonal SVD, and backtransformation), the reduction to bidiagonal form modestly slows down as I increase from two to four MPI processes, while the other two slightly speed up
the reduction to bidiagonal form is bandwidth limited for most of its work (matrix-vector products), so this isn't necessarily unexpected
it's also worth noting that Elemental calls a sequential algorithm when only one MPI process is used (which avoids several copies that are no longer needed)
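as a rough back-of-the-envelope sketch of why matrix-vector products end up bandwidth limited: the snippet below compares flops per byte of memory traffic for a matrix-vector product and a square matrix-matrix product. It assumes double precision (8 bytes per entry) and counts only compulsory traffic (each operand read or written once); the function names are purely illustrative and are not part of Elemental.

```python
def gemv_intensity(m, n, bytes_per_entry=8):
    """Flops per byte for y = A x with an m x n matrix A (illustrative model)."""
    flops = 2 * m * n  # one multiply and one add per matrix entry
    # Read A (m*n entries) and x (n entries), write y (m entries).
    bytes_moved = bytes_per_entry * (m * n + n + m)
    return flops / bytes_moved


def gemm_intensity(n, bytes_per_entry=8):
    """Flops per byte for C = A B with n x n matrices (illustrative model)."""
    flops = 2 * n**3  # n multiply-add pairs for each of the n*n outputs
    # Read A and B, write C: three n x n matrices in total.
    bytes_moved = bytes_per_entry * 3 * n * n
    return flops / bytes_moved


if __name__ == "__main__":
    n = 4000
    print("matrix-vector flops/byte:", gemv_intensity(n, n))
    print("matrix-matrix flops/byte:", gemm_intensity(n))
```

under this model the matrix-vector product stays below 0.25 flops per byte no matter how large the matrix is, while the matrix-matrix product's ratio grows linearly with n, so adding processes that share one memory bus helps the former much less.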
Zoltán Csáti
Nov 02 2016 08:35
I didn't complete the test with 4 processes because I saw the trend, and even Bidiag with one process took a long time; with 4 it would have taken considerably longer.
With 2 processes, it was a bit faster than with 4, but slower than with 1. I set OMP_NUM_THREADS=1 and OPENBLAS_NUM_THREADS=1 before running these tests.
Jack Poulson
Nov 02 2016 14:40
it's worth keeping in mind that intranodal scalability can be quite different from internodal scalability due to memory bandwidth constraints
this is a common discussion for PETSc
Zoltán Csáti
Nov 02 2016 15:14
@poulson OK, thanks for assuring me that it's normal. Do you recommend that I start with Elemental directly or use it through PETSc? I will be starting a C++ project from scratch.
Ryan H. Lewis
Nov 02 2016 16:41
Use Elemental, of course!
And contribute back :)