These are chat archives for elemental/chat
El::DistMultiVec<T> is a quasi-deprecated, but not yet replaced, class that has a data distribution roughly equivalent to a single 1D row-wise distribution of El::DistMatrix. But the point of El::DistMatrix is that it supports a wide variety of different data distributions, and some routines are most effective in a particular one (but it would be inconvenient for users if the interface only supported a particular distribution). I don't think it's El::DistMatrixReadProxy you're looking for, since El::DistMultiVec only supports one distribution. El::DistMultiVec<T> is a holdover from some work from several years ago, and the long-term plan is to reimplement/generalize El::DistSparseMatrix<T> to support as many different 2D distributions as El::DistMatrix and to delete El::DistSparseMatrix's legacy dependencies on El::DistMultiVec in favor of switching over to El::DistMatrix in the process.
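A minimal sketch of that contrast, assuming an initialized El::Environment (the dimensions are placeholders):

```cpp
#include <El.hpp>

void DistributionContrast( El::Int n )
{
    // El::DistMatrix carries its distribution in template parameters, so the
    // same data can be moved between distributions by construction/assignment.
    El::DistMatrix<double> A( n, n );                    // default [MC,MR] distribution
    El::DistMatrix<double,El::STAR,El::STAR> AFull( A ); // redistribute: full copy on every process

    // El::DistMultiVec has exactly one built-in (1D, row-wise) distribution;
    // there is no distribution pair to vary.
    El::DistMultiVec<double> x( n, 1 );
}
```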
Thanks @poulson, that was helpful indeed.
To elaborate on the particular functionality I am looking for: the problem I am dealing with is an El::DistSparseMatrix solve, which I am content with performing using El::LinearSolve. However, the solution vector must ideally be available to every processor in its entirety (these values go into an array in a non-contiguous fashion).
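For concreteness, a minimal sketch of the setup being described, with a 1D Laplacian standing in for the actual matrix (all values are placeholders):

```cpp
#include <El.hpp>

// Build a distributed sparse matrix, solve, and return the solution, whose
// entries are still spread across processes in DistMultiVec's distribution.
El::DistMultiVec<double> SolveSketch( El::Int n )
{
    El::DistSparseMatrix<double> A( n, n );
    A.Reserve( 3*A.LocalHeight() );
    for( El::Int iLoc=0; iLoc<A.LocalHeight(); ++iLoc )
    {
        const El::Int i = A.GlobalRow( iLoc );
        if( i > 0 )   A.QueueLocalUpdate( iLoc, i-1, -1. );
        A.QueueLocalUpdate( iLoc, i, 2. );
        if( i+1 < n ) A.QueueLocalUpdate( iLoc, i+1, -1. );
    }
    A.ProcessQueues();

    El::DistMultiVec<double> B( n, 1 );
    El::Fill( B.Matrix(), 1. ); // all-ones right-hand side (local portion)
    El::LinearSolve( A, B );    // B is overwritten with the solution
    return B;
}
```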
I am aiming for scalability, and commands like GetLocal, as mentioned by @rhl-, are causing just the distribution of the solution vector to be far slower than the matrix solve itself. I would appreciate an appropriate suggestion for:
1) The sparse matrix distribution
2) The method of access of the solution vector: using GetLocal or an appropriate alternative
(The documentation mentions the presence of a more generic ReadProxy in Sec. 3.13.1, which doesn't seem to be supported.)
Also, @rhl-, I am not quite sure what you meant by using QueueUpdate, as the update here needs to be done on an array and not on an Elemental object.
More than that, there seem to be way too many parameters for me to play with:
1) Number of processors (ideally, this should be the only independent quantity)
2) Grid dimensions (which in turn decide the number of processors in each grid block)
3) Block size
DistSparseMatrix might prefer one kind of breakdown as opposed to DistMultiVec. It seems like a fine balance of parameters, which don't necessarily lead to the optimal set for each part of the program, viz.,
a) matrix creation
b) linear solve
c) redistribution of the solution
but cumulatively lead to the most efficient performance.
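For what it's worth, a hedged sketch of where those three knobs are set in Elemental (the particular values are placeholders, not tuning advice):

```cpp
#include <El.hpp>
#include <cmath>

void ConfigureKnobs()
{
    // 1) Number of processes: fixed by the MPI launch (e.g. mpirun -np ...).
    const int commSize = El::mpi::Size( El::mpi::COMM_WORLD );

    // 2) Grid dimensions: pick a near-square process grid height; the
    //    width is then commSize/gridHeight.
    int gridHeight = static_cast<int>( std::sqrt( double(commSize) ) );
    while( commSize % gridHeight != 0 )
        --gridHeight;
    El::Grid grid( El::mpi::COMM_WORLD, gridHeight );

    // 3) Algorithmic block size used by El::DistMatrix-based routines.
    El::SetBlocksize( 96 );
}
```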
Use the El::DistMultiVec<T>::ProcessPullQueue routines rather than individually calling El::DistMatrix<T>::Get, which generally involves a broadcast.
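A hedged sketch of that pull-queue approach for gathering every entry of the solution onto every process in one round of communication; the ReservePulls/QueuePull/ProcessPullQueue signatures here are assumptions modeled on the El::DistMatrix pull-queue interface:

```cpp
#include <El.hpp>
#include <vector>

std::vector<double> PullAllEntries( const El::DistMultiVec<double>& x )
{
    const El::Int n = x.Height();
    x.ReservePulls( n );     // assumed to mirror the DistMatrix pull API
    for( El::Int i=0; i<n; ++i )
        x.QueuePull( i, 0 ); // request entry (i,0) on this process
    std::vector<double> entries( n );
    x.ProcessPullQueue( entries.data() ); // one collective fills the buffer
    return entries;
}
```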
Redistributions involving El::DistMultiVec<T> are the roughest edges in the library right now and are in the wonderful state of being deprecated but not yet replaced.
There is no preferred distribution for El::DistSparseMatrix<T>, with the exception of how the multifrontal tree is initialized from said matrix.