Andreas Noack
@andreasnoack
So it looks like jeffhammond/BigMPI#38 hit something similar, and I was able to fix this by explicitly setting CMAKE_C(XX)_COMPILER=mpic(c/xx) as suggested in https://github.com/jeffhammond/BigMPI/issues/38#issuecomment-311779618. Shouldn't CMake handle this automatically when detecting MPI?
Jack Poulson
@poulson
The CMake build is beyond a Rube Goldberg machine and should not be assumed to be sane.
Matthias Redies
@MRedies
Are there any auxiliary routines that convert local indices to global indices and vice versa? I couldn't find anything in the documentation.
Jack Poulson
@poulson
@MRedies The following routines should serve that purpose:
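A sketch of the local/global index mapping, assuming the AbstractDistMatrix members GlobalRow/GlobalCol, LocalRow/LocalCol, and LocalHeight/LocalWidth (the grid and matrix below are illustrative):

    #include <El.hpp>

    // Hedged sketch: walking a DistMatrix's locally owned entries while
    // recovering their global indices.
    El::Grid grid( El::mpi::COMM_WORLD );
    El::DistMatrix<double> A( 100, 100, grid );

    for( El::Int iLoc=0; iLoc<A.LocalHeight(); ++iLoc )
    {
        const El::Int i = A.GlobalRow( iLoc ); // local row -> global row
        for( El::Int jLoc=0; jLoc<A.LocalWidth(); ++jLoc )
        {
            const El::Int j = A.GlobalCol( jLoc ); // local col -> global col
            // ... operate on the locally owned entry (i, j) ...
        }
    }
    // The inverse direction, for a global index known to be locally owned:
    // const El::Int iLoc = A.LocalRow( i );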
Alex Gittens
@alexgittens
I'm GEMMing two <VR, STAR> matrices A*B, where A is 400GB and B is 1MB. I have more than a terabyte of memory, but this GEMM gives me OOM errors. Any reason why this should be the case? Do I need to redistribute the matrices first for some reason?
A is 6177583-by-8096, V is 8096-by-20, the resulting matrix should be 6177583-by-20
(V is B)
Alex Gittens
@alexgittens
I tried redistributing every matrix to MC,MR explicitly, and I'm getting OOM errors when redistributing A by creating a new MC,MR matrix and copying A into it.
So I guess the question now is: what is the memory cost of redistributing a matrix from VR,STAR to MC,MR? I'd like to think that as long as I can hold two copies of the matrix in memory, it should be fine.
Jack Poulson
@poulson
There is likely a redistribution behind the scenes, which can be memory hungry. You probably want to gather B into a STAR,STAR matrix, call a local multiply, and then redistribute the result as desired.
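A minimal sketch of that pattern, assuming the local-Matrix overload of El::Gemm (matrix names are illustrative; with default alignments, a freshly constructed C shares A's row distribution):

    // Hedged sketch: gather the small matrix B onto every process, then
    // multiply each process's local rows of A against the full B.
    El::DistMatrix<double,El::VR,El::STAR> A(grid), B(grid), C(grid);
    El::DistMatrix<double,El::STAR,El::STAR> B_STAR_STAR(grid);
    B_STAR_STAR = B;                      // gather B onto every process
    C.Resize( A.Height(), B.Width() );    // C gets A's row distribution
    El::Gemm                              // local multiply, no communication
    ( El::NORMAL, El::NORMAL,
      1., A.LockedMatrix(), B_STAR_STAR.LockedMatrix(), 0., C.Matrix() );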
Alex Gittens
@alexgittens
Thanks, I'll give that a try
Alex Gittens
@alexgittens
Another question: I want to apply an arbitrary permutation to the rows of a DistMatrix. It seems this is done with the DistPermutation class, and I must explicitly factor the permutation into a product of transpositions, then use the DistPermutation.swap function to build up the permutation, then use DistPermutation.permuteRows. Is that accurate, or is there a way to avoid manually factoring the permutation into transpositions?
Also, is there some sort of communication optimization done on the swaps before the permutation is applied, or do I have to be careful because all the swaps are performed in the order I entered them?
Jack Poulson
@poulson
The swaps are applied in a reasonably efficient manner using a single AllToAll.
You can explicitly define the permutation by its action on each input with SetImage: https://github.com/elemental/Elemental/blob/master/include/El/core/DistPermutation.hpp
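A small sketch of the SetImage route, assuming the MakeIdentity, SetImage, and PermuteRows members declared in that header (the indices are illustrative):

    // Hedged sketch: define a permutation by its action rather than by
    // a sequence of swaps.
    El::DistPermutation P(grid);
    P.MakeIdentity( A.Height() );   // start from the identity permutation
    P.SetImage( 0, 3 );             // illustrative: row 0 maps to row 3
    P.SetImage( 3, 0 );             // ...and row 3 maps back to row 0
    P.PermuteRows( A );             // apply to the DistMatrix A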
Alex Gittens
@alexgittens
Great, thanks. I saw that but wasn't sure if it was modified by further calls to SetImage.
Walter Landry
@wlandry
Is there a way to multiply a vector and matrix elementwise? Something like A(i,j) = M(i,j)*V(j).
Walter Landry
@wlandry
Maybe DiagonalScale() is what I am looking for?
Walter Landry
@wlandry
DiagonalScale() seemed to work. It turns out that I also need elementwise multiplication of two matrices. That looks to be hidden in Hadamard(). Looking at the definition of Hadamard on Wikipedia leaves me confused, though. Is Hadamard() supposed to be doing something else?
arpangujarati
@arpangujarati
Is there a way to parallelize the El::IndexDependentFill(...) function in order to initialize a huge matrix?
Ali Vaziri
@avaziria
Hi, does Elemental do anything special internally if the number of elements in an MPI communication exceeds 2 billion? This is regarding the MPI limitation that count is defined as int (as opposed to long long int).
Belliger
@Belliger
Good morning! I'm having a few issues getting Elemental to work with Python. I've installed it (seemingly) okay and set my LD_LIBRARY_PATH and PYTHONPATH, but when I go to import the module I get an error message: "OSError: libpmrrr.so.0: cannot open shared object file: No such file or directory". I found the file in question in lib64/ and fixed that error by adding lib64 to my LD_LIBRARY_PATH too, but now it comes up with another error: "OSError: libmpi.so.20: cannot open shared object file: No such file or directory". This time I can't seem to find the file anywhere. Any tips?
Belliger
@Belliger
Hmm, okay, I've found the file at "/gpfs/ts0/shared/software/openmpi/2.0.0/gnu/4.8.5/lib/libmpi.so.20", and there is something in the "include/El/config.h" file pointing to "#define EL_MPI_C_LIBRARIES "/gpfs/ts0/shared/software/openmpi/2.0.0/gnu/4.8.5/lib/libmpi.so"", but I can't figure out how to fix the issue at the moment.
Belliger
@Belliger
Okay, I've fixed that error by loading the correct module, but now I'm getting an error message when I try to import El: "Fatal error in PMPI_Comm_dup: Invalid communicator, error stack:
PMPI_Comm_dup(192): MPI_Comm_dup(comm=0x34b09c30, new_comm=0x281dc38) failed
PMPI_Comm_dup(144): Invalid communicator"
Belliger
@Belliger
Aha! I think I've fixed it: I had to recompile Python using the same compiler I used to compile Elemental.
Jack Poulson
@poulson

Sorry; I have not been active on this project in a long time but will try to respond to the extreme backlog.
@wlandry Yes, DiagonalScale (from the left) can be used for A(i, j) := d(i) * A(i, j) or (from the right) A(i, j) := A(i, j) * d(j). There are equivalent versions for triangular and Hessenberg matrices. And Hadamard performs an elementwise multiply (i.e., C(i, j) = A(i, j) * B(i, j)).
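For example, a minimal sketch using the local Matrix overloads (d, A, B, and C are illustrative):

    // Hedged sketch of the two operations just described.
    El::Matrix<double> A, B, C, d;
    // ... fill A, B, and the column vector d ...
    El::DiagonalScale( El::LEFT, El::NORMAL, d, A ); // A(i,j) := d(i)*A(i,j)
    El::Hadamard( A, B, C );                         // C(i,j) = A(i,j)*B(i,j)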

@arpangujarati El::IndexDependentFill should be automatically parallelized over each process's local matrix if you are using a distributed matrix. But I am guessing that you are thinking of a local matrix and hoping for threaded parallelism. Threading was never a part of Elemental's focus.
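For example, a sketch using the DistMatrix overload (the dimensions and fill function are illustrative):

    // Hedged sketch: each process fills only its local entries (so the work
    // is parallel across processes), receiving global indices (i, j).
    const El::Int m = 10000, n = 10000;
    El::DistMatrix<double> A( m, n, grid );
    El::IndexDependentFill
    ( A, std::function<double(El::Int,El::Int)>(
      []( El::Int i, El::Int j ) { return double(i + j); } ) );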

@avaziria @jeffhammond worked on something called BigMPI to help with this but I do not recall the status. Maybe (probably) he could say more.

@Belliger Sorry to see all of the trouble: it is strongly preferred to build your project using the CMake (or, as a fallback, Make) files auto-generated during Elemental's configuration. Many toolchains are unfortunately entirely incompatible with one another.

Jack Poulson
@poulson
Also, since someone offline asked me to clarify: I am not working on this project anymore but will answer questions.
Walter Landry
@wlandry
When I compute only eigenvalues with HermitianEig() with high precision (1024 bits), I get inaccurate results (off by 1e-9). When I compute both eigenvalues and eigenvectors, I get the accurate answer. The matrix is not particularly badly conditioned (condition number = 30). Has anyone else seen this kind of behavior?
Walter Landry
@wlandry
I noticed that (numIterations,numAlternations,numCubicIterations) is (1990,6,771) for eigenvalues only, and (1974,7,1061) for both eigenvalues and eigenvectors. So it is definitely doing something differently.
Also, computing both is significantly slower than computing only eigenvalues.
Walter Landry
@wlandry
I think I figured out the problem. DivideAndConquer() calls itself recursively and then calls Merge() to merge the results. When not computing eigenvectors, Merge() expects to receive the last eigenvector of the first submatrix and the first eigenvector of the second. However, the output of Merge() is the middle two eigenvectors, not the first and last eigenvectors. So everything works fine if there is only one level of recursion, but it breaks down with two levels.
So my workaround is either to always compute eigenvectors or to increase hermitian_eig_ctrl.tridiagEigCtrl.dcCtrl.cutoff to greater than half of the size of my matrix.
I looked into actually fixing this, but it is unclear to me why Merge() wants those particular eigenvectors. There is a lot of manipulation of indices that would take me a while to figure out.
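A sketch of that second workaround, assuming the control-structure path quoted above and the ctrl-taking HermitianEig overload (n, A, and w are illustrative):

    // Hedged sketch: raise the divide-and-conquer cutoff above n/2 so the
    // recursion stays at a single level.
    El::HermitianEigCtrl<El::BigFloat> ctrl;
    ctrl.tridiagEigCtrl.dcCtrl.cutoff = n/2 + 1;
    El::HermitianEig( El::LOWER, A, w, ctrl );   // eigenvalues only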
Jack Poulson
@poulson
Hi Walter: that is very good to know! Would you mind filing a GitHub issue for this? There is probably an easy fix that will require removing an assumption in the eigenvalue-only version.
Edward Vigmond
@vigmond
Hi. Has anyone compiled Elemental under gcc7? I upgraded my OS to OpenSuse Leap 15.0 and gcc7 is the default.
Walter Landry
@wlandry
I compile with gcc 5.2.0 and 8.2.0. I am not sure, but I might have had to do something for 8.2.0.
Walter Landry
@wlandry
@vigmond The fixes I made for gcc 8 are on my GitLab fork: bootstrapcollaboration/elemental@64b3642
I would not recommend my fork for general use. It breaks many things for arbitrary precision numbers.
Walter Landry
@wlandry
@poulson FYI: I submitted elemental/Elemental#268 for the eigenvalue bug.
Adam Baskerville
@AmusingYeti_gitlab

This is most likely a very simple question for the experts here, but I am new to using non-standard data types in C++. Part of my code produces numbers defined using the mpreal data type from MPFR (this cannot be changed, unfortunately). I would like to convert these to DoubleDouble from the QD library in order to build a matrix for use in the Elemental library. I cannot seem to figure out how to do this.

Another question I have is about the performance of Elemental at higher than double precision. Obviously the performance will take a big hit, but I moved to Elemental from the Eigen library because Eigen does not play well with custom types (mpreal): re-allocations for these types reduce the speed by a factor of 100+. Has anyone had experience using Elemental (specifically generalized eigenvalue calculation) at higher precision (DoubleDouble, QuadDouble, etc.)?

Walter Landry
@wlandry
I am using Elemental with very high precision (~1000 bits). MPFR is about a factor of two slower than GMP, so I ended up modifying Elemental to use GMP. It is much, much slower than IEEE doubles. My guess is a factor of 100?
Walter Landry
@wlandry
@AmusingYeti_gitlab It looks like the QuadDouble constructor accepts a char*, but not a std::string.
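So one possible route is a decimal-string round trip. A sketch, assuming mpreal::toString and a char* constructor on El::DoubleDouble (note that DoubleDouble only retains roughly 32 decimal digits of the input):

    #include <El.hpp>
    #include <mpreal.h>

    // Hedged sketch: convert mpfr::mpreal -> El::DoubleDouble via a string,
    // sidestepping the missing direct conversion between the two types.
    El::DoubleDouble Convert( const mpfr::mpreal& x )
    {
        return El::DoubleDouble( x.toString().c_str() );
    }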
Walter Landry
@wlandry
Does anyone know how I could AllReduce a local Matrix with the result going into a DistMatrix without making too many copies? If I AllReduce into the same matrix, it looks like it ends up using space equivalent to two copies; I am guessing that this is for send and receive buffers? Ideally, I would send elements only to the process that holds the local data for that rank, but it looks like I would have to write that myself. Is there a better way?
Walter Landry
@wlandry
FYI, I ended up implementing it myself.
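For anyone hitting the same question, one possible route (a sketch, assuming El::AllReduce on a local Matrix and that a [STAR,STAR] to [MC,MR] copy requires no communication):

    // Hedged sketch: sum the replicated local copies in place, then let
    // each rank keep only the entries it owns in the [MC,MR] distribution.
    El::AllReduce( localMat, grid.VCComm() );    // in-place elementwise sum
    El::DistMatrix<double,El::STAR,El::STAR> repl(grid);
    repl.Resize( m, n );
    repl.Matrix() = localMat;                    // replicated on every rank
    El::DistMatrix<double> A(grid);
    A = repl;                                    // purely local copy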
Dan Cajumban II
@DANdurunduh_twitter
Hi!
Spencer Bliven
@sbliven
Hello!
Are there any benchmarks on how Elemental's diagonalization algorithms scale?
I'm looking for a library to diagonalize large (N=10^7) dense matrices.
I found some citations for the Nebot-Gil and Davidson algorithms that seem to be able to handle that scale.