Aidan Dang
@AidanGG
Hi there, this is a fantastic library. I have a couple of ideas for some contributions I'd like to make, so could I post them here if that's alright?
Johnnie Gray
@jcmgray
Hi Aidan, thanks. And of course, please do!
Aidan Dang
@AidanGG
I've gone through the docs and the source, and it appears that the MPI SVD through SLEPc is only for sparse matrices. I'd like to see if we could hook up something like ScaLAPACK for dense matrices (which I think we could implement through Cython wrappers).
In my mind, quimb could be fully distributed: contraction through opt_einsum's Dask support (which you've shown in the Google supremacy example), then handing the appropriate Dask chunks over to ScaLAPACK for the decompositions.
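A minimal sketch of the contraction half of that idea (shapes and chunking here are just placeholders):

```python
import dask.array as da
import opt_einsum as oe

# placeholder tensors, chunked so dask can split the contraction into tasks
x = da.random.random((64, 64, 64), chunks=(32, 32, 32))
y = da.random.random((64, 64, 64), chunks=(32, 32, 32))

# opt_einsum dispatches to dask operations when given dask arrays,
# building a lazy task graph rather than contracting eagerly
z = oe.contract('abc,cbd->ad', x, y)

z.compute()  # trigger the (potentially distributed) computation
```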
Aidan Dang
@AidanGG
I'm fairly familiar with C/C++ dense linear algebra libraries, including ScaLAPACK and Elemental, but I'm pretty new to Dask's way of parallelism, so I was wondering whether my suggestion would be feasible.
Johnnie Gray
@jcmgray

This certainly sounds interesting, and distributed dense linear algebra that 'looks' the same as the sparse interface would be very nice to have. A few initial thoughts:

  1. I want to keep quimb itself pure python, so any cython wrappers would have to be an external library. There is in fact scalapy already https://github.com/jrs65/scalapy (and I even implemented a dense solver using it in quimb in an old branch you can find on github - it was unstable so I didn't keep it)
  2. Elemental is the more modern solver and might be the better bet long-term - not sure about the status of any python wrappers and whether they use mpi4py however
  3. Dask is not parallelized via MPI and thus there might be some non-trivial interaction https://blog.dask.org/2019/01/31/dask-mpi-experiment

Maybe you could flesh out what you want to do? Generate a large, distributed dense array using a tensor network approach then perform an SVD on it? Have you tried the existing dask svd implementation(s)?

Aidan Dang
@AidanGG

One example of something I'm interested in is MPS simulation (using 'split-gate' contraction when applying gates) where intermediate tensors might get so big as to require MPI. That is why I am interested in distributed contraction and decomposition.

Elemental also has a Python interface, but I haven't used it before.

The issue with the Dask SVD is that it only works for tall-skinny (short-fat) matrices, where there is a single column (row) of chunks. I believe this arises from the use of SVD for PCA on real-life data sets. Dask does have a randomised SVD that works on general matrices, but I'd like to have the option of an exact SVD.
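For reference, a quick sketch of the two existing dask options being contrasted here (shapes and chunk sizes are arbitrary):

```python
import dask.array as da

# exact SVD: only works with a single column of chunks (tall-skinny)
tall = da.random.random((50_000, 400), chunks=(5_000, 400))
u, s, v = da.linalg.svd(tall)

# general 2d-chunked matrices: only the randomised/compressed SVD applies
square = da.random.random((8_000, 8_000), chunks=(2_000, 2_000))
uk, sk, vk = da.linalg.svd_compressed(square, k=50)

s.compute(), sk.compute()
```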

Aidan Dang
@AidanGG
Regarding Elemental, its SVD can be made to use either QR iteration or the divide and conquer algorithm, whereas ScaLAPACK only has QR iteration through p?gesvd.
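For the serial LAPACK analogue of those two algorithm choices, scipy exposes the driver directly; a quick illustration:

```python
import numpy as np
from scipy.linalg import svd

a = np.random.rand(1000, 800)

# QR iteration, the algorithm behind scalapack's p?gesvd
u1, s1, vt1 = svd(a, lapack_driver='gesvd')

# divide and conquer, as in lapack's gesdd
u2, s2, vt2 = svd(a, lapack_driver='gesdd')
```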
Johnnie Gray
@jcmgray
That would definitely be cool. My feeling is that dask + MPI linear algebra is the kind of very useful & general purpose functionality that might also be better off as a standalone module (one that quimb could eventually interface with).
Starting dask.distributed workers using MPI, doing general work with the usual dask computation, but handing over to e.g. elemental for key operations like SVD seems a nice approach. Might be worth checking whether people are already thinking about something like this.
Aidan Dang
@AidanGG
I'm going to work on a small gist to demonstrate it, and I'll post it here when I get it working. From that Dask blog post, I don't think we need to do that part where they figure out where all the chunks reside. Instead, we might just need to schedule jobs for each particular worker to pull relevant blocks from the global matrix to construct its own local matrix, then hand it off to ScaLAPACK/Elemental.
Alan Morningstar
@aormorningstar

Hi Johnnie. If you have some time to consider this question I'd appreciate it. Since you chose to use SLEPc for eigenvalue solvers (rather than just using numpy and scipy.sparse wrappers of LAPACK and ARPACK functionality), are you aware of some benchmark comparisons between the two?

For example, diagonalizing an L=14 disordered Heisenberg model with numpy.linalg.eigh takes about 8 seconds on my machine, and finding 50 eigenvalues in the middle of the spectrum with scipy.sparse.linalg.eigsh takes about 1 second. Moving up to L=16, it takes the sparse scipy methods (ARPACK shift-invert) about 11 seconds, so I can say something like: moving from numpy dense to scipy sparse methods buys me another two sites. Do you have a feeling for what moving to SLEPc dense buys? Or SLEPc sparse without parallelism? Or SLEPc sparse + MPI parallelization? etc., for such a benchmark problem?
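A rough sketch of the kind of comparison described above (the Hamiltonian here is built with quimb's ham_mbl helper as an assumption; sizes and parameters are purely illustrative):

```python
import numpy as np
import scipy.sparse.linalg as spla
import quimb as qu

# disordered Heisenberg chain, L = 14, as a sparse matrix (ham_mbl assumed)
H = qu.ham_mbl(14, dh=3.0, sparse=True, seed=42)

# dense route: full spectrum of the 2^14 x 2^14 matrix
evals_dense = np.linalg.eigvalsh(H.toarray())

# sparse route: 50 interior eigenvalues near energy 0 via ARPACK shift-invert
evals_mid = spla.eigsh(H, k=50, sigma=0.0, return_eigenvectors=False)
```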

Johnnie Gray
@jcmgray

Hi Alan. First of all I should note that quimb can use numpy, scipy or slepc interchangeably, using the 'backend' keyword. I don't have any rigorous or up-to-date benchmarks sadly (any contributions to the docs like this are very welcome of course). That being said, here is my overview.

SLEPc really specializes in sparse problems, though the iterative method it uses seems to be a bit better than scipy for dense/implicit problems as well at medium+ sizes. Where it is dramatically better than scipy is sparse, shift-invert problems - I'd say another two sites of improvement is probably reasonable to expect there. Of course, the other main motivation for using SLEPc is that you can build the hamiltonian and solve for eigenvectors all distributed on an MPI cluster. It's also only really at big sizes (16+ sites) that the advantage of MPI parallelism wins out over BLAS/threaded parallelism.

So in my opinion, if you need to find mid-spectrum eigenpairs of sparse operators, SLEPc is really worth getting set up.
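To make the backend switching concrete, a small sketch (the 'backend' keyword is as described above; ham_mbl and the exact call signature are assumptions, so check the docs):

```python
import quimb as qu

# sparse disordered Heisenberg Hamiltonian (ham_mbl assumed)
H = qu.ham_mbl(16, dh=3.0, sparse=True, seed=7)

# same call, different linear algebra backends
el, ev = qu.eigh(H, k=50, sigma=0.0, backend='scipy')        # ARPACK shift-invert
el, ev = qu.eigh(H, k=50, sigma=0.0, backend='slepc-nompi')  # SLEPc, threaded BLAS only
el, ev = qu.eigh(H, k=50, sigma=0.0, backend='slepc')        # SLEPc + spawned MPI workers
```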

Alan Morningstar
@aormorningstar
Okay, thanks very much for the reply. This is helpful!
Aidan Dang
@AidanGG
I was a bit busy over the last week with some other stuff, but I had a play around with Scalapy and Elemental's Python interface. Scalapy seems to have some weird performance issues, and Elemental works but its Python interface is regrettably Python 2 only (and Elemental itself is unmaintained). I am having a look at implementing a parallel SVD directly in Dask, based on a tiled 2-stage bidiagonalisation (https://arxiv.org/abs/1611.06892), which should be even faster than Elemental/ScaLAPACK.
Alan Morningstar
@aormorningstar
I installed quimb and all its dependencies as described on the quimb site's "1. Installation" page. mpi4py, petsc and slepc are all installed properly. There were some dependency warnings when installing petsc4py and slepc4py, but they import into python so I just went forward with it. When I try to test quimb's shift-invert for sparse hamiltonians with the slepc backend, by running your example from the site "MPI Interior Eigensolve with Lazy, Projected Operators", the function eigh just hangs indefinitely. Have you run into anything similar?
Alan Morningstar
@aormorningstar
or on a related note, do you know how to properly modify your installation instructions to use the 'slepc-nompi' backend? I've tried what I think is a reasonable approach after reading the petsc docs and installing a --with-mpi=0 version, but the 'slepc-nompi' backend is still not working as I have set it up
Johnnie Gray
@jcmgray
@AidanGG thanks for the update, that's a shame about elemental - it seems like there is a gap for a modern distributed dense linear algebra library that nothing currently fills
@aormorningstar Yes the interaction of petsc/slepc/mpi4py can be a bit temperamental. To clarify, 'slepc-nompi' doesn't need a separate build of slepc, it just doesn't spawn any MPI workers and so instead relies on the threaded parallelism of whichever BLAS slepc is linked to
If the slepc and mpi4py tests are passing, then trying backend='slepc-nompi' would be the first thing to check, then launching the program in MPI but using the 'syncro' mode. Which version of MPI do you have installed? Openmpi 1.10.7 seems to be one of the few versions that can reliably spawn processes
Alan Morningstar
@aormorningstar
@jcmgray okay I will have to retry this with Openmpi 1.10.7 and see if that is the problem. I will let you know if things get cleared up in case others have the same problem in the future.
Aidan Dang
@AidanGG
If it's of any interest to you @jcmgray, there's this paper that just came out: https://arxiv.org/abs/1905.08394. I'm of the opinion that quimb + opt_einsum + Dask would do it just as well given the same hardware, though.
Johnnie Gray
@jcmgray
Thanks, yeah I saw that! I'm actually writing a paper currently on very high quality contraction path finders that make that whole computation even easier (and are ofc compatible with opt_einsum/quimb)
Aidan Dang
@AidanGG
I see you got a mention on the latest Scott Aaronson blog post :) I am still working on implementing a general, non-tall-skinny Dask SVD. I've learned a lot about tiled linear algebra in the last couple of months.
Johnnie Gray
@jcmgray
Ha yes! Paper and code out soon hopefully. That's good to hear you are working on a dask SVD. It's such a convenient lazy/distributed backend, though I'm still not totally sure it can be scaled up well - for instance, it can't handle contractions with 32+ dimensions yet. Do you know the cyclops tensor framework? It has a numpy-like API and should be mostly compatible with quimb already via autoray now. In fact I have briefly tested it, using MPI to contract quantum circuits. I think the SVD it uses is scalapack however.
Aidan Dang
@AidanGG
I knew of CTF during my master's days a couple of years ago, but it didn't have an SVD back then, which is why I wrote my own MPS code using Elemental. Does it seem reasonable that we could do MPI TN calculations through quimb + autoray + CTF in Python? If so, it might be worth me plugging a faster MPI SVD directly into CTF rather than continuing on with Dask.
Aidan Dang
@AidanGG
The SVD I am working on for Dask is based on the polar decomposition https://doi.org/10.1137/120876605 which is a much better fit for distributed task-scheduling than the divide-and-conquer approach (which Lapack gesdd, Eigen BDCSVD and Elemental's SVD are based on). In principle, it should be very easy to hook up an MPI implementation of this new SVD (https://github.com/ecrc/ksvd) to CTF since the function signature is almost identical to Scalapack's.
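As a tiny serial illustration of that idea (not the distributed version), here is an SVD assembled from a polar decomposition plus a symmetric eigendecomposition:

```python
import numpy as np
from scipy.linalg import polar, eigh

a = np.random.rand(400, 300)

u_p, h = polar(a)        # a = u_p @ h, with h symmetric positive semidefinite
w, v = eigh(h)           # h = v @ diag(w) @ v.T, eigenvalues ascending

s = w[::-1]              # singular values, in descending order
u = (u_p @ v)[:, ::-1]
vt = v[:, ::-1].T        # so that a = u @ diag(s) @ vt

assert np.allclose(a, u @ (s[:, None] * vt))
```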
Johnnie Gray
@jcmgray
quimb + autoray + cyclops is absolutely one of the combinations I'll be targeting (I might even open a pull-request to support ctf in opt_einsum now, since it's just a one-liner). The idea of autoray is basically that most array operations, and specifically tensor network operations, can be defined very simply, so supporting numpy/tensorflow/cyclops should be effortless, and in principle I don't see any reason that good performance cannot be achieved. In a few months' time I'll be trying this stuff properly, but until then any additional attempts are obviously welcome as well!
With regard to implementing a good, distributed, dense SVD in dask or ctf, I think both would be super useful. I guess ctf might be the more natural place, given that communication performance (and thus using MPI) is, I assume, relatively important.
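A small sketch of the autoray dispatch idea for anyone following along (the ctf line is untested and only indicative):

```python
from autoray import do
import numpy as np

def truncated_svd(x, k):
    # 'do' dispatches to whichever library x's arrays belong to
    u, s, v = do('linalg.svd', x)
    return u[:, :k], s[:k], v[:k, :]

truncated_svd(np.random.rand(10, 10), 4)           # numpy
# import ctf
# truncated_svd(ctf.random.random((10, 10)), 4)    # cyclops, in principle
```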
Johnnie Gray
@jcmgray
My experience so far is that for simply contracting large tensors, ctf performs not much slower than the usual numpy/BLAS, but the SVD was very slow (I compiled with OpenBLAS rather than Intel MKL, which has a custom scalapack implementation, so that might be a factor). It may also be worth checking my claim that the SVD used really is the one from scalapack!
Johnnie Gray
@jcmgray
... in fact, opt_einsum does not need any changes and already supports ctf automatically.
Johnnie Gray
@jcmgray
Finally, I'll also mention mars https://github.com/mars-project/mars in case you have not seen it. Very similar to dask in lots of ways, but unlike dask it currently supports chunking in both dimensions for SVD. Not sure about its general performance! But it follows the numpy API so it should work with autoray nicely.
Aidan Dang
@AidanGG
Thanks for your thoughts. One other thing I found that required Elemental was that non-MKL implementations of Scalapack do not have support for 64-bit integer indexing, leaving a relatively small cap on the matrix sizes. This was something I was hoping to avoid by going to Python.
Aidan Dang
@AidanGG
I briefly looked through the Mars source and found it uses the same tall-skinny algorithm for QR and SVD as Dask. I'm trying to find out how it applies it to 2d-chunked matrices too.
Aidan Dang
@AidanGG
It looks like mars just rechunks the matrix automatically to have a single column of chunks before applying TSQR or TSSVD. It's certainly not ideal for very large, almost-square matrices.
Aidan Dang
@AidanGG
It also looks like CTF does not support the 64-bit int indexing in MKL Scalapack. So unfortunately there is no SVD implementation that fulfils all my needs: performance, distributed parallelism, 64-bit integer indexing, ongoing support and a Python interface. I think for now I will continue working on my SVD in Dask, but raise an issue with the CTF developers with regard to SVD.
Johnnie Gray
@jcmgray
That makes sense regarding mars. And yes, it's annoying that nothing quite meets all those requirements - which I suppose is why you're working on it! Is the limitation of 32-bit indexing simply that dimensions all have to be smaller than 2^32? Let me know if you want any input/changes from the quimb side of things.
Aidan Dang
@AidanGG
That's right for the 32-bit indexing. It applies even to the global dimension of a Scalapack matrix (due to the scalapack descriptors using ints). In Dask though, the individual chunks would all definitely be smaller than this limit, and the indexing into individual chunks is handled by Python ints.
Daniel Alejandro Perdomo Escobar
@Dalperdomoe
Hello everyone, I am new to tensor networks and I would like to know if quimb can be used to solve ML tasks in the way Stoudenmire shows, using the DMRG algorithm to optimize the MPS tensors. I saw that in quimb you can use DMRG, but I don't know if I can use a custom loss function with it
Johnnie Gray
@jcmgray
Hi Daniel, the DMRG algorithm probably doesn't work out of the box with that method, but I don't suppose it would be super tricky to modify it, since it tries to be quite general. Note there are also direct pytorch/tensorflow/autograd+scipy/jax+scipy tensor network optimizers in quimb, which might be an easier way to get started!
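A rough sketch of what using those optimizers might look like (class name, import location and argument names here are assumptions based on quimb's tensor-optimize modules, so check the docs):

```python
import quimb.tensor as qtn

# a random MPS to optimize, and a fixed target to fit it to (stand-in for data)
psi = qtn.MPS_rand_state(20, bond_dim=8)
target = qtn.MPS_rand_state(20, bond_dim=8)

def negative_overlap(mps, target):
    # loss to minimize: minus the squared overlap with the target
    return -abs(mps.H @ target) ** 2

opt = qtn.TNOptimizer(psi, loss_fn=negative_overlap,
                      loss_constants={'target': target},  # kwarg name assumed
                      autodiff_backend='autograd')
psi_opt = opt.optimize(100)  # run 100 optimization steps
```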
Aidan Dang
@AidanGG
I believe TensorFlow should be able to backpropagate through SVD according to the rules in https://journals.aps.org/prx/abstract/10.1103/PhysRevX.9.031041 so there might not be a need for DMRG in this case.
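A minimal, purely illustrative check of that claim:

```python
import tensorflow as tf

x = tf.Variable(tf.random.normal((6, 6), dtype=tf.float64))

with tf.GradientTape() as tape:
    s = tf.linalg.svd(x, compute_uv=False)   # singular values only
    loss = tf.reduce_sum(s)                  # nuclear norm of x

grad = tape.gradient(loss, x)                # gradient flows back through the SVD
```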