These are chat archives for elemental/chat

27th
Oct 2016
Jack Poulson
@poulson
Oct 27 2016 02:32
@jedbrown I think I just spent a week debugging my distributed bulge-chasing only to realize it was caused by non-deterministic double-precision floating-point calculations
combined with the fact that bulge-chases are not forward stable, this is a recipe for disaster without care
I believe that the mechanism is that extra precision is often delivered but not guaranteed
and that carefully choosing compilation flags could avoid it
(as could only performing each chase step on a single process and then sharing the result)
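A minimal sketch of one way such a discrepancy can arise, assuming the extra precision comes from a fused multiply-add (a single rounding) on one code path and a separately rounded product on another. The chat does not pin down the actual mechanism, so this is an illustration only; the function names and input values are hypothetical and chosen so that `a*b` lands exactly halfway between two doubles.

```cpp
#include <cmath>

// Product is rounded to double before the add (two roundings in total).
// The volatile store keeps the compiler from contracting the expression
// into an FMA or carrying extra precision in a register.
double unfused_multiply_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Product and sum are rounded once, as a fused multiply-add does.
double fused_multiply_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

struct Discrepancy {
    double unfused;
    double fused;
};

// Hypothetical inputs: a*b = 1 - 2^-54 exactly, which rounds to 1.0 in
// double precision, so the unfused path loses the 2^-54 term entirely.
Discrepancy demonstrate_discrepancy() {
    const double a = 1.0 + std::ldexp(1.0, -27);
    const double b = 1.0 - std::ldexp(1.0, -27);
    const double c = -1.0;
    return {unfused_multiply_add(a, b, c), fused_multiply_add(a, b, c)};
}
```

If two processes evaluated the same chase step through these two paths, they would hold slightly different values for what the algorithm assumes is identical data. Compiler flags that control contraction and excess precision (for GCC and Clang, `-ffp-contract=off` is one such flag) are the kind of knob the message above alludes to.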
Jack Poulson
@poulson
Oct 27 2016 03:03
Meiyue Shao confirmed that he observed this issue in his ScaLAPACK implementation
further, this issue likely affects the distributed symmetric tridiagonal QR algorithm, bidiagonal QR algorithm, and divide and conquer SVD and EVD in Elemental (due to their assumption that independent local calls to the QR algorithm are equivalent)
Ryan H. Lewis
@rhl-
Oct 27 2016 04:02
@poulson how is this being addressed within your impl?
Jack Poulson
@poulson
Oct 27 2016 04:03
one process computes and then broadcasts
(this is also advocated by Meiyue Shao et al.)
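The difference between the redundant approach and compute-then-broadcast can be sketched without MPI by simulating the processes as entries of a vector. Everything here is hypothetical: `chase_step` is a one-line stand-in for a real chase step, and the `uses_fma` flags emulate processes whose arithmetic rounds differently. This is not Elemental's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one chase step: x <- a*x - 1.
// use_fma emulates a process whose arithmetic fuses the multiply and add.
double chase_step(double x, bool use_fma) {
    const double a = 1.0 + std::ldexp(1.0, -27);
    const double c = -1.0;
    if (use_fma) {
        return std::fma(a, x, c);
    }
    volatile double product = a * x;  // force rounding of the product
    return product + c;
}

// Redundant strategy: every process recomputes the step on its own,
// so the copies can differ when the arithmetic differs.
std::vector<double> redundant_update(double x,
                                     const std::vector<bool>& uses_fma) {
    std::vector<double> results;
    for (bool fused : uses_fma) {
        results.push_back(chase_step(x, fused));
    }
    return results;
}

// Owner-computes strategy: one process computes and the result is copied
// to all others (the copy simulates a broadcast), so every process holds
// bitwise-identical data.
std::vector<double> owner_computes_update(double x,
                                          const std::vector<bool>& uses_fma,
                                          std::size_t owner) {
    const double result = chase_step(x, uses_fma[owner]);
    return std::vector<double>(uses_fma.size(), result);
}
```

In a real distributed code the copy would be a collective such as `MPI_Bcast` rooted at the owning process, trading some communication for agreement across the team.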
Ryan H. Lewis
@rhl-
Oct 27 2016 04:03
P
Jack Poulson
@poulson
Oct 27 2016 04:03
it is more of a problem within each iteration than for a routine as a whole
e.g., an eigensolver is much more stable than a Francis sweep
not as a rule, but empirically
Ryan H. Lewis
@rhl-
Oct 27 2016 04:05
Doesn't the bulge chase itself happen on each block diagonal?
Jack Poulson
@poulson
Oct 27 2016 04:07
using the terminology of http://dl.acm.org/citation.cfm?id=1958684, the inter-block chases involve teams of up to four processes collaborating to move a packet of bulges from one process's diagonal block to the next
I was previously redundantly performing this chase on both of the diagonal block processes, but non-determinism seemed to regularly crop up when computing with std::complex<double> (but, interestingly, not with float, std::complex<float>, or double)
Ryan H. Lewis
@rhl-
Oct 27 2016 04:10
That's cool.
Jack Poulson
@poulson
Oct 27 2016 04:11
I should be pushing this code some time tonight; hopefully I can generalize the insight to the other QR algorithms soon