What are you doing to ensure the safety of the reference to the engine?

Nothing can be done. The docs should note that it holds a reference and that it is unsafe to pass it outside of a function.

@anton-dutov Yes, ndslice is slower for very small matrices like 3x3 and 4x4. See also . The new ndslice is faster than the old ndslice because it can optionally omit strides. But the lengths are still not known at compile time, so fixed-size types are faster.
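
To illustrate the compile-time length point, here is a rough C++ sketch. It is not mir or ndslice code, and `DynMatrix`, `FixedMatrix`, `trace_dyn` and `trace_fixed` are hypothetical names made up for this comment. It only shows the general idea: with runtime lengths and strides the loop bounds and index arithmetic depend on values loaded at run time, while with a fixed size the compiler sees the bounds as constants and is free to unroll.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical dynamic matrix: lengths and the row stride are runtime values,
// loosely similar in spirit to a strided n-dimensional slice.
struct DynMatrix {
    std::size_t rows;
    std::size_t cols;
    std::size_t row_stride;    // number of elements between consecutive rows
    std::vector<double> data;

    double at(std::size_t i, std::size_t j) const {
        return data[i * row_stride + j];
    }
};

// Hypothetical fixed-size matrix: N is part of the type, so it is a
// compile-time constant everywhere the type is used.
template <std::size_t N>
using FixedMatrix = std::array<std::array<double, N>, N>;

// Loop bound and stride are only known at run time.
double trace_dyn(const DynMatrix& m) {
    double t = 0.0;
    for (std::size_t i = 0; i < m.rows && i < m.cols; ++i) {
        t += m.at(i, i);
    }
    return t;
}

// Loop bound N is a constant, so the compiler may unroll the loop completely.
template <std::size_t N>
double trace_fixed(const FixedMatrix<N>& m) {
    double t = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        t += m[i][i];
    }
    return t;
}
```

Both functions compute the same result. Any speed difference for 3x3 or 4x4 inputs comes from what the optimizer can do with the constant bounds, so it should be measured rather than assumed.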