@emard Thanks for pointing that out, I had not considered the routing aspect yet. So far I had assumed that only the last form would be recognized as synthesizable to a physical DPR16X4C block (i.e. using the LUT config bits as distributed RAM), but my understanding of Yosys/NextPNR is too limited to be sure about this.
I don't have a lot of time right now, but I'm working through variations to make the cache design as timing-independent as possible, so that it 'just works' for various bus designs. In the case of Oberon this is a little complicated, as the RISC5 CPU has several (bus) multiplexers after its registered signals, meaning that its bus signals are not stable until well into the clock cycle. It does not help that the "memory not ready" input also indirectly drives the address multiplexer, which leads to a combinational logic loop in a straightforward asynchronous cache hit circuit.
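To make the loop concrete, here is a rough sketch in Python (not HDL) that models the combinational dependencies as a graph and checks it for a cycle. The signal names (`cpu_adr`, `adr_mux_sel`, `cache_hit`, `mem_not_ready`) and the exact edges are my own simplification for illustration, not taken from the RISC5 source.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical signal names; each key depends combinationally on the
# signals in its set. The edges are an assumed simplification.
deps = {
    "cpu_adr": {"adr_mux_sel"},        # address comes out of a multiplexer
    "adr_mux_sel": {"mem_not_ready"},  # the stall input indirectly steers that mux
    "cache_hit": {"cpu_adr"},          # asynchronous tag compare on the address
    "mem_not_ready": {"cache_hit"},    # a miss stalls the CPU
}

def find_loop(graph):
    """Return the signals forming a combinational loop, or None if there is none."""
    try:
        # static_order() raises CycleError if the graph cannot be ordered.
        list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        return err.args[1]  # list of nodes; first and last entries are the same
    return None

loop = find_loop(deps)

# Assumption for illustration: if "mem_not_ready" came from a register instead,
# it would no longer depend combinationally on "cache_hit".
deps_registered = dict(deps, mem_not_ready=set())
no_loop = find_loop(deps_registered)

print("loop:", loop)
print("with registered stall:", no_loop)
```

Cutting any one edge with a register removes the cycle in this model, but in the real design that costs a cycle of latency somewhere, which is the trade-off I'm still working through.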
The current, working Next186-based Oberon design solves this by gating the Oberonstation system clock, which has its own issues, as we discovered a few weeks ago.