Lucas, I have a slightly different take on NetCDF4.
You said NetCDF is not cloud-optimized because it requires loading the entire dataset in order to access the header/metadata and retrieve a chunk of data, but it doesn't -- it just requires a lot of small binary requests to access the metadata, which is inefficient. The data in NetCDF4 can be written in arbitrary N-dimensional chunks, just like in Zarr.
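To make that concrete, here's a minimal sketch of opening a NetCDF4/HDF5 object in S3 with byte-range reads only (the bucket and path are hypothetical; any NetCDF4 object behaves the same):

```python
import fsspec
import h5py

url = "s3://example-bucket/data.nc"  # hypothetical object

fs = fsspec.filesystem("s3", anon=True)
with fs.open(url, mode="rb") as f:
    ds = h5py.File(f, "r")
    # Traversing the metadata issues many small byte-range GETs,
    # not a download of the entire object.
    print(list(ds.keys()))
```

Nothing here fetches the whole file; the cost is the many round trips, not the bytes.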
The only difference with NetCDF4 is that many chunks can be in a single file (or object), while in Zarr, each chunk is in its own object. That was important once upon a time, but now the cloud providers allow thousands of concurrent reads to a single object. So the main reason NetCDF4 doesn't perform as well as Zarr is that the metadata is not consolidated. And that's what we are addressing with the reference file system approach -- we read the metadata in advance and store it so we can read it all in one shot. Then we use the Zarr library, which can take advantage of that!
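Roughly what that looks like with the kerchunk package (the S3 path is illustrative, and this assumes zarr 2.x):

```python
import fsspec
import zarr
from kerchunk.hdf import SingleHdf5ToZarr

url = "s3://example-bucket/data.nc"  # hypothetical

# One-time scan: extract all the HDF5 metadata into a reference dict.
with fsspec.open(url, mode="rb", anon=True) as f:
    refs = SingleHdf5ToZarr(f, url).translate()

# Later reads: the consolidated metadata comes back in one shot,
# and Zarr fetches chunk bytes straight out of the original file.
mapper = fsspec.get_mapper(
    "reference://", fo=refs,
    remote_protocol="s3", remote_options={"anon": True},
)
group = zarr.open(mapper, mode="r")
```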
I think that the virtual dataset over many files is an even bigger deal. You can find the specific parts of the specific files you need without having to read the metadata of all of the files separately (which is very slow), and load them concurrently (you could already load the HDF5 files in parallel using threads, but not asynchronously).
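With kerchunk, that combination step is a sketch like this (assuming per-file reference dicts generated as above, and that the files tile along a time dimension):

```python
from kerchunk.combine import MultiZarrToZarr

# refs_list: the per-file reference dicts from SingleHdf5ToZarr (hypothetical).
mzz = MultiZarrToZarr(
    refs_list,
    concat_dims=["time"],   # assumed concatenation axis
    remote_protocol="s3",
    remote_options={"anon": True},
)
combined = mzz.translate()  # one reference set spanning all the files
```

Opening `combined` through the same reference file system gives one logical Zarr store over the whole collection.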
For the “extra storage” of the references, you might want to note that there are various encoding tricks that work well; the simplest would be to zstd-compress the whole JSON: that maybe gets you a factor of 10 in size, and it's super fast to unpack.
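A quick sketch of that trick with the `zstandard` package (the factor of 10 is the ballpark above, not a guarantee):

```python
import json
import zstandard as zstd

blob = json.dumps(refs).encode()  # the reference dict from above
packed = zstd.ZstdCompressor(level=9).compress(blob)
print(f"compression ratio: {len(blob) / len(packed):.1f}x")

# Unpacking on open is cheap: decompress, then parse.
refs_back = json.loads(zstd.ZstdDecompressor().decompress(packed))
```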