Open
Description
Zarr-Python version 3.1.0 was just released, and v3 arrays now support flexible data types. One implementation (not yet marked stable) stores NumPy structured dtypes as they were stored in Zarr v2. The example below shows a Zarr store created with Xarray that holds two variables (one struct, one normal), written as Zarr v3 with sharding.
My understanding is that the v3 driver in TensorStore doesn't support NumPy structs either. However, I believe adding support should be trivial: the binary chunk data didn't change between Zarr v2 and v3; only the metadata definition of the same struct changed. Could the v3 driver parse the Zarr metadata and then reuse the v2 driver's logic to read the structured fields?
What are your recommendations for implementation?
import numpy as np
import xarray as xr

dtype = np.dtype(
    {
        "names": ["foo", "bar"],
        "formats": ["int32", "int64"],
    }
)

encoding = {
    "headers": {"chunks": (128, 128)},
    "seismic": {"chunks": (16, 16, 16), "shards": (128, 128, 128)},
}

seis = xr.DataArray(
    name="seismic",
    dims=["inline", "crossline", "depth"],
    data=np.zeros((512, 512, 512), dtype="float32"),
)
hdr = xr.DataArray(
    name="headers",
    dims=["inline", "crossline"],
    data=np.zeros((512, 512), dtype=dtype),
)

ds = xr.Dataset({"seismic": seis, "headers": hdr})
ds.to_zarr("tmp", mode="w", zarr_format=3, encoding=encoding)
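
To illustrate the point about the bytes being the same, here is a minimal NumPy-only sketch. It builds the struct dtype from a v2-style descriptor and from a v3-style field list, then decodes the same raw buffer with both. The `v3_fields` layout and the helper names `dtype_from_v2` / `dtype_from_v3` are hypothetical and only stand in for whatever the real v3 metadata and driver code look like; little-endian packing is also an assumption.

```python
import numpy as np

# v2-style structured dtype descriptor: a list of [name, typestr] pairs.
v2_descr = [["foo", "<i4"], ["bar", "<i8"]]

# HYPOTHETICAL v3-style field list (assumed shape, not taken from the spec).
v3_fields = [["foo", "int32"], ["bar", "int64"]]


def dtype_from_v2(descr):
    """Build a packed NumPy struct dtype from a v2-style descriptor."""
    return np.dtype([(name, typestr) for name, typestr in descr])


def dtype_from_v3(fields, byteorder="<"):
    """Build the same dtype from named types (byte order is an assumption)."""
    return np.dtype(
        [(name, np.dtype(type_name).newbyteorder(byteorder)) for name, type_name in fields]
    )


dt_v2 = dtype_from_v2(v2_descr)
dt_v3 = dtype_from_v3(v3_fields)

# Encode a small "chunk" with the v2 dtype ...
chunk = np.zeros((4, 4), dtype=dt_v2)
chunk["foo"] = np.arange(16, dtype="int32").reshape(4, 4)
chunk["bar"] = 10 * np.arange(16, dtype="int64").reshape(4, 4)
raw = chunk.tobytes()

# ... and decode the very same bytes with the dtype derived from v3 metadata.
decoded = np.frombuffer(raw, dtype=dt_v3).reshape(4, 4)

# Byte offset of each field inside one packed element.
field_offsets = {name: dt_v3.fields[name][1] for name in dt_v3.names}
print(dt_v3.itemsize, field_offsets)
```

If the two dtypes compare equal, only the metadata parsing differs; the per-field offsets and element size that a reader needs are identical in both cases.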