@will-moore
Member

@will-moore will-moore commented Nov 7, 2024

This updates ome-zarr-py to use zarr-python v3 but doesn't include support for writing Zarr v3 (OME-Zarr v0.5).
However, it does add support for reading OME-Zarr v0.5 (e.g. for use by napari-ome-zarr).

To test:

Install from this branch ($ cd ome-zarr-py && pip install -e .) and update zarr to v3.0.8 ($ pip install -U zarr).

See docs/source/python.rst for the changes needed when calling the writing methods (you need to specify v0.4, since we don't support writing the latest OME-Zarr version, v0.5).

Try reading v0.5 data in napari:

  • install napari
  • $ pip install napari-ome-zarr
  • Install this branch and update zarr as above if necessary

Then try various v0.5 examples from e.g. https://idr.github.io/ome-ngff-samples/ or https://ome.github.io/ome2024-ngff-challenge/ (click on a thumbnail and copy the url field in the popup). NB: if the image uses the bioformats2raw layout, you'll need to add /0 to the zarr url. If in doubt, open it in the Validator first.
Most of the IDR samples there are plates. E.g.:

$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/share/ome2024-ngff-challenge/idr0011/Plate4-TS-Blue-B.ome.zarr

In several places I've hard-coded zarr_version=2. This works because you can do zarr.open_group(store=self.__store, path="/", zarr_version=2) in parse_url() to create a group to write into, or find a group to read from without specifying whether you expect the group to already exist.

Without the zarr_version=2, zarr will create zarr.json if the mode of the store is w.

NB: I have another PR (in progress) to add support for writing OME-Zarr v0.5 at #413

@will-moore
Member Author

Zarr v3 is not supported on python 3.9 or 3.10. I'll remove them from the build...

Build on python 3.12 is failing tests with:

  = 106 failed, 282 passed, 3 skipped, 6 xfailed, 2270 warnings in 75.86s (0:01:15) =

@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome-zarr-py-development-status/104671/4

@joshmoore
Member

@will-moore: what's the next step here?

@will-moore
Member Author

@joshmoore: With those changes, you are now forced to specify fmt=FormatV04() with parse_url() if mode="w".
E.g. like this...

import numpy as np
import zarr
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image
from ome_zarr.format import FormatV04

data = np.random.default_rng(0).poisson(lam=10, size=(10, 256, 256)).astype(np.uint8)
store = parse_url("image.zarr", mode="w", fmt=FormatV04()).store
root = zarr.group(store=store)
write_image(image=data, group=root, axes="zyx", storage_options=dict(chunks=(1, 64, 64)))

That code works, but unfortunately we end up with data that looks like v0.4 except that it contains "version": "0.5", because write_image() defaults to using CurrentFormat(). FormatV05 generates exactly the same output as FormatV04 except for the version.

After testing a matrix of using different versions for parse_url() and for write_image() with this branch, it seems that the format passed to parse_url() has no effect.

Testing the same matrix of versions with released ome-zarr-py produces all valid output except when you use FormatV01 in parse_url(): in this case all the data is written with dimension_separator: "." so the v0.2, v0.3, v0.4 data is invalid (when you use a later version to write the data).

To summarise, we probably ALSO want to prevent usage of write_image() and write_multiscales() without specifying fmt AND ensure that users use FormatV04 or earlier. Do we change the method signature so that fmt is no longer optional? And raise NotImplementedError if the fmt is not FormatV04 or earlier?

So users must then do:

store = parse_url("image.zarr", mode="w", fmt=FormatV04()).store
root = zarr.group(store=store)
write_image(image=data, group=root, axes="zyx", fmt=FormatV04(), storage_options=dict(chunks=(1, 64, 64)))

Does this look acceptable?
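
A minimal sketch of the check proposed above, assuming hypothetical stand-in names (Format, MAX_WRITABLE, version_tuple) rather than the real ome_zarr.format classes, and a write_image() that only validates its arguments: fmt becomes a required keyword argument, and anything newer than v0.4 raises NotImplementedError.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Format:
    """Hypothetical stand-in for the ome_zarr.format classes."""

    version: str


# Assumption: v0.4 is the newest OME-Zarr version this branch can write.
MAX_WRITABLE = (0, 4)


def version_tuple(version: str) -> tuple[int, ...]:
    # "0.4" -> (0, 4), so versions compare numerically rather than as strings.
    return tuple(int(part) for part in version.split("."))


def write_image(image, group, *, fmt: Format, **kwargs) -> str:
    """Sketch only: validates fmt and returns the version that would be written."""
    if version_tuple(fmt.version) > MAX_WRITABLE:
        raise NotImplementedError(
            f"Writing OME-Zarr v{fmt.version} is not supported; use v0.4 or earlier"
        )
    return fmt.version
```

Because fmt is keyword-only with no default, omitting it fails immediately with a TypeError instead of silently falling back to the current format.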

@imagejan
Contributor

Sorry for chiming in after only superficially following; my opinion might be naive, but:

store = parse_url("image.zarr", mode="w", fmt=FormatV04()).store
root = zarr.group(store=store)
write_image(image=data, group=root, axes="zyx", fmt=FormatV04(), storage_options=dict(chunks=(1, 64, 64)))

For me, this screams for an object-oriented API where I can get some omezarr = OMEZarrFormat(version="0.4") and then call omezarr.parse_url(...) and omezarr.write_image(...), doesn't it?
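
A rough sketch of what such an object could look like. OMEZarrFormat does not exist in ome-zarr-py; it is hypothetical, and fake_parse_url() and fake_write_image() are stand-ins for the real functions so that the example runs on its own. The version is also a plain string here, whereas the real API takes a format object.

```python
from typing import Any, Callable


class OMEZarrFormat:
    """Hypothetical wrapper that stores the format version in one place."""

    def __init__(
        self,
        version: str,
        parse_url: Callable[..., Any],
        write_image: Callable[..., Any],
    ) -> None:
        self.version = version
        self._parse_url = parse_url
        self._write_image = write_image

    def parse_url(self, path: str, **kwargs: Any) -> Any:
        # Forward the stored version so the caller never has to repeat it.
        return self._parse_url(path, fmt=self.version, **kwargs)

    def write_image(self, **kwargs: Any) -> Any:
        return self._write_image(fmt=self.version, **kwargs)


def fake_parse_url(path, mode="r", fmt=None):
    # Stand-in: records its arguments instead of opening a store.
    return {"path": path, "mode": mode, "fmt": fmt}


def fake_write_image(image, group, axes, fmt=None):
    # Stand-in: records its arguments instead of writing data.
    return {"axes": axes, "fmt": fmt}


omezarr = OMEZarrFormat("0.4", fake_parse_url, fake_write_image)
location = omezarr.parse_url("image.zarr", mode="w")
written = omezarr.write_image(image=None, group=None, axes="zyx")
```

Both calls receive the same version without the caller passing fmt twice.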

@will-moore
Member Author

Thanks for the suggestion @imagejan. It certainly seems quite painful to ask for the format twice, especially since it's not really used in parse_url() except for the version number checking that was just added.

So, let's move the version checking to write_image() and other similar methods instead of parse_url().

Then we can just do:

store = parse_url("image.zarr", mode="w").store
root = zarr.group(store=store)
write_image(image=data, group=root, axes="zyx", fmt=FormatV04(), storage_options=dict(chunks=(1, 64, 64)))

@joshmoore
Member

👍 for the update

Do we change the method signature so that fmt is no longer optional? And raise NotImplementedError if the fmt is not FormatV04 or earlier?

Not sure how much of your questions still hold, but I'd say yes! Let's favor breaking the API over breaking data.

@will-moore
Member Author

With zarr v3.0.8, and this current PR, I was testing the docs examples, e.g. for adding omero metadata at https://ome-zarr.readthedocs.io/en/stable/python.html#rendering-settings

Actually, that fails in zarr v2 - you can't create a group where one exists already. Should have tested that better!

However, in Zarr v3 with this branch, when you do this:

store = parse_url(path, mode="w")

it completely replaces EVERYTHING at path with an empty zarr group!
This is pretty scary!

I'll fix the docs for the released version, but I wonder if we should add checks to parse_url() to prevent that?

@joshmoore
Member

This is pretty scary! I'll fix the docs for the released version, but I wonder if we should add checks to parse_url() to prevent that?

Definitely. Do you think there's a lingering bug in zarr-python or more just a surprise?

@will-moore
Member Author

will-moore commented May 28, 2025

This is what we're doing under parse_url(path, mode="w")...

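import zarr
from zarr.storage import LocalStore

store = LocalStore(path, read_only=False)
# mode="w" overwrites whatever already exists at this path, so this wipes the store
group = zarr.open_group(store=store, path="/", mode="w", zarr_format=2)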

EDIT:
In 19f2f62 this behaviour is fixed. Now we only do group = zarr.open_group(store=store, path="/", mode="w", zarr_format=2) if the group doesn't already exist.
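
The idea behind that fix can be sketched with the standard library alone. This is not the code from 19f2f62: group_exists() and open_for_writing() are hypothetical helpers, and writing a .zgroup file by hand stands in for the zarr.open_group(..., mode="w", zarr_format=2) call. The sketch assumes a local directory store.

```python
import json
from pathlib import Path

# .zgroup marks a Zarr v2 group; zarr.json marks a Zarr v3 node.
GROUP_MARKERS = (".zgroup", "zarr.json")


def group_exists(path: Path) -> bool:
    """Return True if the directory already holds Zarr metadata."""
    return any((path / name).is_file() for name in GROUP_MARKERS)


def open_for_writing(path: Path) -> bool:
    """Create an empty Zarr v2 group only when none exists yet.

    Returns True if a new group was created, False if an existing one was kept.
    """
    if group_exists(path):
        return False  # leave the existing data untouched
    path.mkdir(parents=True, exist_ok=True)
    # Stand-in for zarr.open_group(store=store, path="/", mode="w", zarr_format=2).
    (path / ".zgroup").write_text(json.dumps({"zarr_format": 2}))
    return True
```

Calling open_for_writing() a second time on the same path is then a no-op, so anything already written below that path survives.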

@will-moore
Member Author

@joshmoore (and others): I think I'm done with those changes now: I fixed the write/delete issue (see above), added a test for v0.5 reading, updated the python docs etc. So I'm waiting on feedback or the "go ahead" to release 0.12.0rc1.

@will-moore
Member Author

Over at https://imagesc.zulipchat.com/#narrow/channel/328251-NGFF/topic/.E2.9C.94.20Non.20web-based.20viewers.20for.20v0.2E5.20stores/near/520091410 it was reported that "[this] PR worked right away on a ~1TB v0.5 store", using napari-ome-zarr on lightsheet data. 👍

@psobolewskiPhD
Contributor

I tested this in a typical napari 0.6.1 environment with tifffile, napari-tiff, zarr3 (for local whole-slide) and napari-ome-zarr (for remote). Everything was compatible, so no install issues.
With our 0.5 ome-zarr and legacy 0.4 ome-zarr everything worked! With the latter (the 0.4 ome-zarr), I did get this warning:
WARNING : MainThread : version mismatch: detected: FormatV04, requested: FormatV05
But the other warnings I reported upstream in this thread are no longer occurring.

@will-moore
Member Author

I was looking at the next step of support for writing OME-Zarr v0.5 (zarr v3) over at #413 (comment) and ran into issues with mixing of zarr v2 and zarr v3.

To summarise, when writing OME-Zarr, I think it will work better to specify the version when you create the store with parse_url() and then use the zarr_format of the store (zarr v2 or zarr v3) to inform the version of OME-Zarr that is written with write_image() etc.
This should allow you to specify the OME-Zarr version in just one place and also avoid mixing zarr v2 and zarr v3.
This will look like the first code sample in #404 (comment) above.

So I'm going to go ahead and give that a try. Apologies for the back and forth on this....
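
As a sketch of that idea, assuming a one-to-one mapping in which Zarr v2 stores get OME-Zarr v0.4 and Zarr v3 stores get v0.5 (ome_zarr_version_for() and the mapping table are hypothetical, not part of ome-zarr-py):

```python
# Assumption: OME-Zarr v0.4 is written to Zarr v2 stores and v0.5 to Zarr v3 stores.
OME_ZARR_VERSION_BY_ZARR_FORMAT = {2: "0.4", 3: "0.5"}


def ome_zarr_version_for(zarr_format: int) -> str:
    """Pick the OME-Zarr version to write from the store's zarr_format."""
    try:
        return OME_ZARR_VERSION_BY_ZARR_FORMAT[zarr_format]
    except KeyError:
        raise ValueError(f"Unsupported zarr_format: {zarr_format!r}") from None
```

With a lookup like this, write_image() and similar methods could read zarr_format from the group they are given instead of taking a separate fmt argument, which keeps the version choice in one place.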

@joshmoore
Member

Thanks all for the testing here! Since this has lingered a bit, #413 (as a superset of this PR) is now almost ready for testing with write support. I'm going to close this PR so we can all focus on #413. The plan will be to get an RC out ASAP. 🎉

@joshmoore joshmoore closed this Jun 17, 2025

8 participants