Fix coo_map when only one match from regex pattern #525

Open: wants to merge 2 commits into base: main

@aaTman commented Oct 30, 2024

I ran into an issue when using coo_map in MultiZarrToZarr. As far as I can tell, it was caused by the code using .groups()[0] instead of .group() in a case where the regex returns only a single match from the string.
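For reference, a minimal sketch of the difference in Python's `re` module, using an illustrative file name and patterns rather than anything from kerchunk: `Match.groups()` returns only the capturing subgroups, so it is an empty tuple when the pattern has no parenthesized groups, while `Match.group()` returns the whole match.

```python
import re

text = "pres_msl_p03_01.json"  # illustrative file name, not from kerchunk

# Pattern without capturing groups: group() works, groups() is empty.
no_groups = re.search(r"[a-z]\d\d", text)
whole = no_groups.group()      # the full matched text
captured = no_groups.groups()  # empty tuple, so captured[0] would raise IndexError

# Pattern with one capturing group: groups() holds the captured text.
with_group = re.search(r"_([a-z]\d\d)_", text)
whole_with_group = with_group.group()      # includes the surrounding underscores
captured_with_group = with_group.groups()  # one-element tuple

print(whole, captured)
print(whole_with_group, captured_with_group)
```

So indexing `.groups()[0]` only works when the user's pattern contains at least one capturing group.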

To fix this, I kept the original behavior but added a check for an empty .groups() tuple. I hit the bug with the following code, which raised an IndexError because the tuple had length 0:

import glob
import re

from kerchunk.combine import MultiZarrToZarr

# The pattern has no capturing groups, so match.groups() is an empty tuple.
pattern = re.compile(r"[A-Za-z]\d\d(?![^ ]*[\\\/])", re.IGNORECASE)
file_list = glob.glob(f"{self.directory}/*")
mzz = MultiZarrToZarr(
    file_list,
    coo_map={"member": pattern},
    concat_dims=["member", "step", "time"],
    identical_dims=["latitude", "longitude"],
)
multi_kerchunk = mzz.translate()

The most recent string (there were similar ones) for which the regex returned only one match was:

/var/folders/vz/txd62qzn76g9f6cxg_8_76cw0000gn/T/tmp_kwkysl7/pres_msl_2002110200_p03_01.json

My goal was to subset "p03" and the other ensemble members from the GEFSv12 Retrospective data.

This seems to have fixed the issue completely, though I'm happy to discuss or try other edge cases.
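The fallback described above can be sketched roughly as follows. This is not the actual diff in this PR: `extract_label` is a hypothetical helper written only for illustration, and it assumes the pattern is applied to the file name with `search`. The pattern and path are the ones quoted above.

```python
import re


def extract_label(pattern, filename):
    """Hypothetical helper: return the first capturing group if the pattern
    defines one, otherwise fall back to the whole match."""
    match = pattern.search(filename)
    if match is None:
        raise ValueError(f"pattern did not match {filename!r}")
    groups = match.groups()
    # An empty tuple means the pattern has no capturing groups.
    return groups[0] if groups else match.group()


pattern = re.compile(r"[A-Za-z]\d\d(?![^ ]*[\\\/])", re.IGNORECASE)
path = (
    "/var/folders/vz/txd62qzn76g9f6cxg_8_76cw0000gn/T/tmp_kwkysl7/"
    "pres_msl_2002110200_p03_01.json"
)

label = extract_label(pattern, path)
# A pattern with an explicit capturing group still takes the original route.
label_from_group = extract_label(re.compile(r"_(p\d\d)_"), path)
print(label, label_from_group)
```

Patterns that already use a capturing group keep the original `.groups()[0]` behavior, so existing coo_map users should not see a change under this sketch.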

@martindurant (Member) commented

Your interpretation appears to be right, and I am surprised that you can have a group() without anything being output by groups().

@martindurant (Member) commented

I think these changes should fix the datatree issue, and they also require the current version of xarray: https://github.com/fsspec/kerchunk/pull/523/files#diff-65c089582c22ac69bff6e677f8c5dcf10b650de2fac7193ed15b013c7866925d
