Skip to content

Fix multilayer relative imports #368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 27, 2025

Conversation

will-0
Copy link
Contributor

@will-0 will-0 commented Feb 9, 2025

Originally mentioned in this issue on the linkml repo.

When you attempt to use relative imports from a file that, itself, has been imported with a relative import, path resolution fails. This bug occurs both in linkml and linkml-runtime. In linkml-runtime, the problem occurs in schemaview.py:

if sn.startswith('.') and ':' not in i:
       i = os.path.normpath(str(Path(sn).parent / i))

If both sn and i are relative file paths, the path added to todo is an absolute path (at least on Mac). When i later than becomes sn, it's not picked up by the above clause and anything it imports does not hit the above condition and therefore just defined relative to the base file.

Marking as draft. @sneakers-the-rat noticed you were the last to edit this section, specifically here. Would appreciate input as to whether there's a reason for the above / it is expected behaviour and I'm missing something.

@sneakers-the-rat
Copy link
Contributor

can you add a set of minimal schemas that demonstrates the bug to the tests?

see:

def test_imports_relative():

https://github.com/linkml/linkml-runtime/tree/main/tests/test_utils/input/imports_relative

@sneakers-the-rat
Copy link
Contributor

related to: #350

@will-0
Copy link
Contributor Author

will-0 commented Feb 16, 2025

@sneakers-the-rat test added that fails with original implementation, and fix that passes all tests.

Noted that some other work has been done to allow relative imports without './' prefix in #350. Can try and reconcile with the above if necessary.

@will-0 will-0 marked this pull request as ready for review February 16, 2025 12:52
Copy link
Contributor

@sneakers-the-rat sneakers-the-rat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine, does need to be synchronized with the other changes that are being made to this exact place tho :)

Comment on lines 293 to 307
# resolve relative imports relative to the importing schema, rather than the
# origin schema. Imports can be a URI or Curie, and imports from the same
# directory don't require a ./, so if the current (sn) import is a relative
# path, and the target import doesn't have : (as in a curie or a URI)
# we prepend the relative path. This WILL make the key in the `schema_map` not
# equal to the literal text specified in the importing schema, but this is
# essential to sensible deduplication: eg. for
# - main.yaml (imports ./types.yaml, ./subdir/subschema.yaml)
# - types.yaml
# - subdir/subschema.yaml (imports ./types.yaml)
# - subdir/types.yaml
# we should treat the two `types.yaml` as separate schemas from the POV of the
# origin schema.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why strip out the comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have put this back in.

else:
temp = i
i = resolve_import(sn, i)
_ = sn, temp, i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing, apologies! Was purely for inspection during debugging, meant to remove.

Comment on lines 306 to 308
else:
temp = i
i = resolve_import(sn, i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mild suggest:

Suggested change
else:
temp = i
i = resolve_import(sn, i)
temp = i
i = resolve_import(sn, i)

everything after

if (condition):
    continue

is an else, but avoids needing to go 6 levels of indentation if we can avoid it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, have included this.

@@ -104,6 +105,19 @@ def is_absolute_path(path: str) -> bool:
drive, tail = os.path.splitdrive(norm_path)
return bool(drive and tail)

def resolve_import(sn: str, i: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're splitting this out to a separate function, it would be great to not have the arguments be named sn and i

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have renamed to source_sch and imported_sch

@@ -104,6 +105,19 @@ def is_absolute_path(path: str) -> bool:
drive, tail = os.path.splitdrive(norm_path)
return bool(drive and tail)

def resolve_import(sn: str, i: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def resolve_import(sn: str, i: str) -> str:
def _resolve_import(sn: str, i: str) -> str:

unless we expect this to be used elsewhere, probably make it private. it seems pretty specific to this one case in schemaview

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have done, thank you.

Comment on lines 114 to 128
path = os.path.normpath(str(Path(sn).parent / i))
if i.startswith(".") and not path.startswith("."):
# Above condition handles cases where both sn and i are relative paths, and path is not already relative
return f"./{path}"
else:
return path

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we are already treating everything that isn't absolute or has a scheme as a path, i think this would be double-fixing the problem. e.g. tests pass with this

Suggested change
path = os.path.normpath(str(Path(sn).parent / i))
if i.startswith(".") and not path.startswith("."):
# Above condition handles cases where both sn and i are relative paths, and path is not already relative
return f"./{path}"
else:
return path
return os.path.normpath(str(Path(sn).parent / i))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was here to retain compatibility with hard-coded paths in test_imports_relative. If removed, following error raised:

test_utils/test_schemaview.py::test_imports_relative failed: def test_imports_relative():
...
>       assert closure == [
            'linkml:types',
            '../neighborhood_parent',
            'neighbor',
            '../parent',
            '../L1_0_1/L2_0_1_0/grandchild',
            '../../L0_1/L1_1_0/L2_1_0_0/apple',
            '../../L0_1/L1_1_0/L2_1_0_0/index',
            '../../L0_1/L1_1_0/L2_1_0_1/banana',
            '../../L0_1/L1_1_0/L2_1_0_1/index',
            '../../L0_1/L1_1_0/index',
            '../../L0_1/cousin',
            '../L1_0_1/dupe',
            './L2_0_0_0/child',
            './L2_0_0_1/child',
            'main'
        ]
E       AssertionError: assert ['linkml:type...0/apple', ...] == ['linkml:type...0/apple', ...]
E         
E         At index 12 diff: 'L2_0_0_0/child' != './L2_0_0_0/child'
E         Use -v to get more diff

Copy link
Contributor

@sneakers-the-rat sneakers-the-rat Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes. I do think it's useful to have all paths, internally in the imports_closure method, have a ./ or ../ in them so we can determine which need directory traversal trivially, but also acknowledge that would probably be done better with proper typing rather than string conventions and my prior impl sucks. Maybe changing the comment above to something like "ensure any path that requires directory traversal has a ./ prepended" or something like that would be good, but since the above is so anticipatory of future needs (e.g. resolving remote imports, resolving content-ID imports, other contexts where we need to differentiate posix-like directories from other formats) that it borders on YAGNI so i could go either way on what happens with this. thx for explaining

@will-0 will-0 force-pushed the fix-multilayer-relative-imports branch from 68edec3 to 4e45c49 Compare March 20, 2025 14:18
@will-0
Copy link
Contributor Author

will-0 commented Mar 20, 2025

@sneakers-the-rat thanks for the review on the above, sorry it wasn't the cleanest (written prior to running off to a long string of night shifts!). Have synchronised with upstream and addressed your comments.

Copy link
Contributor

@amc-corey-cox amc-corey-cox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

@amc-corey-cox
Copy link
Contributor

I'm going to wait to merge this until we land #345 to make sure we don't create any merge conflicts.

@sierra-moxon sierra-moxon merged commit ff6ff6f into linkml:main Mar 27, 2025
14 checks passed
@ialarmedalien
Copy link

@will-0 Please make sure that you look at other pull requests to ensure that you're not fixing a bug that has already been fixed in another PR.

@will-0
Copy link
Contributor Author

will-0 commented Mar 31, 2025

@ialarmedalien certainly, will make sure I do so more thoroughly in the future. Do you have the PR you're referring to? Did search up issues / PRs before but didn't see one at the time.

@sneakers-the-rat
Copy link
Contributor

i think they're referring to this, i think they end up being equivalent?
#350

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants