normalize_lexically should mention other functions that do basic normalization #142931

Description

@lolbinarycat
Contributor

Location

https://doc.rust-lang.org/nightly/std/path/struct.Path.html#method.normalize_lexically

Summary

Both the PartialEq impl for Path and Path::components normalize away non-leading . components and duplicate or trailing slashes. This is "good enough" for many applications, such as choosing a specific file in a tarball to extract.
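A minimal sketch of the behavior described above, using only stable std::path APIs. The helper name component_names is made up for illustration and is not part of the standard library.

```rust
use std::path::Path;

/// Hypothetical helper: list the components of a path as strings,
/// so the effect of `Path::components` is easy to inspect.
fn component_names(path: &str) -> Vec<String> {
    Path::new(path)
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect()
}

fn main() {
    // Non-leading `.`, duplicate slashes, and the trailing slash are dropped.
    assert_eq!(component_names("foo/./bar//baz/"), ["foo", "bar", "baz"]);

    // `PartialEq` for `Path` compares components, so these are equal.
    assert_eq!(Path::new("foo/./bar//baz/"), Path::new("foo/bar/baz"));

    // A leading `.` is kept, so these two are not equal.
    assert_ne!(Path::new("./foo"), Path::new("foo"));

    // `..` is never removed by `components`.
    assert_eq!(component_names("foo/../bar"), ["foo", "..", "bar"]);
}
```

Note that .. survives this kind of normalization, and a leading . is preserved.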

Activity

added
A-docsArea: Documentation for any part of the project, including the compiler, standard library, and tools
on Jun 23, 2025
added
needs-triageThis issue may need triage. Remove it if it has been sufficiently triaged.
on Jun 23, 2025
added
T-libsRelevant to the library team, which will review and decide on the PR/issue.
and removed
needs-triageThis issue may need triage. Remove it if it has been sufficiently triaged.
on Jun 24, 2025
the8472

the8472 commented on Jun 24, 2025

@the8472
Member

But why? And why only for normalize_lexically? canonicalize doesn't point to it. components doesn't point to canonicalize... we in general don't have a forest of methods explaining what all the other methods do.

components() calls what it does "a small amount of normalization", it seems barely worth a mention. And afaik the tar format permits .., so this bit of normalization isn't always sufficient for that either.

I guess we could have a section on path normalization on top that mentions all 3 approaches?
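To illustrate the point about .. in archive entries: since components() leaves .. in place, code that relies on it alone would still need an explicit check. The following is only a sketch; is_confined is a hypothetical helper, and a real extractor would likely need further checks (symlinks inside the archive, for example).

```rust
use std::path::{Component, Path};

/// Hypothetical check: accept only paths made of plain names
/// (and `.`), rejecting `..`, a root, or a Windows prefix.
fn is_confined(path: &Path) -> bool {
    path.components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}

fn main() {
    // The "small amount of normalization" handles `.` and `//` ...
    assert!(is_confined(Path::new("docs/./readme.md")));
    assert!(is_confined(Path::new("docs//readme.md")));

    // ... but `..` and absolute paths pass through `components()` untouched,
    // so they have to be rejected explicitly.
    assert!(!is_confined(Path::new("docs/../../etc/passwd")));
    assert!(!is_confined(Path::new("/etc/passwd")));
}
```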

lolbinarycat

lolbinarycat commented on Jun 24, 2025

@lolbinarycat
ContributorAuthor

normalize_lexically already lists 2 different alternatives. I believe the rationale is that it is frequently the wrong operation to perform, as it does not handle symlinks correctly, so it makes sense to list functions that don't have this interaction with symlinks.

the8472

the8472 commented on Jun 24, 2025

@the8472
Member

All the options have upsides and downsides. I don't think it makes sense to have normalize_lexically be the central point to collect them.

lolbinarycat

lolbinarycat commented on Jun 24, 2025

@lolbinarycat
ContributorAuthor

I mean, I would be all for a dedicated section on path normalization, then having all the relevant functions just link to that. That's basically what I did with the pointer to reference conversion section.

ChrisDenton

ChrisDenton commented on Jun 25, 2025

@ChrisDenton
Member

normalize_lexically is "special" in that it's currently the only one that removes .. components on Unix without resolving them, which may be a security concern. This should be given due attention. Contrasting it with what components() does makes sense to me because path functions are typically written in terms of components() (even if they don't literally call components, they still work as if components was used).

I agree this isn't the place to have a full overview of normalisation functions but it would indeed be good to have that overview somewhere. Though it's kind of tricky with unstable functions because we have to be careful mentioning them in otherwise stable documentation.
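The concern about removing .. without resolving it can be sketched on stable Rust with a hand-written lexical collapse. This is not the std implementation of normalize_lexically (which is unstable); collapse_parent_dirs is a hypothetical helper written only to show why purely textual removal of .. can disagree with the filesystem when a symlink is involved.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical lexical collapse: removes `.` and resolves `..`
/// textually, without touching the filesystem. Returns `None`
/// if `..` would climb above the start of the path.
fn collapse_parent_dirs(path: &Path) -> Option<PathBuf> {
    let mut out = PathBuf::new();
    let mut depth = 0usize; // number of Normal components currently in `out`
    for c in path.components() {
        match c {
            Component::Prefix(_) | Component::RootDir => out.push(c.as_os_str()),
            Component::CurDir => {}
            Component::Normal(name) => {
                out.push(name);
                depth += 1;
            }
            Component::ParentDir => {
                if depth == 0 {
                    return None;
                }
                out.pop();
                depth -= 1;
            }
        }
    }
    Some(out)
}

fn main() {
    // Textually, `link/..` cancels out. If `a/link` were a symlink to
    // some other directory, the filesystem would instead resolve
    // `a/link/..` to that directory's parent, which need not be `a`.
    assert_eq!(
        collapse_parent_dirs(Path::new("a/link/../b")),
        Some(PathBuf::from("a/b"))
    );

    // A `..` that would escape the starting point is reported, not dropped.
    assert_eq!(collapse_parent_dirs(Path::new("../x")), None);
}
```

By contrast, components() and the PartialEq impl would leave a/link/../b as four components, so they never make a claim about where .. leads.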

Metadata

Assignees

No one assigned

    Labels

    A-docsArea: Documentation for any part of the project, including the compiler, standard library, and toolsT-libsRelevant to the library team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @the8472@ChrisDenton@lolbinarycat@jieyouxu@rustbot
