Skip to content

Issues of the current isInDir implementation #15207

@liquidnya

Description

@liquidnya

Hello,

I was looking into how nix is implementing the isInDir function, because I wanted to check if a path that was passed to quickshell is in the nix store or not.
It turns out that this is not a trivial task.
This is the PR in nixpkgs where I was analyzing if a path is a sub path of the nix store: NixOS/nixpkgs#483139 (comment)

The current implementation isInDir
https://github.com/NixOS/nix/blob/d4a00241841daf35062c69e2bc46bcb7f6f653e1/src/libutil/file-system.cc#L181C6-L189

bool isInDir(const std::filesystem::path & path, const std::filesystem::path & dir)
{
    /* Note that while the standard doesn't guarantee this, the
      `lexically_*` functions should do no IO and not throw. */
    auto rel = path.lexically_relative(dir);
    /* Method from
       https://stackoverflow.com/questions/62503197/check-if-path-contains-another-in-c++ */
    return !rel.empty() && rel.native()[0] != OS_STR('.');
}

uses a modified version of
https://stackoverflow.com/questions/62503197/check-if-path-contains-another-in-c++.
The difference is that lexically_relative is used instead of relative, which means that symlinks are not resolved (which is probably wanted here, because the code talks about no IO), and also . and .. are not resolved and there are further issues as well.
Now since the path is not canonicalized or weakly canonicalized, the following runs without any errors:

assert(!isInDir("/home/nya/../../nix/store/sub-folder", "/nix/store")); // which is incorrect
assert(!isInDir("/home/nya/.ssh", "/home/nya")); // which is incorrect
assert(!isInDir("/home/nya/...", "/home/nya")); // which is incorrect (e.g. `touch $HOME/...` is a regular file)

I am not sure if those cases should be corrected.
If they do not need to be corrected, then I suggest using an implementation like this one:

bool isInDir(
    const std::filesystem::path & path,
    const std::filesystem::path & dir
) {
    if (dir.empty() || dir == ".") {
        return !path.empty() && path.native()[0] != OS_STR('.') && path.native()[0] != OS_STR('/');
    }
    const auto result = std::ranges::mismatch(path, dir);
    if (result.in2 != dir.end())
        return false;
    if (result.in1 == path.end())
        return false;
    const auto & p = *result.in1;
    return !p.empty() && p.native()[0] != OS_STR('.');
}

This should be the exact same behavior like the current implementation of isInDir, just without any memory allocations and is probably faster too, but I have not done performance tests. I have only done tests using rapidcheck to check that the new implementation behaves exactly the same and verify that it does not do any memory allocations.

Either way, I'd be happy to create a PR for this issue and if you are taking the code I suggested here, I would appreciate to be the Author or the Co-Author of the commit.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions