Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

atmos vendor pull behaves differently than terraform init for subdirectory based modules #617

Open
mss opened this issue Jun 3, 2024 · 10 comments
Labels
bug 🐛 An issue with the system

Comments

@mss
Copy link

mss commented Jun 3, 2024

Describe the Bug

I must admit that this is a slightly esoteric use case and thus maybe some documentation at https://atmos.tools/cli/commands/vendor/pull#description would be sufficient.

Let's assume that we have some module source with the URL git::ssh://[email protected]/tf/tf-modules.git//modules/account-vpc?ref=v0.1.0 (note the double-slash syntax to use the given subdirectory).

For some reason are certain users or processes not able to access the repository via SSH but need to use HTTPS instead. Since it only affects this one server they add some config to ~/.gitconfig like this:

[includeIf "gitdir:~/code/work/"]
    path = ~/.config/git/work.config

And that work.config file contains something like this:

[url "https://git.example.com/scm/"]
    insteadOf = ssh://[email protected]/

This configuration works and the modules are pulled via HTTPS instead of SSH if one creates a plain old Terraform root module and one calls terraform init.

Now we want to use Atmos vendoring and add vendor.yaml:

apiVersion: atmos/v1
kind: AtmosVendorConfig
metadata:
  name: account-vpc
  description: account components
spec:
  sources:
    - component: 'account-vpc-v0.1.0'
      source: 'git::ssh://[email protected]/tf/tf-modules.git//modules/account-vpc?ref=v0.1.0'
      targets:
        - 'components/terraform/account-vpc/v0.1.0'

This won't use the given mirror but will (try to) use the original URL which may fail due to whatever networking issues are the reason the config was added in the first place.

Expected Behavior

I should not have to strace the atmos command to find out why my Git config which worked with Terraform does not work anymore with Atmos vendoring :-) It would be nice if it just worked as expected (like for Terraform) or the behaviour (ie. that a subdirectory-based module will be cloned to a subdirectory below $TMPDIR) was documented.

Steps to Reproduce

See above (there is probably a more minimalistic reproducer possible). Some other Git features (badly written hooks?) than the one described might be affected, too.

Screenshots

No response

Environment

  • OS: Linux
  • Version: 1.77.0

Additional Context

This is caused by an undocumented behaviour of go-getter (cf. hashicorp/go-getter#493) to pull an URL which refers to a subdirectory to $TMPDIR first and then copy over the wanted contents. So the $GIT_DIR does not match the directory from the includeIf because the code is actually checked out to a temporary location like /tmp/getter12345/temp.

It does work in Terraform because they resolve the double-slash syntax themselves as pointed out in this comment (I linked to the OpenTofu source due to the current Terraform license but the code is the same).

@mss mss added the bug 🐛 An issue with the system label Jun 3, 2024
@aknysh
Copy link
Member

aknysh commented Jun 3, 2024

@mss thanks for testing it.

Atmos uses the go-getter lib https://github.com/hashicorp/go-getter to download from different sources.

Can you please test the URL
ssh://[email protected]/tf/tf-modules.git//modules/account-vpc?ref=v0.1.0

instead of
git::ssh://[email protected]/tf/tf-modules.git//modules/account-vpc?ref=v0.1.0

Another problem could be that it uses both ssh and //modules, and only in this case it does not respect .gitconfig settings.

Can you please test a few things:

  • The URL ssh://[email protected]/tf/tf-modules.git//modules/account-vpc?ref=v0.1.0
  • Any other ssh URL that does not use //modules

and let us know the results.

Thank you

@osterman
Copy link
Member

osterman commented Jun 3, 2024

Sorry, I see you point to this now in your "Context" section.

This relates to:

There is a notable exception when it comes to SSH settings in .gitconfig. Configurations related to SSH, such as URL replacements (insteadOf), do not work as expected with go-getter. This is because go-getter checks out the repository into a temporary directory first, which can bypass certain conditional Git configurations.

@osterman
Copy link
Member

osterman commented Jun 3, 2024

@aknysh sounds like the fix is pointed to here:

It does work in Terraform because they resolve the double-slash syntax themselves as pointed out in this comment (I linked to the OpenTofu source due to the current Terraform license but the code is the same).

We should be able to implement the same fix in atmos.

@aknysh
Copy link
Member

aknysh commented Jun 3, 2024

@mss I see your latest opened issue here hashicorp/go-getter#493

If you have tested what I asked above ("Any other ssh URL that does not use //modules"), does it means that go-getter respects .gitconfig for the root of the repo, but does not respect it when using modules (b/c the $GIT_DIR does not match the directory from the includeIf)?

Thanks again for all the testing. Your help is appreciated, it will allow us to understand the root of the problem and fix it in Atmos

@mss
Copy link
Author

mss commented Jun 3, 2024

Wow, that's some quick responses :-)

Ok, maybe some additional info: We did not check if URLs without // work but instead verified that .gitconfig is not generally ignored by setting $TMPDIR to a directory within the work dir. Ie. this worked as expected: mkdir -p $HOME/code/work/tmp && env TMPDIR=$HOME/code/work/tmp atmos vendor pull.

I now tried to use an URL without the double-slash and that does not work at all, even without the magic in .gitconfig.

Here is my use case: The first component is vendored properly but the second isn't (I put both into a single vendor file; the behaviour is the same when I use two files)

apiVersion: atmos/v1
kind: AtmosVendorConfig
metadata:
  name: account-vpc
  description: account components
spec:
  sources:
    - component: 'aws-vpc-endpoints-v5.8.1'
      source: 'git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git//modules/vpc-endpoints?ref=v5.8.1'
      targets:
        - 'components/terraform/aws-vpc-endpoints/v5.8.1'
    - component: 'aws-vpc-v5.8.1'
      source: 'git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git?ref=v5.8.1'
      targets:
        - 'components/terraform/aws-vpc/v5.8.1'

The weird thing is that if I strace the process then no git clone is called at all for the second component (same when using two vendor files). But I am tired so there is probably a typo in my tests.

Here is the `strace` output
# rm -rf components/terraform && strace -f -e execve,chdir,mkdirat -e signal=none -e quiet=all atmos vendor pull
execve("/home/xxx/.local/bin/atmos", ["atmos", "vendor", "pull"], 0x7ffe829acba8 /* 75 vars */) = 0
[pid 50780] execve("/home/xxx/.tenv/Atmos/1.77.0/atmos", ["/home/xxx/.tenv/Atmos/1.77.0/"..., "vendor", "pull"], 0xc000098000 /* 75 vars */) = 0
Processing vendor config file 'vendor.yaml'

Pulling sources for the component 'aws-vpc-endpoints-v5.8.1' from 'git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git//modules/vpc-endpoints?ref=v5.8.1' into 'components/terraform/aws-vpc/v5.8.1'
[pid 50784] mkdirat(AT_FDCWD, "/tmp/17174332051511647273", 0700) = 0
[pid 50784] mkdirat(AT_FDCWD, "/tmp/getter3662728735", 0700) = 0
[pid 50789] execve("/usr/bin/git", ["git", "clone", "--", "https://github.com/terraform-aws"..., "/tmp/getter3662728735/temp"], 0xc00051e000 /* 75 vars */) = 0
[pid 50790] execve("/usr/lib/git-core/git", ["/usr/lib/git-core/git", "remote-https", "origin", "https://github.com/terraform-aws"...], 0x618ee4fce650 /* 77 vars */) = 0
[pid 50791] execve("/usr/lib/git-core/git-remote-https", ["/usr/lib/git-core/git-remote-htt"..., "origin", "https://github.com/terraform-aws"...], 0x559ec46b8040 /* 77 vars */) = 0
[pid 50794] execve("/usr/lib/git-core/git", ["/usr/lib/git-core/git", "index-pack", "--stdin", "--fix-thin", "--keep=fetch-pack 50789 on slpn-"..., "--check-self-contained-and-conne"...], 0x618ee4fce650 /* 77 vars */) = 0
[pid 50798] execve("/usr/lib/git-core/git", ["/usr/lib/git-core/git", "rev-list", "--objects", "--stdin", "--not", "--all", "--quiet", "--alternate-refs"], 0x618ee4fce650 /* 77 vars */) = 0
[pid 50789] chdir("/tmp/getter3662728735/temp") = 0
[pid 50799] chdir("/tmp/getter3662728735/temp") = 0
[pid 50799] execve("/usr/bin/git", ["git", "checkout", "v5.8.1"], 0xc000706500 /* 75 vars */) = 0
[pid 50799] chdir("/tmp/getter3662728735/temp") = 0
[pid 50800] chdir("/tmp/getter3662728735/temp") = 0
[pid 50800] execve("/usr/bin/git", ["git", "checkout", "v5.8.1"], 0xc00051e280 /* 75 vars */) = 0
[pid 50800] chdir("/tmp/getter3662728735/temp") = 0
[pid 50801] chdir("/tmp/getter3662728735/temp") = 0
[pid 50801] execve("/usr/bin/git", ["git", "submodule", "update", "--init", "--recursive"], 0xc000707180 /* 75 vars */) = 0
[pid 50802] execve("/usr/lib/git-core/git-submodule", ["/usr/lib/git-core/git-submodule", "update", "--init", "--recursive"], 0x56e2a2b1c2a0 /* 76 vars */) = 0
[pid 50804] execve("/usr/bin/basename", ["basename", "/usr/lib/git-core/git-submodule"], 0x5c6332277518 /* 76 vars */) = 0
[pid 50805] execve("/usr/bin/sed", ["sed", "-e", "s/-/ /"], 0x5c6332277548 /* 76 vars */) = 0
[pid 50806] execve("/usr/lib/git-core/git", ["git", "--exec-path"], 0x5c6332278498 /* 76 vars */) = 0
[pid 50810] execve("/usr/bin/basename", ["basename", "--", "/usr/lib/git-core/git-submodule"], 0x5c633227b218 /* 79 vars */) = 0
[pid 50811] execve("/usr/bin/sed", ["sed", "-e", "s/-/ /"], 0x5c633227b218 /* 79 vars */) = 0
[pid 50813] execve("/usr/bin/gettext", ["gettext", "usage: $dashless $USAGE"], 0x5c633227b5b8 /* 79 vars */) = 0
[pid 50815] execve("/usr/bin/envsubst", ["envsubst", "--variables", "usage: $dashless $USAGE"], 0x5c633227b5d8 /* 79 vars */) = 0
[pid 50814] execve("/usr/bin/envsubst", ["envsubst", "usage: $dashless $USAGE"], 0x5c633227b5d8 /* 81 vars */) = 0
[pid 50816] execve("/usr/bin/uname", ["uname", "-s"], 0x5c6332278498 /* 79 vars */) = 0
[pid 50817] execve("/usr/lib/git-core/git", ["git", "rev-parse", "--git-dir"], 0x5c6332278498 /* 79 vars */) = 0
[pid 50818] chdir("/tmp/getter3662728735/temp/.git") = 0
[pid 50819] execve("/usr/lib/git-core/git", ["git", "rev-parse", "--git-path", "objects"], 0x5c6332278498 /* 79 vars */) = 0
[pid 50820] execve("/usr/lib/git-core/git", ["git", "rev-parse", "--is-inside-work-tree"], 0x5c6332278498 /* 79 vars */) = 0
[pid 50821] execve("/usr/lib/git-core/git", ["git", "rev-parse", "--show-prefix"], 0x5c6332278498 /* 79 vars */) = 0
[pid 50822] execve("/usr/lib/git-core/git", ["git", "rev-parse", "--show-toplevel"], 0x5c6332278498 /* 79 vars */) = 0
[pid 50802] chdir("/tmp/getter3662728735/temp") = 0
[pid 50825] execve("/usr/bin/sed", ["sed", "-e", "s/-/_/g"], 0x5c633227ae08 /* 80 vars */) = 0
[pid 50826] execve("/usr/lib/git-core/git", ["git", "submodule--helper", "init", "--"], 0x5c63322879f8 /* 80 vars */) = 0
[pid 50829] execve("/usr/lib/git-core/git", ["git", "submodule--helper", "update-clone", "--"], 0x5c6332287df8 /* 80 vars */) = 0
[pid 50784] mkdirat(AT_FDCWD, "/tmp/17174332051511647273", 0755) = 0
[pid 50784] mkdirat(AT_FDCWD, "components/terraform", 0755) = 0
[pid 50784] mkdirat(AT_FDCWD, "components/terraform/aws-vpc", 0755) = 0
[pid 50784] mkdirat(AT_FDCWD, "components/terraform/aws-vpc/v5.8.1", 0755) = 0

Pulling sources for the component 'aws-vpc-v5.8.1' from 'git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git?ref=v5.8.1' into 'components/terraform/aws-vpc-endpoints/v5.8.1'
[pid 50784] mkdirat(AT_FDCWD, "/tmp/17174332073449747459", 0700) = 0
[pid 50830] chdir("/tmp/17174332073449747459") = 0
[pid 50830] execve("/usr/bin/git", ["git", "show-ref", "-q", "--verify", "refs/heads/v5.8.1"], 0xc00051ef00 /* 75 vars */) = 0
[pid 50831] chdir("/tmp/17174332073449747459") = 0
[pid 50831] execve("/usr/bin/git", ["git", "branch", "-r", "--points-at", "refs/remotes/origin/HEAD"], 0xc000707b80 /* 75 vars */) = 0
[pid 50832] chdir("/tmp/17174332073449747459") = 0
[pid 50832] execve("/usr/bin/git", ["git", "checkout", "master"], 0xc00051f180 /* 75 vars */) = 0
error downloading 'https://github.com/terraform-aws-modules/terraform-aws-vpc.git?ref=v5.8.1': /usr/bin/git exited with 128: fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

One thing I also noticed in the strace output: Something first creates a /tmp/17174332073449747459 directory and tries to use that. For both cases. Is that atmos doing its thing there? (You can see the go-getter /tmp/getter3662728735/temp right after that in the first usecase.)

@aknysh
Copy link
Member

aknysh commented Jun 4, 2024

@mss thanks.

The second issue is different from the first one :)

Anyway, I see both issue now. The second one is b/c of a combination of how go-getter works and the fact that Atmos always created tmp directories b/c it later processes the Included_paths and excluded_paths attributes.

The fixes will be in a new Atmos release.

Thanks again for all the testing

@aknysh
Copy link
Member

aknysh commented Jun 5, 2024

@mss please try this latest release https://github.com/cloudposse/atmos/releases/tag/v1.78.0

It fixes the second issue you raised ("I now tried to use an URL without the double-slash").

Regarding the first issue (go-getter not respecting .gitconfig), let me know if it's still an issue for you (if mkdir -p $HOME/code/work/tmp && env TMPDIR=$HOME/code/work/tmp atmos vendor pull is not what you want to use), and we'll look into what could be done here.

Thanks again

@verygitmuchhub
Copy link

Hey there. I have these exact issues. For the second issue (no //..). With 1.78.0 I now have the same behaviour for both double-slash and no-double-slash URLs. That's good.

I still have the behaviour that I have to set TMPDIR ([includeIf "gitdir:~/...]).

@aknysh
Copy link
Member

aknysh commented Jun 7, 2024

thanks @verygitmuchhub

We'll have to look if we can set TMPDIR in Atmos to the temp directory

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 An issue with the system
Projects
None yet
Development

No branches or pull requests

5 participants