Construct testdata at build time rather than cloning a repo #104

shs96c · 2025-02-28T17:03:17Z

It turns out that making minor changes in the test data is prohibitively complicated since the tests very much rely on the git shasums, and any change will need to be reflected here. We also can't rebase the changes since that would break historical builds.

Rather than rely on an external repo, we now create the data within our own build. The individual states we want to use are modelled as separate directories within the integration test's `testdata` dir, and each state is given a meaningful name.
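To make the idea concrete, here is a minimal sketch of turning named state directories into a sequence of commits. It is not the code in this PR: the helper names (`build_repo`, `clear_worktree`, `git`), the fixed author identity, and the flat `testdata/<state>` layout are all assumptions for illustration, and it ignores submodules entirely.

```python
import os
import shutil
import subprocess
from pathlib import Path

# Hypothetical fixed identity and dates, so that repeated runs over the same
# state directories should produce the same commit shas.
GIT_ENV = {
    "GIT_AUTHOR_NAME": "Test Data",
    "GIT_AUTHOR_EMAIL": "testdata@example.com",
    "GIT_COMMITTER_NAME": "Test Data",
    "GIT_COMMITTER_EMAIL": "testdata@example.com",
    "GIT_AUTHOR_DATE": "1735689600 +0000",  # 2025-01-01T00:00:00 UTC
    "GIT_COMMITTER_DATE": "1735689600 +0000",
}


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", *args],
        cwd=repo,
        env={**os.environ, **GIT_ENV},
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def clear_worktree(repo: Path) -> None:
    """Delete everything in the work tree except the .git directory."""
    for entry in repo.iterdir():
        if entry.name == ".git":
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def build_repo(testdata: Path, repo: Path, states: list[str]) -> dict[str, str]:
    """Create one commit per state directory, in order.

    Returns a mapping from state name to the sha of its commit.
    """
    repo.mkdir(parents=True, exist_ok=True)
    git(repo, "init", "--quiet")
    shas: dict[str, str] = {}
    for state in states:
        # Replace the work tree with the contents of this state.
        clear_worktree(repo)
        shutil.copytree(testdata / state, repo, dirs_exist_ok=True)
        git(repo, "add", "--all")
        git(repo, "-c", "commit.gpgsign=false", "commit", "--quiet",
            "--allow-empty", "-m", state)
        shas[state] = git(repo, "rev-parse", "HEAD")
    return shas
```

The tests would then refer to states by name and look up the sha at run time, instead of hard-coding shasums from an external repository.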

illicitonion

Totally understand and agree with the problem (and sorry for not having automated the repo creation in the first place, it would've been polite/useful!)

I do think there are a couple of drawbacks to this approach, but I think with a pretty small amount of extra work we can get the best of both worlds?

The two big drawbacks to me are that it makes it hard to verify the tests are actually testing what we think (if there are bugs in the generation, we'll never notice), and it makes it hard to manually run a test (e.g. right now I can just clone the test repo and run a manually built TD against it).

But given you're already generating git repos here, my ideal would be that we make a main function we can run which just generates a bunch of tags in a git repo with the states we care about (maybe named after the directory names + date or timestamp of generation or something), and generates the Commits class that contains the constants with the commit shas.

Which I think is roughly a trivial amount of work to do on top of what you're already doing (just "save and push" the commits, rather than "generate and discard them"), but gives us the best of both worlds?

WDYT?
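A rough sketch of the "generate the Commits class" half of that proposal, under stated assumptions: the Commits class is taken to be a Java class of string constants, and the package name, the tag naming scheme, and the function names below are hypothetical rather than anything that exists in the repo. Pushing the tags is left out.

```python
import re


def constant_name(state: str) -> str:
    """Turn a state directory name such as 'add-dep' into 'ADD_DEP'."""
    name = re.sub(r"[^A-Za-z0-9]+", "_", state).strip("_").upper()
    # Java identifiers cannot start with a digit.
    return f"_{name}" if name[:1].isdigit() else name


def tag_name(state: str, stamp: str) -> str:
    """Tag named after the state directory plus a generation date or timestamp."""
    return f"{state}-{stamp}"


def render_commits_class(shas: dict[str, str],
                         package: str = "com.example.testdata") -> str:
    """Render Java source for a Commits class holding one constant per state."""
    lines = [
        f"package {package};",
        "",
        "// Generated file; regenerate rather than editing by hand.",
        "public final class Commits {",
        "  private Commits() {}",
        "",
    ]
    for state, sha in shas.items():
        lines.append(
            f'  public static final String {constant_name(state)} = "{sha}";'
        )
    lines.append("}")
    return "\n".join(lines) + "\n"
```

A main function would call something like this with the state-to-sha mapping produced by the generator, write the result next to the tests, and create one tag per state with `tag_name`.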

shs96c · 2025-03-07T18:21:49Z

Would you like to have a go at this once this PR lands? Or send a patch on top of this PR?

illicitonion · 2025-03-10T14:55:40Z

Sure, I can give it a go, but likely won't be for at least a few days :)

sitaktif · 2025-04-29T13:23:08Z

@illicitonion do you suggest we keep a separate testdata repo, or is the idea that the generator creates the git repo ~on the fly before testing and passes it as test data somehow?

This canonicalizes all the labels found in label-related attributes, using the output of `bazel mod dump_repo_mapping ""` for each commit. This addresses #105. It doesn't have any integration tests yet, because it would be much better to leverage #104, which is still in the works, but it was tested against our internal monorepo and the reproducer in #105. Label-like attributes (defined as strings but with nodep = True) are special: they are handled as an exception, as if they were labels, and thus also converted. (See [this thread](https://bazelbuild.slack.com/archives/CDCMRLS23/p1742821059464199))
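As an illustration only, canonicalizing a label with such a mapping might look like the sketch below. It assumes the command prints a JSON object per line mapping apparent repo names to canonical ones; the function names and the sample mapping are hypothetical, and this is not the implementation from #108.

```python
import json


def parse_repo_mapping(output: str) -> dict[str, str]:
    """Parse the first non-empty line of output as a JSON object.

    Assumption: keys are apparent repo names, values are canonical names.
    """
    for line in output.splitlines():
        if line.strip():
            return json.loads(line)
    return {}


def canonicalize(label: str, repo_mapping: dict[str, str]) -> str:
    """Rewrite '@apparent//pkg:target' to '@@canonical//pkg:target'."""
    if label.startswith("@@"):
        return label  # already in canonical form
    if label.startswith("//"):
        return "@@" + label  # main repo, whose canonical name is empty
    if label.startswith("@"):
        apparent, sep, rest = label[1:].partition("//")
        if sep and apparent in repo_mapping:
            return f"@@{repo_mapping[apparent]}//{rest}"
    return label  # relative or unknown labels are left untouched


mapping = parse_repo_mapping('{"rules_go": "rules_go~0.50.1"}\n')
canonical = canonicalize("@rules_go//go:def.bzl", mapping)
```

Comparing canonical forms means that a change in how a repo is referred to (its apparent name) does not by itself make a target look different between two commits.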

shs96c marked this pull request as ready for review March 7, 2025 12:51

shs96c added 6 commits March 7, 2025 15:12

cp: here we go again

7c7a71d

cp: hackity hack

e6a4641

cp: progress

d75441b

First submodule test working

01f2b33

Remove the comparison tests

7ff8db4

shs96c force-pushed the build-test-data branch from 01b0508 to 7ff8db4 Compare March 7, 2025 15:26

Attempt to delete everything to restore some disk space

003205c

shs96c force-pushed the build-test-data branch from b50c9ee to 003205c Compare March 7, 2025 16:56

illicitonion reviewed Mar 7, 2025

darkrift mentioned this pull request Mar 24, 2025

Canonicalize targets #108

Merged

sitaktif mentioned this pull request May 1, 2025

go.mod/MODULE.bazel update from indirect to direct marks other go_* targets as changed #105

Open
