Improve support for multiple blame ranges #1976

Open
holodorum wants to merge 3 commits into main from feature/blame-ranges-update

Conversation

holodorum
Contributor

This PR implements various improvements suggested in PR #1973.

The main change is converting the `BlameRanges` struct into an enum to support both `OneBasedInclusive` and `ZeroBasedExclusive` range formats. The `FullFile` variant, which denotes that the entire file is to be blamed, serves as a more explicit substitute for the previously used "empty" range.
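For orientation, a rough sketch of the shape this describes; the payload types are assumptions for illustration, not the actual `gix-blame` code:

```rust
use std::ops::{Range, RangeInclusive};

/// Rough sketch of the enum described above; payload types are assumptions.
pub enum BlameRanges {
    /// Ranges as a user would pass them, e.g. in the style of `git blame -L 1,10`.
    OneBasedInclusive(Vec<RangeInclusive<u32>>),
    /// Ranges in the form the blame algorithm consumes internally.
    ZeroBasedExclusive(Vec<Range<u32>>),
    /// Blame the entire file; a more explicit replacement for the former "empty" range.
    FullFile,
}
```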

@holodorum force-pushed the feature/blame-ranges-update branch from 2201fa2 to 249162a on May 1, 2025 06:37
@Byron
Member

Byron commented May 1, 2025

Thanks a lot for the follow-up!

@cruessler would probably be the one to do the first review, and I will get it merged right after.
Thanks everyone

@cruessler
Contributor

Thanks a lot for the follow-up!

Before I start reviewing this PR in detail, I have a few questions regarding the high-level API. In particular, it seems as if the decision to change `to_zero_based_exclusive(&self, max_lines: u32) -> Result<Vec<Range<u32>>, Error>` to `to_zero_based_exclusive(&self, max_lines: u32) -> Result<BlameRanges, Error>` brings a couple of downsides with it. My main concern is that it leaks implementation details: the caller now has to know about `BlameRanges`’ internals while that was previously not the case. Also, two of `BlameRanges`’ methods are now fallible, forcing the caller to deal with errors that are coupled to the struct’s internals.

I think the proposed API can be simplified and stay more insulated from calling code by reducing the number of enum variants to `WholeFile` and `PartialFile` (or something more appropriately named) and staying with `to_zero_based_exclusive(&self, max_lines: u32) -> Result<Vec<Range<u32>>, Error>`. We could even go so far as to stay with the existing design that treats an empty range as covering the whole file.
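A minimal sketch of what that simplification could look like, keeping the quoted `Vec<Range<u32>>`-returning signature; the variant payloads, the error type, and the validation rules are assumptions, not the actual code:

```rust
use std::ops::{Range, RangeInclusive};

/// Hypothetical error type, for illustration only.
#[derive(Debug)]
pub enum Error {
    InvalidRange { start: u32, end: u32 },
}

/// Sketch of the proposed two-variant design; only the variant names and the
/// `to_zero_based_exclusive` signature come from the comment above.
pub enum BlameRanges {
    /// Blame the whole file.
    WholeFile,
    /// Blame only the given one-based, inclusive line ranges.
    PartialFile(Vec<RangeInclusive<u32>>),
}

impl BlameRanges {
    /// Convert to the zero-based, exclusive ranges the blame algorithm works
    /// with, bounded by `max_lines`. Returning plain `Vec<Range<u32>>` keeps
    /// the enum's internals out of calling code.
    pub fn to_zero_based_exclusive(&self, max_lines: u32) -> Result<Vec<Range<u32>>, Error> {
        match self {
            BlameRanges::WholeFile => Ok(vec![0..max_lines]),
            BlameRanges::PartialFile(ranges) => ranges
                .iter()
                .map(|r| {
                    let (start, end) = (*r.start(), *r.end());
                    if start == 0 || end < start || end > max_lines {
                        return Err(Error::InvalidRange { start, end });
                    }
                    // One-based inclusive `start..=end` becomes zero-based exclusive `start-1..end`.
                    Ok(start - 1..end)
                })
                .collect(),
        }
    }
}
```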

What do you think?

@holodorum force-pushed the feature/blame-ranges-update branch from 249162a to e1db1b7 on May 13, 2025 17:35
@holodorum
Contributor Author

As discussed with @cruessler during a call, we decided to make the API more robust and less error-prone.

We've removed the distinction between `BlameRanges::OneBasedInclusive` and `BlameRanges::ZeroBasedExclusive`. Instead, we now only support two variants: `PartialFile` and `WholeFile`. Internally, `BlameRanges` always uses zero-based, exclusive ranges.

This means users no longer need to worry about converting between one-based and zero-based ranges. If a non-inclusive range is supplied during construction, an error is returned, helping to prevent subtle bugs.
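A minimal sketch of that final shape, assuming a hypothetical validating constructor named `from_one_based_inclusive`; the real constructor name and validation rules may differ:

```rust
use std::ops::{Range, RangeInclusive};

/// Hypothetical error type, for illustration only.
#[derive(Debug)]
pub enum Error {
    InvalidLineRange,
}

/// Sketch of the final design: only `WholeFile` and `PartialFile`, with the
/// ranges stored zero-based and exclusive internally.
pub enum BlameRanges {
    WholeFile,
    /// Zero-based, exclusive ranges, ready for the blame algorithm.
    PartialFile(Vec<Range<u32>>),
}

impl BlameRanges {
    /// Accept user-facing one-based, inclusive ranges (as in `git blame -L`),
    /// validate them, and convert to the internal representation so callers
    /// never juggle the two conventions themselves.
    pub fn from_one_based_inclusive(ranges: Vec<RangeInclusive<u32>>) -> Result<Self, Error> {
        let converted = ranges
            .into_iter()
            .map(|r| {
                let (start, end) = (*r.start(), *r.end());
                if start == 0 || end < start {
                    return Err(Error::InvalidLineRange);
                }
                // One-based inclusive `start..=end` becomes zero-based exclusive `start-1..end`.
                Ok(start - 1..end)
            })
            .collect::<Result<Vec<_>, _>>()?;
        Ok(BlameRanges::PartialFile(converted))
    }
}
```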

holodorum added 2 commits May 13, 2025 19:49
This modification converts the `BlameRanges` struct into an enum to support both `PartialFile` and `WholeFile`. Internally, the range in `BlameRanges` is now stored as zero-based, exclusive.
@holodorum force-pushed the feature/blame-ranges-update branch from e1db1b7 to 8484098 on May 13, 2025 17:49
@holodorum
Contributor Author

I'm a bit surprised by the CI error. Any hint as to what's causing it?

@Byron
Member

Byron commented May 13, 2025

I restarted the job, and I'd expect it to go through. There is some known flakiness with tests that deal with concurrent IO, and even though it's quite rare, it happens, unfortunately.

@EliahKagan
Member

EliahKagan commented May 14, 2025

The original failure in the test-fast job on macos-latest was due to #1816. That issue is the only current source of regular flakiness that I am aware of. I have not been keeping track of most occurrences of it, but lately I have observed such a failure every couple of days. (The other source of flakiness I am aware of is #2006, but that is not a regular source of flakiness--it seems to happen once or twice a year when not deliberately induced, and in any case it is not what happened here. If there are more known sources of CI flakiness, then I would be interested to learn of them, with the hope of helping out with them as well.)

Because the test-fast jobs are defined by a matrix with an implicit fail-fast: true strategy (i.e., fail-fast: false is not specified), any test-fast job failure will tell whatever other test-fast jobs are still running to cancel. It looks like you reran only the failed macos-latest job. This automatically also reruns the dependent tests-pass job, but not the sibling jobs that had been canceled. So although the macos-latest job passed when rerun, the three canceled jobs did not rerun; tests-pass saw that those jobs still had canceled status and reported failure again.

I've rerun the workflow as a whole, and all jobs passed. From context, it looks like the failing tests might have been all that was still blocking this PR from being merged. However, since auto-merge wasn't enabled here and I haven't been following this PR closely, I have refrained from merging it, in case my understanding is not correct.

@Byron
Member

Byron commented May 14, 2025

The original failure in the test-fast job on macos-latest was due to #1816.

Thanks for pointing this out! I remember now that the filesystem probe can have collisions across processes and produce incorrect values due to a race.

Thanks also for rerunning CI properly, I will keep that in mind.

[..] since auto-merge wasn't enabled here and I haven't been following this PR closely, I have refrained from merging it, in case my understanding is not correct.

Thank you, that's the right call. The plan here is that @cruessler will do the first review, and I will take a look once he approves it for merging.

@EliahKagan
Member

Thanks also for rerunning CI properly, I will keep that in mind.

The way I reran it was arguably overkill--when rerunning a whole workflow, one can select "Re-run failed jobs" instead of "Re-run all jobs", which, in spite of its name, I think also reruns jobs from the same matrix that were canceled because of a failed job.

("Re-run all jobs" and "Re-run failed jobs" are available at the workflow level, and should not be confused with re-running a single job, which doesn't automatically re-run failed or canceled sibling jobs.)

Thank you, that's the right call.

Thanks--I wasn't sure if the requested review was covered by #1976 (comment) or not. I'm glad I held off from merging.

@Byron
Member

Byron commented May 14, 2025

The way I reran it was arguably overkill--when rerunning a whole workflow, one can select "Re-run failed jobs" instead of "Re-run all jobs", which, in spite of its name, I think also reruns jobs from the same matrix that were canceled because of a failed job.

That's a great pointer - I definitely thought "Re-run failed jobs" does the trick even for cancelled ones, which means I must have hit a rerun button on the level of the individual failed job.
