Improve support for multiple blame ranges #1976

Open
holodorum wants to merge 3 commits into main from feature/blame-ranges-update

Conversation

holodorum
Contributor

This PR implements various improvements suggested in PR #1973.

The main change is converting the `BlameRanges` struct into an enum to support both `OneBasedInclusive` and `ZeroBasedExclusive` range formats. The `FullFile` variant, which denotes that the entire file is to be blamed, serves as a more explicit substitute for the previously used "empty" range.
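For orientation, a rough sketch of the shape this describes; the payload types are assumptions for illustration, not the actual `gix-blame` code:

```rust
use std::ops::{Range, RangeInclusive};

/// Rough sketch of the enum described above; payload types are assumptions.
pub enum BlameRanges {
    /// Ranges as a user would pass them, e.g. in the style of `git blame -L 1,10`.
    OneBasedInclusive(Vec<RangeInclusive<u32>>),
    /// Ranges in the form the blame algorithm consumes internally.
    ZeroBasedExclusive(Vec<Range<u32>>),
    /// Blame the entire file; a more explicit replacement for the former "empty" range.
    FullFile,
}
```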

@holodorum force-pushed the feature/blame-ranges-update branch from 2201fa2 to 249162a on May 1, 2025 06:37
@Byron
Member

Byron commented May 1, 2025

Thanks a lot for the follow-up!

@cruessler would probably be the one to do the first review, and I will get it merged right after.
Thanks everyone

@cruessler
Contributor

Thanks a lot for the follow-up!

Before I start reviewing this PR in detail, I have a few questions regarding the high-level API. In particular, it seems as if the decision to change `to_zero_based_exclusive(&self, max_lines: u32) -> Result<Vec<Range<u32>>, Error>` to `to_zero_based_exclusive(&self, max_lines: u32) -> Result<BlameRanges, Error>` brings a couple of downsides with it. My main concern is that it leaks implementation details: the caller now has to know about `BlameRanges`’ internals while that was previously not the case. Also, two of `BlameRanges`’ methods are now fallible, forcing the caller to deal with errors that are coupled to the struct’s internals.

I think the proposed API can be simplified and stay more insulated from calling code by reducing the number of enum variants to `WholeFile` and `PartialFile` (or something more appropriately named) and staying with `to_zero_based_exclusive(&self, max_lines: u32) -> Result<Vec<Range<u32>>, Error>`. We could even go so far as to stay with the existing design that treats an empty range as covering the whole file.
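A minimal sketch of what that simplification could look like, keeping the quoted `Vec<Range<u32>>`-returning signature; the variant payloads, the error type, and the validation rules are assumptions, not the actual code:

```rust
use std::ops::{Range, RangeInclusive};

/// Hypothetical error type, for illustration only.
#[derive(Debug)]
pub enum Error {
    InvalidRange { start: u32, end: u32 },
}

/// Sketch of the proposed two-variant design; only the variant names and the
/// `to_zero_based_exclusive` signature come from the comment above.
pub enum BlameRanges {
    /// Blame the whole file.
    WholeFile,
    /// Blame only the given one-based, inclusive line ranges.
    PartialFile(Vec<RangeInclusive<u32>>),
}

impl BlameRanges {
    /// Convert to the zero-based, exclusive ranges the blame algorithm works
    /// with, bounded by `max_lines`. Returning plain `Vec<Range<u32>>` keeps
    /// the enum's internals out of calling code.
    pub fn to_zero_based_exclusive(&self, max_lines: u32) -> Result<Vec<Range<u32>>, Error> {
        match self {
            BlameRanges::WholeFile => Ok(vec![0..max_lines]),
            BlameRanges::PartialFile(ranges) => ranges
                .iter()
                .map(|r| {
                    let (start, end) = (*r.start(), *r.end());
                    if start == 0 || end < start || end > max_lines {
                        return Err(Error::InvalidRange { start, end });
                    }
                    // One-based inclusive `start..=end` becomes zero-based exclusive `start-1..end`.
                    Ok(start - 1..end)
                })
                .collect(),
        }
    }
}
```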

What do you think?

@holodorum force-pushed the feature/blame-ranges-update branch from 249162a to e1db1b7 on May 13, 2025 17:35
@holodorum
Contributor Author

As discussed with @cruessler during a call, we decided to make the API more robust and less error-prone.

We've removed the distinction between `BlameRanges::OneBasedInclusive` and `BlameRanges::ZeroBasedExclusive`. Instead, we now only support two variants: `PartialFile` and `WholeFile`. Internally, `BlameRanges` always uses zero-based, exclusive ranges.

This means users no longer need to worry about converting between one-based and zero-based ranges. If a non-inclusive range is supplied during construction, an error is returned, helping to prevent subtle bugs.
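A minimal sketch of that final shape, assuming a hypothetical validating constructor named `from_one_based_inclusive`; the real constructor name and validation rules may differ:

```rust
use std::ops::{Range, RangeInclusive};

/// Hypothetical error type, for illustration only.
#[derive(Debug)]
pub enum Error {
    InvalidLineRange,
}

/// Sketch of the final design: only `WholeFile` and `PartialFile`, with the
/// ranges stored zero-based and exclusive internally.
pub enum BlameRanges {
    WholeFile,
    /// Zero-based, exclusive ranges, ready for the blame algorithm.
    PartialFile(Vec<Range<u32>>),
}

impl BlameRanges {
    /// Accept user-facing one-based, inclusive ranges (as in `git blame -L`),
    /// validate them, and convert to the internal representation so callers
    /// never juggle the two conventions themselves.
    pub fn from_one_based_inclusive(ranges: Vec<RangeInclusive<u32>>) -> Result<Self, Error> {
        let converted = ranges
            .into_iter()
            .map(|r| {
                let (start, end) = (*r.start(), *r.end());
                if start == 0 || end < start {
                    return Err(Error::InvalidLineRange);
                }
                // One-based inclusive `start..=end` becomes zero-based exclusive `start-1..end`.
                Ok(start - 1..end)
            })
            .collect::<Result<Vec<_>, _>>()?;
        Ok(BlameRanges::PartialFile(converted))
    }
}
```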

holodorum added 2 commits May 13, 2025 19:49
This modification converts the `BlameRanges` struct into an enum to support both `PartialFile` and `WholeFile`. Internally, the range in `BlameRanges` is now stored as zero-based, exclusive.
@holodorum force-pushed the feature/blame-ranges-update branch from e1db1b7 to 8484098 on May 13, 2025 17:49
@holodorum
Contributor Author

I'm a bit surprised by the CI error. Any hint as to what's causing it?

@Byron
Member

Byron commented May 13, 2025

I restarted the job, and I'd expect it to go through. There is some known flakiness with tests that deal with concurrent IO, and even though it's quite rare, it happens, unfortunately.

@EliahKagan
Member

EliahKagan commented May 14, 2025

The original failure in the test-fast job on macos-latest was due to #1816. That issue is the only current source of regular flakiness that I am aware of. I have not been keeping track of most occurrences of it, but lately I have observed such a failure every couple of days. (The other source of flakiness I am aware of is #2006, but that is not a regular source of flakiness--it seems to happen once or twice a year when not deliberately induced, and in any case it is not what happened here. If there are more known sources of CI flakiness, then I would be interested to learn of them, with the hope of helping out with them as well.)

Because the test-fast jobs are defined by a matrix with an implicit fail-fast: true strategy (i.e., fail-fast: false is not specified), any test-fast job failure will tell whatever other test-fast jobs are still running to cancel. It looks like you reran only the failed macos-latest job. This automatically also reruns the dependent tests-pass job, but not the sibling jobs that had been canceled. So although the macos-latest job passed when rerun, the three canceled jobs did not rerun; tests-pass saw that those jobs still had canceled status and reported failure again.

I've rerun the workflow as a whole, and all jobs passed. From context, it looks like the failing tests might have been all that was still blocking this PR from being merged. However, since auto-merge wasn't enabled here and I haven't been following this PR closely, I have refrained from merging it, in case my understanding is not correct.

@Byron
Member

Byron commented May 14, 2025

The original failure in the test-fast job on macos-latest was due to #1816.

Thanks for pointing this out! I remember now that the filesystem probe can have collisions across processes and produce incorrect values due to a race.

Thanks also for rerunning CI properly, I will keep that in mind.

[..] since auto-merge wasn't enabled here and I haven't been following this PR closely, I have refrained from merging it, in case my understanding is not correct.

Thank you, that's the right call. The plan here is that @cruessler will do the first review, and I will take a look once he approves it for merging.

@EliahKagan
Member

Thanks also for rerunning CI properly, I will keep that in mind.

The way I reran it was arguably overkill--when rerunning a whole workflow, one can select "Re-run failed jobs" instead of "Re-run all jobs", which, in spite of its name, I think also reruns jobs from the same matrix that were canceled because of a failed job.

("Re-run all jobs" and "Re-run failed jobs" are available at the workflow level, and should not be confused with re-running a single job, which doesn't automatically re-run failed or canceled sibling jobs.)

Thank you, that's the right call.

Thanks--I wasn't sure if the requested review was covered by #1976 (comment) or not. I'm glad I held off from merging.

@Byron
Member

Byron commented May 14, 2025

The way I reran it was arguably overkill--when rerunning a whole workflow, one can select "Re-run failed jobs" instead of "Re-run all jobs", which, in spite of its name, I think also reruns jobs from the same matrix that were canceled because of a failed job.

That's a great pointer - I definitely thought "Re-run failed jobs" does the trick even for cancelled ones, which means I must have hit a rerun button on the level of the individual failed job.
