Skip to content

perf: do allow GIT_PERF_* to be overridden again #1900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dscho
Copy link
Member

@dscho dscho commented Apr 4, 2025

This issue was noticed when working on large-scale issues.

Cc: Patrick Steinhardt [email protected]
cc: Derrick Stolee [email protected]
cc: Jeff King [email protected]

A common way to run Git's performance benchmarks on repositories other
than Git's own repository (which is not exactly large when compared to
actually large repositories) is to run them like this:

	GIT_PERF_LARGE_REPO=/path/to/my/large/repo \
	./p1234-*.sh -ivx

Contrary to developers' common expectations, this failed to work when
Git was built with a different `GIT_PERF_LARGE_REPO` value specified at
build time: That build-time option would have been written to the
`GIT-BUILD-OPTIONS` file, which in turn would have been sourced by
`test-lib.sh`, which in turn would have been sourced by `perf-lib.sh`,
which in turn would have been sourced by the perf test script,
_overriding_ the environment variable specified in the way illustrated
above.

Since perf tests are not run as part of the build, this most likely
unintended behavior was not caught and certainly not fixed, as the
`GIT_PERF_*` values would have been empty at build-time.

However, in 4638e88 (Makefile: use common template for
GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
introduced: Whereas before, a couple of build-time options (the
`GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
when their values were non-empty. With this commit, they are also
written when they are empty.

The consequence is that above-mentioned way to run the perf tests will
not only fail to pick up the desired `GIT_PERF_*` settings when they
were specified differently while building Git, instead the desired
settings will be only respected when specified _while building_ Git.

Let's work around the original issue, i.e. let `GIT_PERF_*` environment
variables override what is recorded in `GIT-BUILD-OPTIONS`.

Note that this is just the tip of the iceberg, there are a couple of
`GIT_TEST_*` options that may want a similar fix in `test-lib.sh`. Due
to time constraints on my side, this here patch focuses exclusively on
the `GIT_PERF_*` settings.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho dscho self-assigned this Apr 4, 2025
@dscho
Copy link
Member Author

dscho commented Apr 4, 2025

/submit

Copy link

gitgitgadget bot commented Apr 4, 2025

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1900/dscho/support-ad-hoc-git-perf-settings-again-v1

To fetch this version to local tag pr-1900/dscho/support-ad-hoc-git-perf-settings-again-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1900/dscho/support-ad-hoc-git-perf-settings-again-v1

Copy link

gitgitgadget bot commented Apr 4, 2025

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 4/4/2025 6:56 AM, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <[email protected]>
> 
> A common way to run Git's performance benchmarks on repositories other
> than Git's own repository (which is not exactly large when compared to
> actually large repositories) is to run them like this:
> 
> 	GIT_PERF_LARGE_REPO=/path/to/my/large/repo \
> 	./p1234-*.sh -ivx
> 

This issue also extends to other necessary variables such as
GIT_PERF_REPEAT_COUNT.  
> +# GIT-BUILD-OPTIONS, sourced by test-lib.sh, overwrites the `GIT_PERF_*`
> +# values that are set by the user (if any). Let's stash them away as
> +# `eval`-able assignments.
> +git_perf_settings="$(env |
> +	sed -n "/^GIT_PERF_/{
> +		# escape all single-quotes in the value
> +		s/'/'\\\\''/g
> +		# turn this into an eval-able assignment
> +		s/^\\([^=]*=\\)\\(.*\\)/\\1'\\2'/p
> +	}")"
> +
>  . ../test-lib.sh
> +eval "$git_perf_settings"
I verified this fix in my local environment. Thanks so much for digging
in and finding the solution here!

-Stolee

Copy link

gitgitgadget bot commented Apr 4, 2025

User Derrick Stolee <[email protected]> has been added to the cc: list.

Copy link

gitgitgadget bot commented Apr 19, 2025

On the Git mailing list, Jeff King wrote (reply to this):

On Fri, Apr 04, 2025 at 10:56:07AM +0000, Johannes Schindelin via GitGitGadget wrote:

> However, in 4638e8806e3a (Makefile: use common template for
> GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
> introduced: Whereas before, a couple of build-time options (the
> `GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
> when their values were non-empty. With this commit, they are also
> written when they are empty.

It doesn't look like Junio picked this up, so I wanted to chime in that
this regression bit me today, too (specifically for GIT_PERF_LARGE_REPO,
but also another variable which I'll detail in a moment).

This is also the same issue that hit the interop suite discussed in:

  https://lore.kernel.org/git/[email protected]/

(there I was a bit dismissive of it, because I think GIT_*_MAKE_OPTS
would usually be set at build time, but for these other variables,
they're almost always going to come from the environment).

> Let's work around the original issue, i.e. let `GIT_PERF_*` environment
> variables override what is recorded in `GIT-BUILD-OPTIONS`.

I like this direction. It not only fixes the regression, but makes
things generally behave more as I'd expect them to.

I think it doesn't quite fix everything, though. I noticed that
GIT_PERF_REPEAT_COUNT no longer correctly defaults to "3" when using the
"run" script. And there are two problems here:

  1. The "run" script itself sources GIT-BUILD-OPTIONS, so it would need
     similar treatment.

  2. But even if we did, that, in my case I am not setting
     PERF_REPEAT_COUNT at all. So there is no local env variable that
     we'd use to take precedence. The problem is in the way the "run"
     script assigns defaults. If it sees an environment variable set
     (whether actually from the environment or from GIT-BUILD-OPTIONS)
     it accepts it as-is, rather than installing its fallback. So it
     will never use its default of "3", and instead retain the blank
     string (which, by a stroke of luck, does still cause it to run each
     trial at least once).

So I think we either need to rewrite the "run" script's fallback code,
or teach the GIT-BUILD-OPTIONS writer to avoid mentioning unset
variables (which is the real source of the problem in 4638e8806e3a).

So...

> Note that this is just the tip of the iceberg, there are a couple of
> `GIT_TEST_*` options that may want a similar fix in `test-lib.sh`. Due
> to time constraints on my side, this here patch focuses exclusively on
> the `GIT_PERF_*` settings.

...yes, this is definitely the tip of the iceberg. I don't mind doing
this patch as an incremental step forward (and because it is an
improvement in behavior even if 4638e8806e3a were reverted). But the
issue is far from solved overall.

> +# GIT-BUILD-OPTIONS, sourced by test-lib.sh, overwrites the `GIT_PERF_*`
> +# values that are set by the user (if any). Let's stash them away as
> +# `eval`-able assignments.
> +git_perf_settings="$(env |
> +	sed -n "/^GIT_PERF_/{
> +		# escape all single-quotes in the value
> +		s/'/'\\\\''/g
> +		# turn this into an eval-able assignment
> +		s/^\\([^=]*=\\)\\(.*\\)/\\1'\\2'/p
> +	}")"

The implementation here looks correct to me, including the quoting. The
number of backslashes is a bit vomit-inducing, but I think unavoidable
(we could put the sed command into single-quotes, but then you'd have to
escape out of it to show the single-quotes you do need to mention).

-Peff

Copy link

gitgitgadget bot commented Apr 19, 2025

User Jeff King <[email protected]> has been added to the cc: list.

Copy link

gitgitgadget bot commented Apr 20, 2025

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <[email protected]> writes:

> On Fri, Apr 04, 2025 at 10:56:07AM +0000, Johannes Schindelin via GitGitGadget wrote:
>
>> However, in 4638e8806e3a (Makefile: use common template for
>> GIT-BUILD-OPTIONS, 2024-12-06), a subtle change of behavior was
>> introduced: Whereas before, a couple of build-time options (the
>> `GIT_PERF_*` ones included) were written to `GIT-BUILD-OPTIONS` only
>> when their values were non-empty. With this commit, they are also
>> written when they are empty.
>
> It doesn't look like Junio picked this up, so I wanted to chime in that
> this regression bit me today, too (specifically for GIT_PERF_LARGE_REPO,
> but also another variable which I'll detail in a moment).

This was lost in the cracks.  Thanks for bringing it back to our
attention.  I think what happened was that I saw whack-a-mole aspect
of the root cause, which makes this "the tip of the iceberg", and
felt it was more sensible to wait before a real solution, like ...

> So I think we either need to rewrite the "run" script's fallback code,
> or teach the GIT-BUILD-OPTIONS writer to avoid mentioning unset
> variables (which is the real source of the problem in 4638e8806e3a).

... this was raised.  And then I completely forgot about the topic,
as nothing happened since then.

> ...yes, this is definitely the tip of the iceberg. I don't mind doing
> this patch as an incremental step forward (and because it is an
> improvement in behavior even if 4638e8806e3a were reverted). But the
> issue is far from solved overall.

I do not mind it as an incremental band-aid.

Thanks, all.

Copy link

gitgitgadget bot commented Apr 20, 2025

This patch series was integrated into seen via git@d2bb7b0.

@gitgitgadget gitgitgadget bot added the seen label Apr 20, 2025
Copy link

gitgitgadget bot commented Apr 22, 2025

This branch is now known as js/git-perf-env-override.

Copy link

gitgitgadget bot commented Apr 22, 2025

This patch series was integrated into seen via git@0dc08fb.

Copy link

gitgitgadget bot commented Apr 22, 2025

On the Git mailing list, Jeff King wrote (reply to this):

On Sun, Apr 20, 2025 at 02:12:34PM -0700, Junio C Hamano wrote:

> > So I think we either need to rewrite the "run" script's fallback code,
> > or teach the GIT-BUILD-OPTIONS writer to avoid mentioning unset
> > variables (which is the real source of the problem in 4638e8806e3a).
> 
> ... this was raised.  And then I completely forgot about the topic,
> as nothing happened since then.
> 
> > ...yes, this is definitely the tip of the iceberg. I don't mind doing
> > this patch as an incremental step forward (and because it is an
> > improvement in behavior even if 4638e8806e3a were reverted). But the
> > issue is far from solved overall.
> 
> I do not mind it as an incremental band-aid.

Looks like this is in 'jch' now, but there's a mis-merge in e8cf2b99cd
(Merge branch 'ps/meson-build-perf-bench' into jch, 2025-04-21).

The original patch in this thread did something like:

  git_perf_settings=$(...pull GIT_PERF_* from env...)
   . ../test-lib.sh
  eval "$git_perf_settings"

That is, we stash away the environment, then load test-lib.sh, which
overwrites the environment, and then we restore (some of) the original
values.

In that merge, the test-lib.sh inclusion is moved (and in fact is now
accompanied by an explicit inclusion of GIT-BUILD-OPTIONS), and we now
have:

   . "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
   . "$GIT_SOURCE_DIR"/t/test-lib.sh
   git_perf_settings=$(...)
   eval "$git_perf_settings"

Which of course does nothing. We need to set $git_perf_settings before
those other source lines (and the eval must remain after them).

-Peff

Copy link

gitgitgadget bot commented Apr 22, 2025

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <[email protected]> writes:

> The original patch in this thread did something like:
>
>   git_perf_settings=$(...pull GIT_PERF_* from env...)
>    . ../test-lib.sh
>   eval "$git_perf_settings"
>
> That is, we stash away the environment, then load test-lib.sh, which
> overwrites the environment, and then we restore (some of) the original
> values.
>
> In that merge, the test-lib.sh inclusion is moved (and in fact is now
> accompanied by an explicit inclusion of GIT-BUILD-OPTIONS), and we now
> have:
>
>    . "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
>    . "$GIT_SOURCE_DIR"/t/test-lib.sh
>    git_perf_settings=$(...)
>    eval "$git_perf_settings"
>
> Which of course does nothing. We need to set $git_perf_settings before
> those other source lines (and the eval must remain after them).

Right.  Thanks.

Copy link

gitgitgadget bot commented Apr 22, 2025

This patch series was integrated into seen via git@6deccf7.

Copy link

gitgitgadget bot commented Apr 22, 2025

This patch series was integrated into next via git@77ea361.

@gitgitgadget gitgitgadget bot added the next label Apr 22, 2025
Copy link

gitgitgadget bot commented Apr 23, 2025

There was a status update in the "New Topics" section about the branch js/git-perf-env-override on the Git mailing list:

Developer support fix..

Will merge to 'master'.
source: <[email protected]>

Copy link

gitgitgadget bot commented Apr 23, 2025

This patch series was integrated into seen via git@15a9198.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant