-
-
Notifications
You must be signed in to change notification settings - Fork 990
stackalloc in separate method #2374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@timcassell I do not fully understand this fix and why the original code leads to #2366 (since memory randomization is disabled by default). Could you please provide additional explanations for this problem? @adamsitnik could you please take a look? |
Since the benchmark disassembled code was identical between net6 and net7, and I got sporadic measurements with memory randomization, I had a hunch that was the cause. So I moved it to a separate method and tested again, and the measurements were fixed on my machine. I disassembled the [Edit] My guess is the [Edit2] It is not the |
I wrote a benchmark to compare the difference between master and this PR. It doesn't explain why it makes the result match expected for net7 compared to net6, but it does show that execution is more stable.
code
public unsafe class Benchmarks
{
private readonly Random random = new Random(12345);
private readonly Consumer consumer = new Consumer();
private readonly Func<object> func;
public bool randomizeMemory;
public Benchmarks() => func = DefaultClass;
public object DefaultClass() => default;
[Benchmark]
public void RandomStackInline()
{
Span<byte> stackMemory = randomizeMemory ? stackalloc byte[random.Next(32)] : Span<byte>.Empty;
consumer.Consume(func());
Consume(stackMemory);
}
[MethodImpl(MethodImplOptions.NoInlining)]
private void Consume(in Span<byte> _) { }
[Benchmark]
public void RandomStackNoInline()
{
if (randomizeMemory)
{
ExecuteWithRandomStack();
}
else
{
Execute();
}
}
[MethodImpl(MethodImplOptions.NoInlining)]
private void ExecuteWithRandomStack()
{
byte* stackMemory = stackalloc byte[random.Next(32)];
consumer.Consume(func());
Consume(stackMemory);
}
[MethodImpl(MethodImplOptions.NoInlining)]
private void Execute()
{
consumer.Consume(func());
}
[MethodImpl(MethodImplOptions.NoInlining)]
private void Consume(byte* _) { }
} The reason must be a combination of the other code in |
Let's wait for a review by @adamsitnik |
@AndreyAkinshin Can we merge this? It's very low risk and removes potential unknown side-effects in the common case, even though we don't fully understand those side effects. I'm thinking a future follow-up will be to move the stackalloc into the source generator after the refactor in #2111 that moves the clock into the generated code (so we avoid the branch altogether). P.S. I wonder if that side effect could be contributing to those measurements you obtained in #2334. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was unable to repro the issue (see the details below), but the change is very simple and it should definitely not hurt anything.
WithGlobalSetup.DefaultClass: Job-CYKZFQ(LaunchCount=9)
Runtime = .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2; GC = Concurrent Workstation
Mean = 0.000 ns, StdErr = 0.000 ns (NaN%), N = 130, StdDev = 0.000 ns
Min = 0.000 ns, Q1 = 0.000 ns, Median = 0.000 ns, Q3 = 0.000 ns, Max = 0.000 ns
IQR = 0.000 ns, LowerFence = 0.000 ns, UpperFence = 0.000 ns
ConfidenceInterval = [0.000 ns; 0.000 ns] (CI 99.9%), Margin = 0.000 ns (NaN% of Mean)
Skewness = NaN, Kurtosis = NaN, MValue = 2
-------------------- Histogram --------------------
[-0.500 ns ; 0.500 ns) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
---------------------------------------------------
WithoutGlobalSetup.DefaultClass: Job-CYKZFQ(LaunchCount=9)
Runtime = .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2; GC = Concurrent Workstation
Mean = 0.000 ns, StdErr = 0.000 ns (NaN%), N = 132, StdDev = 0.000 ns
Min = 0.000 ns, Q1 = 0.000 ns, Median = 0.000 ns, Q3 = 0.000 ns, Max = 0.000 ns
IQR = 0.000 ns, LowerFence = 0.000 ns, UpperFence = 0.000 ns
ConfidenceInterval = [0.000 ns; 0.000 ns] (CI 99.9%), Margin = 0.000 ns (NaN% of Mean)
Skewness = NaN, Kurtosis = NaN, MValue = 2
-------------------- Histogram --------------------
[-0.500 ns ; 0.500 ns) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
---------------------------------------------------
// * Summary *
BenchmarkDotNet v0.13.13-develop (2024-03-25), Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK 9.0.100-alpha.1.23531.2
[Host] : .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2
Job-CYKZFQ : .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2
LaunchCount=9
Type | Method | Mean | Error |
---|---|---|---|
WithGlobalSetup | DefaultClass | 0.0 ns | 0.0 ns |
WithoutGlobalSetup | DefaultClass | 0.0 ns | 0.0 ns |
// * Warnings *
ZeroMeasurement
WithGlobalSetup.DefaultClass: LaunchCount=9 -> The method duration is indistinguishable from the empty method duration
WithoutGlobalSetup.DefaultClass: LaunchCount=9 -> The method duration is indistinguishable from the empty method duration
Fixes #2366