Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix gas accounting for seal proof verification #1852

Open
3 tasks
anorth opened this issue Aug 21, 2023 · 14 comments
Open
3 tasks

Fix gas accounting for seal proof verification #1852

anorth opened this issue Aug 21, 2023 · 14 comments
Assignees
Labels
P1 P1: Must be resolved Topic: Syscalls

Comments

@anorth
Copy link
Member

anorth commented Aug 21, 2023

Upcoming changes to built-in actors to implement direct data onboarding will invoke the batch_verify_seals syscall directly from the miner actor. In the past, this syscall has been invoked only from the power actor's cron handler. It does not charge the correct amount of gas, though there's no apparent reason why it can't. This syscall should instead charge the correct amount of gas. Doing so won't have any effect on the existing use since the cron originator has unlimited gas budget.

Fixing this is necessary for the deployment of direct data onboarding. I don't think it needs a FIP since the end result is just that the syscall works as expected, with no change to existing on-chain observable behaviour. Any necessary things can be specified in the direct onboarding FIP.

  • Fix the comment in kernel/mod.rs by basically deleting it. The call is not privileged and should charge gas (reference to the pre-payment is both wrong and irrelevant here).
  • Delete the comment in kernel/default.rs
  • Fix the gas calculation in kernel/default.rs. A a minimum this might involve changing the price list item for verify_seal_base from 2000 to a real number for per-proof verification.

Changing the verify_seal_base price to 34721049 the amount hardcoded in the power actor, would then price this the same as the current cost of seal proof validation. However this price might (?) include an empirical discount that assumes some unknown number of additional proofs are submitted in the same epoch which can be verified in parallel. But a fixed price per item also obviously doesn't include a discount for the known number of proofs that actually are submitted when this is called directly from the miner.

FYI @ZenGround0 @Kubuxu

@Stebalien
Copy link
Member

Couple of things to note:

  1. The current gas price assumes 8-way parallelism (I think?), as far as I know. The gas price will have to be tiered (i.e., charge per 8 proofs, not per proof).
  2. There may be some per-invocation overhead that's not being considered. This syscall effectively spins up a worker pool to process proofs in parallel.
  3. We'll need to very carefully audit this function.

@Kubuxu
Copy link

Kubuxu commented Aug 30, 2023

There are two things in the current price, these is empirical discount from parallelism, on the other hand there were some improvements to proof verification performance since then.

There is also the debate whether the parallelism should be discounted because unoccupied threads are useful for processing alternative tipsets.

@arajasek
Copy link
Contributor

arajasek commented Sep 8, 2023

I've benchmarked proof verification calls using code in #1882.

I propose doing away with any parallelism discounts, and simply charging for the cost of verifying the proofs. I propose we continue to verify them in parallel (assuming multiple proofs), but charging just based on the number of proofs.

Here are the results in nanoseconds:

For category 32 GiB (based on 48745 results): 
  - Average: 4048979.97 
  - Median: 4080682.00 
  - 60th Percentile: 4178855.00
For category 64 Gib (based on 30431 results): 
  - Average: 3927112.38 
  - Median: 3967278.00 
  - 60th Percentile: 4062714.00

These are obtained after stripping out all failures (bad proofs), and stripping out the top and bottom 2% of results. Neither of those operations changed the result significantly.

I propose using the 60th percentile value of the 32 GiB sector for all sector sizes, which would yield a cost of 41.78M gas per proof verification. This is only slightly more than the current cost of 34.72M gas per proof.

@Stebalien
Copy link
Member

This seems like a reasonable approach. If it becomes an issue, we can likely change it later. The core issue here is that requiring additional cores has an opportunity cost.

@arajasek
Copy link
Contributor

arajasek commented Sep 8, 2023

Yup. I'm also generally thinking that it'll be easier to lower the cost than increase it moving forward.

@jennijuju
Copy link
Member

I can't verify the numbers, however, I agree with the choices towards no-parallization assumption approach.

@Kubuxu
Copy link

Kubuxu commented Sep 9, 2023

These are obtained after stripping out all failures (bad proofs)

Were there any high-side outliers in case of bad proofs?

I propose doing away with any parallelism discounts, and simply charging for the cost of verifying the proofs

I agree with the approach as there is cost to parallelism that is hard to estimate and capture.

@arajasek
Copy link
Contributor

Were there any high-side outliers in case of bad proofs?

No, there were only three bad proofs, and they were all 43M gas.

@arajasek
Copy link
Contributor

Final proposal: 42M gas for both 32 GiB and 64 GiB sectors.

@irenegia
Copy link

irenegia commented Jan 8, 2025

@Stebalien is the 42M gas units both for 32GiB and 64GiB sectors still a good number?
(ie, can we go with that for the implementation?)

cc @Kubuxu @rvagg

@Kubuxu
Copy link

Kubuxu commented Jan 8, 2025

32GiB is more expensive than 64GiB and looking at #1852 (comment) 42M would cover that. So yes.

@rvagg
Copy link
Member

rvagg commented Jan 9, 2025

In case this is helpful to anyone else, this adjusts lotus-shed msg on current lotus master (0e4857a11b) to act as if this fee were properly applied:

diff --git a/cmd/lotus-shed/msg.go b/cmd/lotus-shed/msg.go
index 4de7789a8..c78b0b919 100644
--- a/cmd/lotus-shed/msg.go
+++ b/cmd/lotus-shed/msg.go
@@ -79,6 +79,21 @@ var msgCmd = &cli.Command{
                                return xerrors.Errorf("replay call failed: %w", err)
                        }
 
+                       var fixSealPrice func(trace types.ExecutionTrace)
+                       fixSealPrice = func(trace types.ExecutionTrace) {
+                               for i := range trace.GasCharges {
+                                       if trace.GasCharges[i].Name == "OnVerifySeal" && trace.GasCharges[i].ComputeGas == 2000 {
+                                               // should be 42M
+                                               trace.GasCharges[i].ComputeGas = 42_000_000
+                                               trace.GasCharges[i].TotalGas += 42_000_000 - 2000
+                                       }
+                               }
+                               for i := range trace.Subcalls {
+                                       fixSealPrice(trace.Subcalls[i])
+                               }
+                       }
+                       fixSealPrice(res.ExecutionTrace)
+
                        if cctx.Bool("exec-trace") {
                                // Print the execution trace
                                color.Green("Execution trace:")

@rvagg
Copy link
Member

rvagg commented Jan 10, 2025

@Stebalien @Kubuxu can you explain the calculations in #1852 (comment) that lead to 42M gas

  • first, do we need to be using standardised hardware to run these benchmarks, is there a standard suite of hardware we're basing these things off?
  • or: is benchmarking just a way of gut-feel measuring the scale of gas charges against each other so if we want to tweak these things in future we'd just be measuring them against other values?
  • is there a direct translation from nanoseconds to gas? there seems to be a 10x multiplication going on here to get to 42M, is it as simple as that for a straightforward execution like proof verificaiton?

@Stebalien
Copy link
Member

The direct translation is 10gas/nanosecond. In terms of standardization, I often ran the benchmark on both my laptop and the hetzner machine I had at the time and generally picked the slower one. But we don't have any "standard hardware" (and, TBH, my laptop usually won the race due to having better single-core performance).

In my experience, the difference between machines wasn't too terrible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 P1: Must be resolved Topic: Syscalls
Projects
Status: Todo
Status: In Progress
Development

Successfully merging a pull request may close this issue.

7 participants