Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable tracing in the execution layer #21165

Merged
merged 14 commits into from
Feb 13, 2025
Merged

Enable tracing in the execution layer #21165

merged 14 commits into from
Feb 13, 2025

Conversation

awelc
Copy link
Contributor

@awelc awelc commented Feb 10, 2025

Description

The main idea here is that we have a client (e.g., the upcoming replay tool) at some point calling the execute_transaction_to_effects from the Executor trait defined in sui/sui-execution/src/executor.rs to execute a PTB. We want execution of a PTB to be traced, the trace "returned" to the client which is responsible for delivering this trace to the user (e.g., by storing in to disk). In other words, we want the executor to be responsible for execution and tracing itself, and the client to be responsible for all the remaining handling of the generated trace (e.g., I/O operations). There are arguably many ways to accomplish this and this PR proposes one of them - passing optional MoveTraceBuilder as an additional argument to execute_transaction_to_effects.

Ultimately, we want execution of the entire PTB to be traced, but in this PR we only trace execution of Move calls. Tracing of the remaining PTB commands will be added at a later time via external TraceEvents, without further changing the interface being modified in this PR.

MoveTraceBuilder creates an instance of MoveTrace, which contains a list of trace events including TraceEvent::External. This type of event can represent an arbitrary value, including arguments and return values of PTB commands.

@awelc awelc self-assigned this Feb 10, 2025
Copy link

vercel bot commented Feb 10, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 13, 2025 0:55am
2 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Feb 13, 2025 0:55am
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Feb 13, 2025 0:55am

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way it's piped through the executor seems fine, but the two main themes to my questions/comments are:

  • Maybe it's worth plumbing it through to all the special case places we may want to support tracing + replay in now when the compiler is directing us to what needs to change, rather than later when we'll need to figure out which code path to switch up the None to a Some(_) in.
  • There's something funky going on, lifetime-wise, with the trace builder that's probably worth cleaning up -- it should be possible to pass it through to wherever it's needed without having to repeatedly unwrap it, pass it down and then rewrap it.

// This function is executed in a loop which makes passing `Option<&mut MoveTraceBuilder>`
// as an argument impossible. Instead, we "rebuild" a value of this type here
// too make Rust's borrow checker happy.
// TODO: Use some Rust magic to avoid this?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of things I would try here:

  • Is there a reason to use &mut Option<&mut MoveTraceBuilder> instead of just Option<&mut MoveTraceBuilder>, if not, simplifying that may help Rust decipher the lifetimes here. Perhaps the easiest of all is to pass in a &mut Option<MoveTraceBuilder> from the very top?
  • Alternatively/additionally using an explicit lifetime parameter -- &'t mut MoveTraceBuilder -- may also help Rust realise that it's merged too many lifetimes together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Is there a reason to use &mut Option<&mut MoveTraceBuilder> instead of just Option<&mut MoveTraceBuilder>

I was trying to explain it in the comment but clearly failed... This function is executed in a loop:

        for (idx, command) in commands.into_iter().enumerate() {
            let start = Instant::now();
            if let Err(err) = execute_command::<Mode>(
                &mut context,
                &mut mode_results,
                command,
                &mut trace_builder_opt,
            ) ...

With trace_builder_opt being Option<&mut MoveTraceBuilder>, the borrow checker complains about it, and cloning this value is also not working due to the Option holding a reference. It does seem like there should be a better way to handle this but I could not figure it out...

Perhaps the easiest of all is to pass in a &mut Option<MoveTraceBuilder> from the very top?

I did not think about this as I was fixated on Option<&mut MoveTraceBuilder> being needed down the line but this should work (and let us avoid the loop-related issue) with the as_mut conversion when needed.

@@ -951,6 +993,7 @@ mod checked {
vec![],
vec![],
/* is_init */ true,
None, // TODO: add tracing for init functions?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to do this to begin with?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about it from the trace-debugging perspective (it was not super clear to me how/where/when to visualize these as part of a PTB execution flow) but these traces obviously do not have to be consumed by the trace debugger only, so they should indeed be generated here (even we decide for the trace debugger to ignore them initially).

@@ -903,6 +911,7 @@ mod checked {
tx_ctx,
gas_charger,
advance_epoch_pt,
None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would anything go wrong if we added support for tracing system transactions? In the context of Replay that should be possible, and it's something that would certainly help us debug issues during reconfiguration etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not think that replay tool will be able to handle these but I guess everything that's executed from execution_loop is fair game. Will add these!

@@ -963,6 +964,7 @@ mod checked {
function_name: &IdentStr,
ty_args: Vec<Type>,
args: Vec<impl Borrow<[u8]>>,
tracer: Option<&mut MoveTraceBuilder>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the tracer be part of the ExecutionContext?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how useful it would be as I am not sure it will be used anywhere else other than runtime's execute_function_bypass_visibility function

@awelc awelc temporarily deployed to sui-typescript-aws-kms-test-env February 11, 2025 19:37 — with GitHub Actions Inactive
@awelc
Copy link
Contributor Author

awelc commented Feb 11, 2025

I think the way it's piped through the executor seems fine, but the two main themes to my questions/comments are:

  • Maybe it's worth plumbing it through to all the special case places we may want to support tracing + replay in now when the compiler is directing us to what needs to change, rather than later when we'll need to figure out which code path to switch up the None to a Some(_) in.
  • There's something funky going on, lifetime-wise, with the trace builder that's probably worth cleaning up -- it should be possible to pass it through to wherever it's needed without having to repeatedly unwrap it, pass it down and then rewrap it.

I addressed both issues:

  • added tracing parameter to all functions accessible from execution_loop
  • changed tracer parameter type from Option<&mut MoveTraceBuilder> to &mut Option<MoveTraceBuilder> which fixed the argument passing problem

Copy link
Contributor

@tzakian tzakian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM although I think I'd like to cleanup the Move deps first before landing this.

Cargo.toml Outdated
Comment on lines 588 to 590
move-trace-format = { path = "external-crates/move/crates/move-trace-format" }
move-proc-macros = { path = "external-crates/move/crates/move-proc-macros" }
enum-compat-util = { path = "external-crates/move/crates/enum-compat-util" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a "my bad" situation here -- I don't think move-trace-format needs all these different crates (or at least doesn't use them). So I think removing them from the move-trace-format crate should allow you to remove them here and just import the move-trace-format crate.

@@ -217,6 +217,12 @@ impl MoveTrace {
}
}

impl Default for MoveTrace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh clippy...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this clippy note disabled in external-crates fwiw. So we can remove this if we want

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this clippy note disabled in external-crates fwiw. So we can remove this if we want

Fwiw, I got clippy complaining about it in CI...

@@ -138,6 +138,12 @@ impl TraceState {
}
}

impl Default for TraceState {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let mut result = context
.execute_function_bypass_visibility(
let mut result = if trace_builder_opt.is_some() {
context.execute_function_bypass_visibility(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since it's using the same entrypoint this if is no longer needed here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I was looking at it without actually seeing it...

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well!

@awelc awelc merged commit fd40b0f into main Feb 13, 2025
47 checks passed
@awelc awelc deleted the aw/execution-enable-tracing branch February 13, 2025 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants