Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: log(LogEntry) to handle different log types #2113

Merged
merged 2 commits into from
Apr 2, 2025
Merged

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Mar 31, 2025

Just toying around here to see if I can come up with an answer to #2101 (comment); @Stebalien I'm not sure what you mean by generalising the execution log but this is my take on it.

@github-project-automation github-project-automation bot moved this to 📌 Triage in FilOz Mar 31, 2025
@rvagg rvagg force-pushed the rvagg/dump-cache branch 2 times, most recently from 504363d to d2ab544 Compare April 1, 2025 05:23
Base automatically changed from rvagg/dump-cache to master April 1, 2025 23:18
fn log(&mut self, msg: String) {
self.trace(ExecutionEvent::Log(msg))
fn log(&mut self, entry: LogEntry) {
self.trace(ExecutionEvent::Log(match entry {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we could add an entirely new event type (not just string logs). New event types aren't breaking (personally, I'd add "put" and "get" events).

The more breaking version would be to change the structure of the event log. Right now the log preserves the tree structure of the calls but not the tree structure of the syscalls, but we could also preserve the structure of the syscalls. But I don't think that's critical right now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: I'd also just add these as separate methods? Maybe?

Or maybe we can avoid adding more methods to the kernel by recording put events at the blockstore level? But then we'd need to wrap the blockstore somehow which is... probably even more annoying.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we could add an entirely new event type (not just string logs)

OK, so this would involve changing the serialisation as it travels through FFI and would break existing clients. What's the general rule we've followed here for versioning? If we did that we would have to bump some number, is a semver-minor appropriate for that? What justifies a semver-major bump in here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well:

  1. Adding new event types isn't a breaking change, users will just ignore the ones they don't understand.
  2. All FFI serialization is internal to the FFI anyways.
  3. We allow API breaking changes in the FVM in minor releases, patch releases are for releases with no breaking changes. This violates semver, but we use major versions for network upgrades. See FVM versioning and technical debt #1724 and ref-fvm version philosophy  #358.

Copy link
Member Author

@rvagg rvagg Apr 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so maybe just an internally tagged enum serialization, ["syscall","some debug logging from sys::debug::log"] and ["block_write",{"/":"bafy..."}] for what I want, then we extend the enum as far as we want with additional stuff.

Thoughts on what other log items might be useful in execution traces?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't think hard about the serialization format. Well, that'll be an issue in the Lotus API but the execution trace has a completely different structure there.

I was thinking:

diff --git a/fvm/src/trace/mod.rs b/fvm/src/trace/mod.rs
index 370828b1..32fbd045 100644
--- a/fvm/src/trace/mod.rs
+++ b/fvm/src/trace/mod.rs
@@ -1,3 +1,4 @@
+use cid::Cid;
 // Copyright 2021-2023 Protocol Labs
 // SPDX-License-Identifier: Apache-2.0, MIT
 use fvm_ipld_encoding::ipld_block::IpldBlock;
@@ -39,4 +40,15 @@ pub enum ExecutionEvent {
         state: ActorState,
     },
     Log(String),
+    Ipld {
+        op: IpldOperation,
+        cid: Cid,
+        size: u64,
+    },
+}
+
+#[derive(Clone, Debug)]
+pub enum IpldOperation {
+    Get, // open
+    Put, // link
 }

@rvagg rvagg force-pushed the rvagg/log-types branch from 114e5ba to 04e3f62 Compare April 2, 2025 05:05
@rvagg
Copy link
Member Author

rvagg commented Apr 2, 2025

OK, introduced a trace_ipld() on the CallManager that adds a new ExecutionEvent::Ipld, for both reads and writes. I wrestled for far too long to try and do it at the Blockstore level by introducing a wrapper that could record on reads and writes but I just couldn't reconcile the lifetimes and borrowing and all that guff so I gave up. There's a preload() call in here that I also track and it uses a Blockstore that gets passed in, but I'm going with the fairly-safe assumption that we're only going to try and load a single block, if any. We don't know whether we had a cache hit or not but I record it anyway. I'm not sure whether this block load will end up being useful or not but they're raw blocks anyway so shouldn't be too hard to filter out if we cared.

@rvagg rvagg marked this pull request as ready for review April 2, 2025 05:21
@rvagg rvagg force-pushed the rvagg/log-types branch from 04e3f62 to 13badcb Compare April 2, 2025 05:23
@github-project-automation github-project-automation bot moved this from 📌 Triage to ✔️ Approved by reviewer in FilOz Apr 2, 2025
@Stebalien
Copy link
Member

OK, introduced a trace_ipld() on the CallManager that adds a new ExecutionEvent::Ipld, for both reads and writes. I wrestled for far too long to try and do it at the Blockstore level by introducing a wrapper that could record on reads and writes but I just couldn't reconcile the lifetimes and borrowing and all that guff so I gave up.

Yeah, rust really expects exactly one owner. I'm not surprised.

There's a preload() call in here that I also track and it uses a Blockstore that gets passed in, but I'm going with the fairly-safe assumption that we're only going to try and load a single block, if any.

We don't even use that right now. We added that for user-deployable actors but it isn't hooked up anywhere.

@Stebalien
Copy link
Member

In terms of the tests, it's dying with exit code 143 which apparently means "SIGTERM".

@Stebalien
Copy link
Member

The hidden error only visible in the raw logs is:

2025-04-02T05:27:02.6034191Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.

Of course, this is also visible to copilot so it was able to figure out the issue immediately....

@Stebalien Stebalien merged commit 709e0d8 into master Apr 2, 2025
14 of 15 checks passed
@github-project-automation github-project-automation bot moved this from ✔️ Approved by reviewer to 🎉 Done in FilOz Apr 2, 2025
@Stebalien Stebalien deleted the rvagg/log-types branch April 2, 2025 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🎉 Done
Development

Successfully merging this pull request may close these issues.

2 participants