Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed WIT for tracing portion of wasi-observe #15

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

calebschoepp
Copy link
Collaborator

This PR proposes WIT for what I think the tracing portion of wasi-observe should look like. Initial discussion of high level approaches was had here and this WIT is the outcome of that. I have about 50% of a working host implementation for this WIT in Spin so I'm pretty confident this is a workable approach implementation-wise.

Here are some of the design principles I followed while writing this WIT:

  • Closely follow the OpenTelemetry tracing specification to ease compatibility with that ecosystem.
  • But, not perfectly. I don't think wasi-observe should be 100% coupled to the OpenTelemetry ecosystem. For example here are just a few of the many things we're not following.
    • "There MUST NOT be any API for creating a Span other than with a Tracer."
    • "[The new span API] MUST NOT accept a Span or SpanContext as parent, only a full Context."
  • Liberally use option<T> rather than providing multiple versions of methods e.g. having start-implicit: static func(name: string) and start-with-options: static func(name: string, options: start-options) and ...

I'm happy to discuss the validity of these design principles and whether we want to take a different approach. They are not givens.

wit/traces.wit Outdated
/// Signals that the operation described by this span has now ended.
///
/// If a timestamp is not provided then it is treated equivalent to passing the current time.
end: func(timestamp: option<datetime>);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is dropping the resource equivalent to end(none)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. That should probably be documented. I need to do some more thinking about the lifetimes of resources. B/c if we're forced to end a span when a resource is dropped then we might be breaking some of the OTel semantics.

Copy link

@jcchavezs jcchavezs Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have a method called "abandon" instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what you'd want to have abandon be an alternative to?

wit/traces.wit Outdated
}

/// A key-value pair describing an attribute.
record key-value {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
record key-value {
record attribute {

🤷

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also feel 🤷 on this. The Rust and Go OTel implementations call this KeyValue. But, the spec talks about it as an attribute. I think all things considered I would lean to sticking with key-value. Especially given this line of the spec:

An Attribute is a key-value pair

wit/traces.wit Outdated
/// Represents a unit of work or operation.
resource span {
/// Starts a new span with the given name, parent, and options.
start: static func(name: string, parent: span-parent, options: option<start-options>) -> span;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If parent is "almost always" implicit it could maybe move inside options? 🤷

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

wit/traces.wit Outdated
Comment on lines 139 to 143
/// The trace that this `span-context` belongs to.
type trace-id = tuple<u64, u64>;

/// The id of this `span-context`.
type span-id = u64;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the disadvantages of storing these types in u64's is that we need to be clear whether we're using little or big endian encodings. Other options here would be:

  • tuple<u8, u8, u8, u8, u8, u8, u8, u8,> — Pretty gross type.
  • list<u8> — Can't statically guarantee that it will contain the right number of bytes.
  • Use a fixed length array — Doesn't yet exist in WIT.
  • string (hexadecimal encoding) — Maybe this is a better option?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer it to be a string as well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a fixed length array — Doesn't yet exist in WIT.

Might be worth a look at WebAssembly/component-model#384 — it sounds like fixed-length arrays may exist soon?

wit/traces.wit Outdated
/// Whether this `span-context` was propagated from a remote parent.
is-remote: bool,
/// The `trace-state` for this `span-context`.
trace-state: string,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't trace-state should be of type trace-state that you've defined in line 152?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

wit/traces.wit Outdated
/// `link`'s for the new `span`.
links: option<list<link>>,
/// When the `span` should begin. If this is not provided it defaults to the current time.
timestamp: option<datetime>,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other places: I know that open telemetry has setting time as an option, but should we consider not supporting it? A side benefit would be that this wit wouldn't have a dependency on wasi::clocks...

I am curious to learn from others in the community about use cases and their opinions on whether this flexibility is needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally view the dependency on wasi::clocks as not a very large downside and think that it is worth supporting setting these timestamps. But, I'm also curious on what the sentiment of the community is.

wit/traces.wit Outdated
/// Represents a unit of work or operation.
resource span {
/// Starts a new span with the given name, parent, and options.
start: static func(name: string, parent: span-parent, options: option<start-options>) -> span;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

wit/traces.wit Outdated
add-link: func(link: link);

/// Override the default `span` status, which is unset.
set-status: func(status: status);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider having status as an optional field in start-options and renaming this to update-status

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea with status is that 90% of the time you should leave it unset. In the cases where you do set it, it can only be done later in the operation when you know its outcome. Source.

wit/traces.wit Outdated
Comment on lines 139 to 143
/// The trace that this `span-context` belongs to.
type trace-id = tuple<u64, u64>;

/// The id of this `span-context`.
type span-id = u64;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer it to be a string as well.

wit/traces.wit Outdated
/// The `span`'s parent is the current active span. The current active span is the most recently created and non-closed span. If no spans have been started in the guest this may be a span in the host.
implicit,
/// The `span` is a root span and should have no parent.
is-root,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to discuss the implications of making the span-parent settable at start. Here is the way I understand it

  1. Implicit means that the span will appear in a logical, expected place as a child of the previously executing span.
  2. is-root will detach this span from the previous one and will likely be the parent of a net new trace tree?
  3. span-context will make this a child of a different span, possibly belong to a different trace tree.

As of now, I find myself being the most uncomfortable about 3, as I'm unable to think of a usecase where this will apply.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're understanding of the three options is correct. I largely designed it this way to follow the spec. However, I'm also starting to think that supporting options 2 and 3 is a mistake (at least for the first version of this WIT).

With option 2 I can sort of imagine a scenario where a user would want it. For example maybe they're running an app in Wasm that pulls work items off a queue. They might want the spans of pulling the work item to be part of the host trace — but then have the spans for actually doing the work be their own root span and thus separate trace.

With option 3 I can't imagine a way that it would be useful, but I can imagine lots and lots of ways that it could break things.

Overall I'm becoming skeptical that options 2 and 3 make sense in the context of a guest. We can always iterate so I'd prefer to start with something simple and would advocate we just remove the choice altogether and just always make it implicit.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for option 2 an example was a scenario that I'd encountered previously, in which it was convenient to have the initiator of a set of tasks and the work done to fulfill per task be in separate traces. So, it has some utility.

That said, I am onboard with keeping the wit simple by only supporting implicit for now and evolving from there.

@calebschoepp
Copy link
Collaborator Author

@chrisdickinson @brooksmtownsend wondering if either of you have had a chance to look at this.

- Rename traces interface to tracer
- Change some types
- Move start to the interface rather than being a constructor
- Remove abillity to specify arbitrary span context as parent

Signed-off-by: Caleb Schoepp <[email protected]>
}

/// Configuration for starting a `span`.
record start-options {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this ought to be a resource to provide better forward compatibility for adding options.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you have fields on a resource or would all the data be stored in the host table?

Copy link

@lann lann Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No fields for resources; it would all be stored by the resource owner, which would typically be the host.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brooksmtownsend what do you think about this?

}

/// The `status` of a `span`.
variant status {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could arguably represent this as an option<result<_, string>> 🤷

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for type status = option<result<_, string>>;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. One of the things I like about the variant is it's option correspond directly with the OTel terms that have specific semantic meanings.

Do y'all still think it is worth it using the more WIT native type status = option<result<_, string>>;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly about this one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @calebschoepp here, I like the direct tie to the OTEL meaning

Comment on lines +104 to +106
%float64(float64),
/// A signed 64 bit integer value.
%s64(s64),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd maybe name these float and int 🤷

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is it still float64 or f64 ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using float64 is just an artifact of Spin being on an old wit-bindgen version.

@@ -0,0 +1,151 @@
interface tracer {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you throw some documentation on this interface?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about just calling this trace? wasi:observe/trace.start feels nice. Don't want to bikeshed on the name too hard at this stage so feel free to just ignore

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had it named something very similar to that — traces.wit.

I changed it to tracer.wit on the premise that the interface itself is acting as the Tracer. In the future if/when we support WIT templating then templating would act as the [TracerProvider](https://opentelemetry.io/docs/specs/otel/trace/api/#tracerprovider).

I imagine that we would also have a similar pattern for meter.wit and logger.wit.

Copy link
Collaborator

@brooksmtownsend brooksmtownsend left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @calebschoepp, fantastic job here! I really like the direct ties to the traces page from the otel docs. Feels like a great approach.

I had mostly nits and a couple of comments that I'm interested in and happy to talk sync in the next meeting too.

(p.s. thank you for the patience on the review here, I lost this in my inbox)

Comment on lines +88 to +89
/// The attribute name.
key: key,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I generally prefer not using simple type aliases in WIT since it's just another mental lookup, thoughts on just making this a string?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same thoughts later about trace-id / span-id etc etc but just to keep the comment spread simple I'll raise that here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key being type aliased was very much a result of me closely mirroring the OTel spec. Thinking about it some more I agree that we should just remove this alias and have key-value defined as:

/// A key-value pair describing an attribute.
record key-value {
    /// The attribute name.
    key: string,
    /// The attribute value.
    value: value,
}

As for trace-id / span-id, good catch. I also think that those should no longer be type aliased. Earlier they were some weird data structure like tuple<u64, u64> which made an alias more relevant.

/// Identifying trace information about a span that can be serialized and propagated.
record span-context {
    /// The `trace-id` for this `span-context`.
    trace-id: string,
    /// The `span-id` for this `span-context`.
    span-id: string,
...
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good! Looks like I was missing a bit of history but glad it's landing in a simpler place 😄

/// If a key already exists for an attribute of the Span it will be overwritten with the corresponding new value.
set-attributes: func(attributes: list<key-value>);

/// Adds an event with the provided name at the curent timestamp.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
/// Adds an event with the provided name at the curent timestamp.
/// Adds an event with the provided name at the current timestamp.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little surprised this made it past my VS Code spelling extension

wit/tracer.wit Show resolved Hide resolved
wit/tracer.wit Show resolved Hide resolved
}

/// The `status` of a `span`.
variant status {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @calebschoepp here, I like the direct tie to the OTEL meaning

@@ -0,0 +1,151 @@
interface tracer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about just calling this trace? wasi:observe/trace.start feels nice. Don't want to bikeshed on the name too hard at this stage so feel free to just ignore

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants