-
Notifications
You must be signed in to change notification settings - Fork 29
add Wit interfaces #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Luke, thanks for the PR! This looks great to me!
// `wasi:http/outgoing-handler` into a single `wasi:http/handler` interface | ||
// that takes a `request` parameter and returns a `response` result. | ||
// | ||
default interface incoming-handler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to split this out into separate PRs? I feel like it's going to be a heavy lift to do both client and server implementations in a single PR.
Of course, if it's ok to only have a partial implementation, that works too.
This looks great for a start. I had a couple of questions: The first is that is it ok to do a partial implementation of the spec when we start adding it to wasmtime? If not I worry that this is so comprehensive it will be a large change to get landed. The second is that I'm expecting this API will adapt and change as we (and others) gain experience with it. Thus my review is "is it good enough?" not "is it perfect?" I hope that works. |
@brendandburns Thanks for the review and good questions.
I think it's pretty normal for Wasmtime to land implementations incrementally and so adding a partial implementation makes sense to me, but I'd be interested to hear from @pchickey or @alexcrichton. It might be useful to open an early issue to discuss the rough shape of the end-state design, though, since I think by now we have a few distinct Wasmtime embeddings that are pretty interested in reusing a common shared HTTP implementation.
Yes, that is totally the appropriate level of review at this early stage. Iterating on the design in response to implementation feedback is an essential and expected part of both the core wasm and WASI phases process. |
Yes landing incrementally in Wasmtime is ok. The main threshold for landing an experimental feature in Wasmtime is "it doesn't affect anything else or otherwise has consensus about how it should be designed". The consensus part is nontrivial to acquire but I suspect an implementation here wouldn't affect other pieces so consensus shouldn't be necessary. |
Any chance that we can do a merge and iterate here? Looking through the comments it feels like the major questions (e.g. timeouts) have been resolved, and while there remain minor adjustments, this is the kind of thing that we can adapt as we implement (and indeed watching users actually use a specific implementation may help guide the design) |
Yeah, I was just thinking the same thing; I'll merge later today if there are no objections. |
|
||
/// Describes messages indicating serious errors. | ||
error, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably not the right PR, but should we have a higher log level (e.g. critical
for messages proceeding panic, etc.)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's right, wasi-logging would be the repo to file that under.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed WebAssembly/wasi-logging#9 to track this.
wit/types.wit
Outdated
// standard Contents. With Preview3, all of these fields can be replaced by a | ||
// single type definition: | ||
// | ||
// type body = stream<u8, option<trailers>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailers
shouldn't be part of the body
, they are a separate thing (think: trailing headers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially I was going to simply have a method returning a (pseudo) future<trailers>
(as we discussed a while back), but then I realized that this introduces a tricky question: if I poll_oneoff
the trailers without concurrently consuming the body stream, does that (1) deadlock (b/c we're not consuming the body, there is backpressure, so we never get the trailers), (2) implicitly discard/ignore the body (so that we can get to the trailers)? Putting the trailers as the "final value" of the stream avoids this design problem. But agreed that "body
" is a misnomer if the type includes trailers. What do you think of this alternative naming that still keeps the same API shape, but avoids calling the composite "body"?
request: outgoing-request, | ||
options: option<request-options> | ||
) -> future-incoming-response | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find those pairings (incoming-request
and outgoing-response
, and outgoing-request
and incoming-response
) very non-intuitive and a bit redundant.
Could we consider the same prefix on the same connection, e.g. downstream-request
and downstream-response
, and upstream-request
and upstream-response
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC 9110 suggests inbound
and outbound
as the preferred terms here, and defines upstream
and downstream
differently than you are using them: https://www.rfc-editor.org/rfc/rfc9110#name-intermediaries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but the popular implementations (e.g. NGINX and Envoy) disagree with the RFC about upstream
and downstream
, and I'm not aware of anything that actually uses the RFC terms.
I considered suggesting inbound
and outbound
, but it's yet another term, and it's actually used in service mesh environment to describe traffic direction in a global sense, and not from the perspective of a local endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our experience at fastly was that upstream
and downstream
were so confusing internally, and to our customers, that we stopped using those terms throughout our api and docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What did you end up replacing it with? I'm not a fan of downstream
/ upstream
myself (but mostly because the RFCs disagree with implementations), so I'm fine with other terms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that upstream/downstream is extremely confusing in practice (as evidenced by my own misuse of it below). I saw incoming/outgoing in one of the other proposals and it seemed more clear when you think of it from the component author's perspective. That being said, note that all incoming-
/outgoing-
prefixes would go away in Preview3 when we get to merge all these resources/interfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most programming languages have standardized on request/response without incoming/outgoing b/c the context is pretty obvious from whether you are implementing a server or making a client call and programmers don't really get confused about that.
I think we should delete incoming/outgoing or downstream/upstream and just stick to request/response with the context providing the direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is definitely the goal with Preview 3. However, in a Preview 2 timeframe, where we want to do concurrent streaming of requests/responses and have blocking poll_oneoff
, the concrete signatures of the request/response resources have to be different depending on whether you are reading from them or writing to them, so we unfortunately can't wait and infer when the request or response hits a call boundary. I do believe it's possible to implement source-language libraries which are direction-agnostic (e.g., JS Response
) in terms of these these directional resource types, though, so that this is abstracted from the developer (e.g., when constructing a Response
, the impl would internally call new-outgoing-response
...). Avoiding this extra work and complexity is one of the main motivations for adding future
/stream
(and the stack-switching required to implement these in the underlying implementation).
wit/proxy.wit
Outdated
// This is the default handler to use when user code simply wants to make an | ||
// HTTP request (e.g., via `fetch()`) but doesn't otherwise specify a | ||
// particular handler. | ||
import default-upstream-HTTP: pkg.outgoing-handler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTTP world is already overloaded with terms regarding direction, with upstream
/downstream
often meaning opposite things in different documents and software. Could we avoid using multiple terms describing the same thing in this spec? Maybe let's stick to upstream
/downstream
, so this should be changed to:
import default-upstream-HTTP: pkg.upstream-handler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, my mistake mixing and then inverting terminologies. Fixed
protocol-error(string), | ||
status-error(u16), | ||
unexpected-error(string) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of weird enum, since it mixes status codes with non-HTTP errors. What's the intent here?
Also, the list is pretty limited. Perhaps we should include errors from the Proxy-Status
RFC (i.e. https://www.iana.org/assignments/http-proxy-status/http-proxy-status.xhtml)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question; I mostly just lifted this from one of the other proposals. Aligning with HTTP semantically-defined errors sounds generally good, but don't we already get that from the status-code
? Perhaps this is something best hashed out in a separate issue. For now, I've added a TODO to the type definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should get rid of status-error
in general. From the client perspective, if HTTP communication worked, that's not an error, obviously the client surfaces the received status code up to the caller, but it's up to the caller to decide if it is an error.
A classic example of this is whether 404
is an error or not. It really depends on the calling context, and if you force a programmer to do something special when they actually expect a 404
then it becomes problematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #5 for the more general question.
wit/types.wit
Outdated
incoming-request-scheme: func(request: incoming-request) -> option<scheme> | ||
incoming-request-authority: func(request: incoming-request) -> string | ||
incoming-request-headers: func(request: incoming-request) -> headers | ||
incoming-request-consume: func(request: incoming-request) -> result<incoming-body> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The body is optional, and it's absent in 90%++ of requests.
However, it seems that this API assumes it's always present (presumably, a missing body is represented as an empty stream with EOF, but I don't think that's a good choice), and it requires calls to consume
and finish
to get trailers
, which adds a lot of unnecessary overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense. So would a solution be to have consume
return a result
containing a 3-case variant such as:
variant {
body(incoming-stream),
trailers(trailers),
none
}
(and similarly for write
, but outgoing)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #6 to consider further.
Thanks for the great feedback Piotr, you're mostly who I was hoping to hear from before, so I'll merge once we resolve these new comments. |
@lukewagner so I started to implement this, and I discovered two things:
This code won't compile: #include "proxy.h"
#include <stdio.h>
int main() {
default_outgoing_http_outgoing_request_t req;
default_outgoing_http_request_options_t opts;
default_outgoing_http_future_incoming_response_t res;
res = default_outgoing_http_handle(req, &opts);
printf("FOO: %d\n", res);
} But this code does: #include "proxy.h"
#include <stdio.h>
void http_handle(uint32_t arg, uint32_t arg0) {
}
int main() {
default_outgoing_http_outgoing_request_t req;
default_outgoing_http_request_options_t opts;
default_outgoing_http_future_incoming_response_t res;
res = default_outgoing_http_handle(req, &opts);
printf("FOO: %d\n", res);
} That seems problematic.
Thoughts on how to proceed would be useful/interesting. Thanks! |
@brendandburns can you try using the wasmtime = { workspace = true, features = ["component-model"] }
I want to hear @lukewagner's opinions on this but it seems to me that this is a good motivation for "optional imports and exports" I've been hearing. |
@Mossaka the existing code paths for the It's possible that there is some name mangling or something going on, but it's hard to debug. I'm happy to share the PR with the implementation if that's useful. |
@lukewagner As I've coded against this more, I'm wondering about whether we really want to store the request and headers options on the host side, rather than just serializing it across as part of the request. As this is currently written, you cross the guest/host boundary a minimum of 3 times:
This doesn't seem very ideal to me as it involves marshalling three different pointers across the boundary. It seems cleaner to just keep everything as strings, and only martial Thoughts? |
@brendandburns – If you want to look at a couple of example implementations of other proposals (i.e.,
~not sure if this helps, but figured I'd drop it here just in case 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this LGTM after Piotr's comments are addressed~ would be good to merge it and further iterate w/ subsequent PRs 👍
@danbugs I get that I could build my own binary that uses the component linker, but I don't really want to do that. I want this to be lit up in the main It's not good to have lots of fragmented binaries that each implement different wasi specs, we need to centralize them in a single runtime. |
(and just to be clear, I'm a firm +1 on LGTM/merge + iterate) |
Alright, I'll merge and then file issues for all the remaining issues here and of course everyone else should feel free to do likewise. |
And just to follow up on @brendandburns's question above: I think @Mossaka is right that this is related to the issue of optional exports mentioned here. The summary is: before too long, we should be able to mark imports and exports as That being said, the reason we made
This world could be included in the wasi-http proposal alongside the If anyone's interested, I'm happy to create a new issue to dig into this direction more. |
I have begun an implementation here: bytecodealliance/wasmtime#5929 It is massively incomplete and also hacky, but it (partially) works for some simple examples. |
This PR proposes some initial Wit interfaces and a
wasi:http/proxy
world using them that's mostly a synthesis of a lot of the existing proposals. As the proposal develops, I expect us to add new "bigger" worlds thanwasi:http/proxy
world (either here or in other proposals), but thiswasi:http/proxy
world seems like perhaps a good minimal concrete starting point. There's still a lot more to fill in to make this a proper WASI proposal, but perhaps this is enough for us to chew on now. Lastly, I still consider this very much in a draft state so happy to iterate with folks in this PR.(cc @PiotrSikora @brendandburns @eduardomourar)