-
Notifications
You must be signed in to change notification settings - Fork 63
Add Body::poll_progress #90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@sfackler would it be possible to include a timeout example with this? Or maybe add that example in the doc comment I am curious to see how this would be used. |
We could add a |
Maybe even at least an example so tokio doesn't need to be a public dep. |
Here's a TimeoutBody implementation (untested): #[pin_project]
pub struct TimeoutBody<B> {
#[pin]
inner: B,
#[pin]
timer: Sleep,
timeout: Duration,
waiting: bool,
}
impl<B> Body for TimeoutBody<B>
where
B: Body,
{
type Data = B::Data;
type Error = TimeoutErro<B::Error>;
fn poll_frame(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Frame<Self::Data>, Self::Error>>> {
if let Poll::Ready(o) = self.as_mut().project().inner.poll_frame(cx) {
*this.waiting = false;
return Poll::Ready(o.map(|r| r.map_err(TimeoutError::Inner)));
}
self.is_healthy(cx)?;
Poll::Pending
}
fn is_healthy(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Result<(), Self::Error> {
let this = self.project();
if !*this.waiting {
this.timer.reset(Instant::now() + *this.timeout);
*this.waiting = true;
}
if this.timer.poll(cx).is_ready() {
return Err(TimeoutError::TimedOut);
}
Ok(())
}
fn is_end_stream(&self) -> bool {
self.inner.is_end_stream()
}
fn size_hint(&self) -> SizeHint {
self.inner.size_hint()
}
}
pub enum TimeoutError<E> {
Inner(E),
TimedOut,
} |
@seanmonstar thoughts on this? |
http-body/src/lib.rs
Outdated
/// `poll_frame` calls and report an error from `poll_healthy` when time expires. | ||
/// | ||
/// The default implementation returns `Ok(())`. | ||
fn poll_healthy(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Result<(), Self::Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we wanted this to be more poll-like we could have a return value of Poll<Result<Void, Self::Error>>>
.
Co-authored-by: nickelc <[email protected]>
ce40c24
to
395b18e
Compare
I've updated this to move from |
hyperium/http-body#90 proposes adding a `Body::poll_progress` method to the `Body` trait. This PR uses a fork of hyper that uses this proposed API when awaiting stream send capacity. This supports implementing timeouts on streams and connections in an unhealthy state to defend servers against resource exhaustion.
hyperium/http-body#90 proposes adding a `Body::poll_progress` method to the `Body` trait. This PR uses a fork of hyper that uses this proposed API when awaiting stream send capacity. This supports implementing timeouts on streams and connections in an unhealthy state to defend servers against resource exhaustion.
We (Linkerd) are interested in moving this proposal forward. I've been testing this with patched versions of http-body 0.3 and Hyper 0.14 and the results are promising. I'm using this to implement a middleware that enforces a progress timeout to cancel stuck streams. What does the process look like for finalizing this proposal? Is there anything I can do to help? |
Thanks for trying it out, @olix0r! I'm glad to have at least 2 use cases for something so fundamental. We can move this forward. In my prep for doing so, I went back and read the previous conversations, and also the poll_progress post. I think the problem and solution that withoutboats describes is similar, but different enough that perhaps we shouldn't use the same name/mechanism. That's because, this feature isn't describing making progress on the body separate from producing a frame. It's rather to propagate cancelation while waiting on backpressure to clear up. It feels closer to At the same time, I'm writing up a longer blog post about this feature, I'll share a draft with you soon. |
That seems like a reasonable-enough name to me, though it might be a bit strange to have |
Hm, thanks for pointing that out. It made me think through the problem a bit more I at first figured we could just make it
|
A default implementation that just returns I don't think it would indicate closure, just drive any background IO (like poll_progress does). For convenience, it allows errors to be returned directly, but an error out of poll_healthy or whatever we call it would be equivalent to that error being returned from poll_frame. |
In that sense, it is really pretty similar to Boats's poll_progress, just fallible. We could make it infallible and force the error to come out of poll_frame but that just seems like it'd make the implementation more annoying for no real benefit. |
So you find that it was needed to make background progress, too? I didn't think that was a goal of the method. Just to detect closure while a frame isn't needed, such as by polling a |
Polling a sleep is background progress IMO! :) My initial use case was purely around detecting disconnects/timeouts and things like that, but Boats's post on poll_progress made me feel like there's no reason you couldn't have some other body implementation that wanted to do some background IO. For example, if you're proxying you may want to internally pull data from the upstream connection while the downstream connection is idle. |
That's a fair point. Perhaps the naming is fine then. Or at least, worth considering if Now, I think one more remaining question is about return type. I do think a poll method should return |
I think that In practice, the only thing that callers would actually look for is the EDIT: Actually, I think we have to require that it's fused to be able to use it properly. |
Added a few bits of docs on return values, and poll_progress implementations to http-body-util combinators. |
@seanmonstar Are you comfortable with this PR? Is there anything else to consider? |
I believe generally yes. I got part way through a write-up to publish as a blog post, to explore the area more and get more feedback, since I think it's a sufficiently interesting change to some fundamental crates. I still plan to finish that up, I just had to pause as I dealt with some other contracting work. |
@seanmonstar 👋 Is there anything we can do to help move this PR forward? |
Thanks for the reminder! I've added it to my active list, I think I can get back on the write-up. Could I send a draft for review, say, next week? |
@seanmonstar I'll be camping next week but I should have some time to review after Dec 19th. Thanks! |
I'm also happy to take a look. |
@seanmonstar Hey there. Just nudging this along. We've finally crossed the bridge to hyper 1.0, so now I'm extra eager to get this through :) |
Yep, adding to my plate for this week. |
@sfackler @olix0r I've written up a draft blog post that I'd like to publish, since I feel like this change is fairly fundamental and could use as much refinement as necessary. If you'd like to review and suggest any edits, I'd appreciate it! If not, I can publish as-is: https://hackmd.io/@seanmonstar/HkHNiA5Akx |
Looks good to me! The one thing that might be worth mentioning is that in addition to being useful to propagate cancellation, it can also be used to allow the logic to do some work while Hyper's waiting to write the previous block out. For example, you could be gzip-compressing the next block. |
Hm, I'm hesitant to recommend that. As I think about it, I think that could be an anti-pattern. If the write-side is applying backpressure, then you probably don't want to do extra work producing a new frame, it might not ever be needed. Well, I suppose you could chose to do so opportunistically, with a preference for producing faster at the cost of possible extra wasted work... (Also, for anyone else looking, that draft link supports inline comments and edit suggestions as long as you're logged in. If there's no other feedback, I could publish early next week.) |
Yeah, you'd need to make sure you put a bound on the amount of background work. Definitely not necessary to talk about though! |
Alright, I did another edit pass (mostly just addressing grammar, repeat words, and slight styling). Unless anyone wants to discuss further, I'll be posting either Tuesday or Wednesday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am excited to see this proposal moving forward.
i have one small question, about whether Body
implementors can ever explicitly depend on poll_progress()
being called.
Published: https://seanmonstar.com/blog/body-poll-progress/ I'll allow for some time to receive any other feedback. |
It sounds like we could just return |
It's done purely so you can |
FWIW, that would also mean that if you did do |
True, but you have to handle the waits forever case anyway in case a body isn't using the default implementation. |
I agree with your point about backpressure - if it's not a good idea to actually make any real progress in the function, |
Coming from Reddit, I had to read @seanmonstar blog post, and withoutboats' one to really understand what On the other hand, I directly understand what select! {
dst.write(frame) => {
// continue
},
reason = body.closed() => {
tracing::debug!(?reason, "closed");
dst.abort();
return;
}
} than the one presented in the blog post, as I find it more intuitive. Also, regarding withoutboats proposal, the signature was |
Hmm ... but the stream is not closed until you get an empty frame from |
I understood the counter argument of the blog post as:
However, I'm not an expert about hyper, so I don't know if user may write this kind of side-task or not, but I would say they may not, so the counterintuitive aspect is quite negligible indeed. |
TL;DR I'd like to bikeshed the name in the direction of I've rewritten this comment a few times, and I'm still not happy that it's fully expressing my views, but I'm submitting anyway because I don't think I'll get clearer. The concern I have with calling this method I'd thus prefer to call it something like Separately, is there a guarantee that if Similarly, is it guaranteed that if I called |
That's not quite accurate - one kind of "progress" could be checking for a pending error, but it could also be used to do some amount of work in a pipelined fashion: #90 (comment).
I think that'd be unspecified in the same way it is for e.g. fallible streams. The expectation is that if it returns an error you're going to propagate that and stop polling the body.
Like above, that seems like it'd be constraining the implementation in a direction that shouldn't really matter in practice. |
This then feels like you're tying two separate things into one function:
And I'm not sure that I like having both of those bundled together into a single function; I'm on a single-core Cortex-A8 platform, and often short on CPU. Being able to abort a processing pipeline early because the thing POSTing to me isn't going to finish sending chunks of data is useful, because it saves CPU, but bringing work forward is not, since I've got other uses for my CPU core if that chunk's not needed right now.
For these, I mostly want the guarantees, or lack thereof, to be very clearly documented, the way unfused streams are very clearly documented, so that there's documentation to point to if someone write a TOCTOU mistake. Something like: """ Further, note that both Basically, enough text that someone who's making bad assumptions about what they can expect of an implementation gets an up-front warning that they're holding it wrong. |
Giving feedback based on https://seanmonstar.com/blog/body-poll-progress/, this change makes no sense at all. The operation conceptually copies data from a source to a sink until an error or EOF on the source happens. Hence, once you read some data from the source, then you MUST wait for the sink and write it to the sink, otherwise you are incorrectly losing data. Whether the source will return an error after returning data has no bearing whatsoever on the fact that that data MUST be sent to the sink, and thus you must wait for sink. There are however there reasonable proper ways to change the operation:
It looks like the change you are looking for is the third one. Hence, you should not add this absurd interface and instead add a cancellation token to the source-to-sink copy operation and change the code so that both await operations terminate if the token is signalled. To reiterate, this sort of cancellation has nothing whatsoever to do with the data source, and should thus not be a method on the data source, but rather a cancellation token parameter passed to the copy operation. |
What would the I'm not sure how that proposal would change any of the MUSTs you're mentioning above. It's just a different API to perform the same thing (cancelling the body write). |
Assuming the goal of TimeoutBody is to cancel returning an HTTP response to the client after a certain amount of time elapsed regardless of anything else, there would be no TimeoutBody, since such a behavior has nothing to do with the response body. Instead, it should be configured globally for the HTTP server, or if per-route configuration is desired, then the route handler should return a structure containing the body and per-route server response configuration parameters such as a timeout (or perhaps a more general abstract "responder" that would take care of the operation of sending the response). Obviously however using a fixed timeout, especially if on by default, seems a really poor design since it will make it impossible to return large responses that could take hours or days to download. Instead, a more reasonable design would be to either have an timeout on each write, or a required minimum transfer bytes/second rate and a time interval to start enforcing it after. |
That is not the goal of the TimeoutBody. |
As described in hyperium/hyper#3121, this allows server implementations to abort body writes even if the client is not reading data.