Skip to content

Read from response body is very slow #2549

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tqwewe opened this issue May 13, 2021 · 10 comments
Closed

Read from response body is very slow #2549

tqwewe opened this issue May 13, 2021 · 10 comments

Comments

@tqwewe
Copy link

tqwewe commented May 13, 2021

I've written the following code in Hyper as a test:

// Start timer
let start = Instant::now();

// Make request
let client = Client::new();
let uri = "http://registry.yarnpkg.com/react".parse()?;
let mut resp = client.get(uri).await?;

// Print time elapsed
println!("{} ms", start.elapsed().as_millis());

And running it over and over I get:

70 ms
58 ms
72 ms
64 ms
102 ms
78 ms
AVERAGE: 74 ms

I then add to the bottom of the Make request section this one line:

while let Some(_chunk) = resp.body_mut().data().await { }

And my results come out as:

667 ms
578 ms
509 ms
437 ms
494 ms
433 ms
AVERAGE: 520 ms

For some reason reading from the response body takes on average 446 ms. Running the same tests in Python, it shows that this is way slower than it should be.

I've tested a bunch of HTTP libraries, and only chttp seems to be lightning fast as expected (like Python or any other HTTP client in another language). All the rest are slow, possibly because they are built on top of Hyper and CHTTP uses curl-sys.

There must be some kind of bug with reading the body of a HTTP request. I'm not sure why else it could be so slow.

@tqwewe tqwewe changed the title Read from response body is crazy slow Read from response body is very slow May 13, 2021
@seanmonstar
Copy link
Member

Interesting. Could you include a few more details? What version of hyper is used? Are you compiling with --release? What are morre characteristics of the body from the server?

@tqwewe
Copy link
Author

tqwewe commented May 13, 2021

I am running with --release. And the program just sends 2 requests in main. I manually ran each version 3 times and calculated the average manually.

My Hyper version is 0.14.7.

I am on Mac, but Windows users also experience the issue (a friend tested for me).
In my example I am making the request from registry.yarnpkg.com which returns a JSON body, but this issue persists no matter the host it seems. Tested with yarn registry, NPM registry, google.com.

My full main.rs file is here: https://gist.github.com/Acidic9/2af00500f51b2237fbf23591142a06ab

Making the request to http://www.google.com.au shows:
cargo run --release 39.76s user 0.72s system 603% cpu 6.708 total
cargo run --release 0.13s user 0.04s system 40% cpu 0.405 total
cargo run --release 0.13s user 0.04s system 42% cpu 0.378 total

In comparison, curl:
curl http://www.google.com.au 0.00s user 0.00s system 5% cpu 0.184 total
curl http://www.google.com.au 0.00s user 0.00s system 4% cpu 0.202 total
curl http://www.google.com.au 0.01s user 0.01s system 5% cpu 0.195 total

Using time cli command.
Also note that curl is printing the body, whereas my main function is only making a single request and not even printing to stdout.

I think the following issues may be related:
#2135 Slow reading of small chunks
#1813 Request bodies are processed slower over HTTP2

@seanmonstar
Copy link
Member

seanmonstar commented May 13, 2021

I wonder if there's something else related. So, I've been using hyper's examples/client, which does the same thing curl is doing.

Here's what I got:

$ time ./target/release/examples/client http://www.google.com.au >/dev/null
real 0m0.074s user 0m0.003s sys 0m0.005s

$ time ./target/release/examples/client http://www.google.com.au >/dev/null
real 0m0.075s user 0m0.010s sys 0m0.002s

$ time ./target/release/examples/client http://www.google.com.au >/dev/null
real 0m0.070s user 0m0.006s sys 0m0.001s

And:

$ time curl -s http://www.google.com.au >/dev/null
real 0m0.076s user 0m0.007s sys 0m0.007s

$ time curl -s http://www.google.com.au >/dev/null
real 0m0.074s user 0m0.012s sys 0m0.002s

$ time curl -s http://www.google.com.au >/dev/null
real 0m0.072s user 0m0.007s sys 0m0.001s

@sfackler
Copy link
Contributor

Are you sure the 39 seconds reported by your first call wasn't from compiling the program?

@tqwewe
Copy link
Author

tqwewe commented May 13, 2021

Are you sure the 39 seconds reported by your first call wasn't from compiling the program?

Yeah it must've been, I didn't notice that. Ignore that one I guess.

@nox nox closed this as completed May 17, 2021
@tqwewe
Copy link
Author

tqwewe commented May 17, 2021

This isn't considered an issue @nox ?

@nox
Copy link
Contributor

nox commented May 17, 2021

This isn't considered an issue @nox ?

Been reading the issue too quick and thought the main issue was those 39 seconds of build time, sorry, reopening.

@nox nox reopened this May 17, 2021
@seanmonstar
Copy link
Member

So, coming back to this, when I run the examples that do the same thing, they have nearly identical timings. I notice that the example you linked includes a for loop to do the request twice, while comparing that to invoking curl which does the request once. Is that perhaps it?

@oronsh
Copy link

oronsh commented Oct 21, 2021

Same thing here, I'm wondering if this is why the reqwest crate is using get_bytes() rather then aggregate() for deserializing json.
I have a case where I have a very large response body which needs to be deserialized.
problem is that with get_bytes() memory is not recycled for some reason and aggregate() is too small and cannot handle very large payloads for me.

@seanmonstar
Copy link
Member

I believe that is unrelated to the original issue here. Since it's gone quiet, I'm going to close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants