Don't assume the only error in getObject is an ArgumentError #47

Drvi · 2023-12-11T17:20:06Z

There are two main codepaths for get -- we either do a multipart download for larger files or a single GET call for smaller ones. When we give the method a preallocated buffer, we also constrain the maximum size of the object we want to download (anything bigger than the buffer should fail). When the object is larger than the buffer, the error handling is different for the two codepaths:

multipart download starts with a single HEAD call to learn the size of the object, and once we know it, we can compare it with the provided buffer, and throw an ArgumentError early.
For small files, we don't do the extra HEAD request, but we let HTTP.jl handle the case internally in the single GET call and if the buffer turned out to be too small, we'd get a RequestError wrapping and ArgumentError back from HTTP.jl.

This means that depending on the size of the input buffer, we'd sometimes get an ArgumentError and sometimes a RequestError both communicating the same underlying problem -- the input buffer is too small.

My current understanding is that to make these exception types more predictable, we always threw an additional ArgumentError in the small file case, assuming that buffer size mismatch was the only possible source of error at the call site, but we've seen that other kinds of errors are possible and that the ArgumentError was hiding the actual problem in those cases.

I also think that we used to do a HEAD request even for small files in the past and when we stopped doing that for performance reasons, we forgot that the HEAD was shielding the GET from various errors like 403, 404 and so on.

In this PR, we no longer throw an additional ArgumentError in the small file case, instead we inspect the RequestError and if it contains a single ArgumentError we rethrow that. This is done because it's backward compatible -- we wanted to throw an ArgumentError on buffer size mismatch before so we'll continue to do so. I'm not sure if this is the right call though, and would appreciate some feedback/discussion on that.

codecov · 2023-12-11T17:24:09Z

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (0229c4c) 84.21% compared to head (6b52c0b) 84.32%.

Files	Patch %	Lines
src/get.jl	92.30%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #47      +/-   ##
==========================================
+ Coverage   84.21%   84.32%   +0.11%     
==========================================
  Files           7        7              
  Lines         646      657      +11     
==========================================
+ Hits          544      554      +10     
- Misses        102      103       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

src/get.jl

quinnj

Yes, this makes sense to me. I'm in favor of throwing the same argument error; I also think it would be good to match the error messages, but dont' feel as strongly that we need to if it's easier not to.

Drvi · 2023-12-12T15:55:02Z

Thanks @quinnj! Would you prefer for the two error paths to report
a) What HTTP returns "Unable to grow response stream IOBuffer $(sizeof(out)) large enough for response body size: $(sizeof(in))"
or
b) What we currently do for multipart downloads "out ($(sizeof(out))) must at least be of length $(sizeof(in))"
?
Changing from a) to b) here would require some pattern matching on the message in the ArgumentError we intercepted from HTTP.jl, which I'm not the biggest fan of.

quinnj · 2023-12-12T16:38:38Z

Yeah, I think going w/ a is fine and easier, right?

Don't assume the only error in getObject is an ArgumentError

6f3d9b1

Drvi requested review from quinnj and nickrobinson251 December 11, 2023 17:27

Drvi commented Dec 11, 2023

View reviewed changes

src/get.jl Outdated Show resolved Hide resolved

quinnj approved these changes Dec 11, 2023

View reviewed changes

Use consistent error messages

6b52c0b

Drvi merged commit 45d303a into main Dec 13, 2023

Drvi deleted the td-tweak-error-handling branch December 13, 2023 13:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't assume the only error in getObject is an ArgumentError #47

Don't assume the only error in getObject is an ArgumentError #47

Drvi commented Dec 11, 2023 •

edited

Loading

codecov bot commented Dec 11, 2023 •

edited

Loading

quinnj left a comment

Drvi commented Dec 12, 2023

quinnj commented Dec 12, 2023

Don't assume the only error in getObject is an ArgumentError #47

Don't assume the only error in getObject is an ArgumentError #47

Conversation

Drvi commented Dec 11, 2023 • edited Loading

codecov bot commented Dec 11, 2023 • edited Loading

Codecov Report

quinnj left a comment

Choose a reason for hiding this comment

Drvi commented Dec 12, 2023

quinnj commented Dec 12, 2023

Drvi commented Dec 11, 2023 •

edited

Loading

codecov bot commented Dec 11, 2023 •

edited

Loading