Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't assume the only error in getObject is an ArgumentError #47

Merged
merged 2 commits into from
Dec 13, 2023

Conversation

Drvi
Copy link
Member

@Drvi Drvi commented Dec 11, 2023

There are two main codepaths for get -- we either do a multipart download for larger files or a single GET call for smaller ones. When we give the method a preallocated buffer, we also constrain the maximum size of the object we want to download (anything bigger than the buffer should fail). When the object is larger than the buffer, the error handling is different for the two codepaths:

  1. multipart download starts with a single HEAD call to learn the size of the object, and once we know it, we can compare it with the provided buffer, and throw an ArgumentError early.

  2. For small files, we don't do the extra HEAD request, but we let HTTP.jl handle the case internally in the single GET call and if the buffer turned out to be too small, we'd get a RequestError wrapping and ArgumentError back from HTTP.jl.

This means that depending on the size of the input buffer, we'd sometimes get an ArgumentError and sometimes a RequestError both communicating the same underlying problem -- the input buffer is too small.

My current understanding is that to make these exception types more predictable, we always threw an additional ArgumentError in the small file case, assuming that buffer size mismatch was the only possible source of error at the call site, but we've seen that other kinds of errors are possible and that the ArgumentError was hiding the actual problem in those cases.

I also think that we used to do a HEAD request even for small files in the past and when we stopped doing that for performance reasons, we forgot that the HEAD was shielding the GET from various errors like 403, 404 and so on.

In this PR, we no longer throw an additional ArgumentError in the small file case, instead we inspect the RequestError and if it contains a single ArgumentError we rethrow that. This is done because it's backward compatible -- we wanted to throw an ArgumentError on buffer size mismatch before so we'll continue to do so. I'm not sure if this is the right call though, and would appreciate some feedback/discussion on that.

Copy link

codecov bot commented Dec 11, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (0229c4c) 84.21% compared to head (6b52c0b) 84.32%.

Files Patch % Lines
src/get.jl 92.30% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #47      +/-   ##
==========================================
+ Coverage   84.21%   84.32%   +0.11%     
==========================================
  Files           7        7              
  Lines         646      657      +11     
==========================================
+ Hits          544      554      +10     
- Misses        102      103       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

src/get.jl Outdated Show resolved Hide resolved
Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this makes sense to me. I'm in favor of throwing the same argument error; I also think it would be good to match the error messages, but dont' feel as strongly that we need to if it's easier not to.

@Drvi
Copy link
Member Author

Drvi commented Dec 12, 2023

Thanks @quinnj! Would you prefer for the two error paths to report
a) What HTTP returns "Unable to grow response stream IOBuffer $(sizeof(out)) large enough for response body size: $(sizeof(in))"
or
b) What we currently do for multipart downloads "out ($(sizeof(out))) must at least be of length $(sizeof(in))"
?
Changing from a) to b) here would require some pattern matching on the message in the ArgumentError we intercepted from HTTP.jl, which I'm not the biggest fan of.

@quinnj
Copy link
Member

quinnj commented Dec 12, 2023

Yeah, I think going w/ a is fine and easier, right?

@Drvi Drvi merged commit 45d303a into main Dec 13, 2023
7 checks passed
@Drvi Drvi deleted the td-tweak-error-handling branch December 13, 2023 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants