Support long marked offset in SlicedInputStream #113629
Otherwise, a reset() might return to an unintended offset within a big slice. Relates ES-9639
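To make the failure mode concrete, here is a minimal sketch. It assumes, based only on the PR title, that the marked offset used to be held in an int; the class and method names are hypothetical and are not taken from the Elasticsearch code.

```java
// Hypothetical illustration, not the Elasticsearch code: shows how a mark
// offset beyond Integer.MAX_VALUE changes value if it is narrowed to an int.
final class MarkOffsetDemo {
    // Narrowing a long to an int keeps only the low 32 bits.
    static int narrowedOffset(long offsetInSlice) {
        return (int) offsetInSlice;
    }

    public static void main(String[] args) {
        long offsetInSlice = 3L * 1024 * 1024 * 1024; // 3 GiB into a big slice
        System.out.println("long offset: " + offsetInSlice);
        System.out.println("int offset:  " + narrowedOffset(offsetInSlice));
    }
}
```

Under that assumption, a reset() that relied on the narrowed value would not return to the marked position, which is why a long field is the safer choice for big slices.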
Pinging @elastic/es-distributed (Team:Distributed)
Fix looks good, I left some suggestions regarding the test.
Looks good, left a couple comments in addition to David's.
Thanks for the feedback @DaveCTurner, @henningandersen! Handled, feel free to review again.
server/src/main/java/org/elasticsearch/index/snapshots/blobstore/SlicedInputStream.java
if (currentStream == null) {
    // In case EOF has been reached, we set the currentStream to a non-null value so that nextStream() does not complain.
    currentStream = InputStream.nullInputStream();
}
Do we really need to do this dance just to make nextStream() happy? It seems as if we could just set currentStream to the right slice directly?
Your comment got me thinking, and I found another, less intrusive way: extend the assertion in nextStream(). Feel free to review!
}
final long skipped = stream.skip(remaining);
currentSliceOffset += skipped;
if (skipped == 0) {
Hmm, skipped == 0 doesn't necessarily mean we reached end-of-stream. If we haven't quite skipped enough bytes we should try to read() one more byte and only move to the next stream if that returns -1.
I think we also shouldn't keep retrying the skip() if it falls short (unless it reached the end of the stream). It's up to the caller to check the return value and retry if needed.
Ah good point! Added a read() to ensure we're reaching stream EOF.
> I think we also shouldn't keep retrying the skip() if it falls short (unless it reached the end of the stream). It's up to the caller to check the return value and retry if needed.
I'm split on this. Certainly the caller should retry, but I also see implementations with loops. E.g., the default JDK implementation loops over read() instead of making a single call to read(). I also feel it's "safer" in case the caller forgets the loop.
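For reference, the caller-side retry discussed here could look roughly like the sketch below. The SkipFully class and its skipFully method are hypothetical helpers written for this illustration; they are not part of the PR. Since Java 12, InputStream.skipNBytes offers comparable "skip exactly n bytes or fail" semantics.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical caller-side helper, not code from the PR.
final class SkipFully {
    // Skips exactly n bytes or throws EOFException; tolerates skip() returning short counts.
    static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and read() confirms end of stream.
                throw new EOFException("stream ended with " + remaining + " bytes left to skip");
            } else {
                remaining--; // the probing read() consumed one byte
            }
        }
    }
}
```

A caller that loops like this works correctly whether or not the underlying skip() retries internally, which is the trade-off being weighed in this thread.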
Yeah, in practice most impls of skip do try to skip all the way; it just seems that if one doesn't, it might be useful to propagate that to the caller.
I don't think we want to check skipped == 0 at all tbh: that involves an extra iteration at the end of each stream. Can we just check for EOF if skipped < remaining?
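The suggestion can be sketched as follows. This is an illustration under assumptions, not the code that was merged: SliceSkipSketch and skipAcross are hypothetical names, and the slices are modelled as a plain iterator of streams. The point is that the EOF probe via read() runs only when a skip falls short, so a slice that is skipped in full costs no extra iteration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;

// Hypothetical sketch of skipping across consecutive slices; not the Elasticsearch class.
final class SliceSkipSketch {
    // Skips up to n bytes across the given slices and returns the number actually skipped.
    static long skipAcross(Iterator<InputStream> slices, long n) throws IOException {
        long remaining = n;
        InputStream current = slices.hasNext() ? slices.next() : null;
        while (remaining > 0 && current != null) {
            long skipped = current.skip(remaining);
            remaining -= skipped;
            if (remaining > 0) {
                // The skip fell short: probe one byte to tell EOF apart from a short skip.
                if (current.read() == -1) {
                    current = slices.hasNext() ? slices.next() : null; // move to the next slice
                } else {
                    remaining--; // the probe consumed one byte of the current slice
                }
            }
        }
        return n - remaining;
    }
}
```

With two slices of 3 and 4 bytes, skipping 5 bytes crosses the slice boundary once, and skipping 10 bytes stops at 7 because all slices are exhausted.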
Yup, totally makes it better, done!
Thanks @DaveCTurner! Feel free to review again.
Looks good. Just a few more comments (besides David's).
Thanks for the feedback @henningandersen, @DaveCTurner! Feel free to review again.
LGTM (one question but it doesn't especially matter either way)
    // the skip is performed on the marked slice and no other slices are involved. This may help uncover any bugs.
    currentStream.skipNBytes(markedSliceOffset);
} else {
    currentStream = null;
Can this else branch happen? Isn't markedSlice < numSlices an invariant?
It can happen if mark() is called after reaching EOF. I added a test and made sure it works.
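The mark-after-EOF case can be reproduced with a simplified stand-in. MiniSlicedStream below is a hypothetical class written for this illustration; it only borrows the idea of a marked slice index plus a long offset and makes no claim about how SlicedInputStream is actually structured. It uses InputStream.skipNBytes, so it assumes Java 12 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical, simplified stand-in for a sliced stream; not the Elasticsearch class.
final class MiniSlicedStream extends InputStream {
    private final byte[][] slices;
    private int slice = 0;          // index of the slice being read; equals slices.length at EOF
    private long offsetInSlice = 0; // bytes consumed from the current slice
    private InputStream current;    // null once EOF has been reached
    private int markedSlice = 0;
    private long markedOffset = 0;  // a long, so offsets in big slices are preserved

    MiniSlicedStream(byte[]... slices) {
        this.slices = slices;
        this.current = slices.length > 0 ? new ByteArrayInputStream(slices[0]) : null;
    }

    @Override
    public int read() throws IOException {
        while (current != null) {
            int b = current.read();
            if (b != -1) {
                offsetInSlice++;
                return b;
            }
            // Current slice exhausted: move to the next one, or to EOF.
            slice++;
            offsetInSlice = 0;
            current = slice < slices.length ? new ByteArrayInputStream(slices[slice]) : null;
        }
        return -1;
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public void mark(int readLimit) {
        markedSlice = slice; // equals slices.length when marking after EOF
        markedOffset = offsetInSlice;
    }

    @Override
    public void reset() throws IOException {
        slice = markedSlice;
        offsetInSlice = markedOffset;
        if (markedSlice < slices.length) {
            // Reopen the marked slice and skip to the marked offset within it.
            current = new ByteArrayInputStream(slices[markedSlice]);
            current.skipNBytes(markedOffset);
        } else {
            current = null; // the mark was taken at EOF, so stay at EOF
        }
    }
}
```

In this sketch, reading every byte, calling mark(), and then calling reset() leaves the stream at EOF, which is the situation the else branch handles.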
LGTM.
CI failure is #113694, retrying.
💚 Backport successful
Else, a reset might reset to an unintended offset in a big slice. Also add support for skip. Relates ES-9639