Hang in PreClose method during dup2 system call #605
base: openj9
d863fa7 to 34cdd7b
@pshipton @keithc-ca Can you please review this PR?
Please update the original source rather than copying it to the closed directory and modifying it.
7bf6291 to bbc3c5d
- nd.preClose(fd);
  NativeThread.signal(th);
+ nd.preClose(fd);
I think we need a much more detailed description of why the original order is wrong and why it should apply to all "unix" platforms. Further, the same order (preClose then signal) also exists in each of these files:
src/java.base/share/classes/sun/nio/ch/DatagramChannelImpl.java
src/java.base/share/classes/sun/nio/ch/ServerSocketChannelImpl.java
src/java.base/share/classes/sun/nio/ch/SocketChannelImpl.java
src/java.base/unix/classes/sun/nio/ch/SinkChannelImpl.java
src/jdk.sctp/unix/classes/sun/nio/ch/sctp/SctpChannelImpl.java
src/jdk.sctp/unix/classes/sun/nio/ch/sctp/SctpMultiChannelImpl.java
src/jdk.sctp/unix/classes/sun/nio/ch/sctp/SctpServerChannelImpl.java
Please either correct each of those instances or explain why they don't need similar treatment.
Ok, I think you've convinced me that reordering those calls makes sense. Other threads that might retry their read or write operation will fail because the channel has been closed (notice assert !isOpen(); on line 823).
However, you haven't explained why the same pattern that exists in the classes I listed above doesn't need correcting.
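The reordering discussed above can be sketched with a small stand-in model. This is not the sun.nio.ch code: the class CloseOrderSketch and its methods are hypothetical, signal() and preClose() only record that they were called, and the assumption that the channel is already marked closed before either step runs is taken from the assert !isOpen(); remark.

```java
import java.nio.channels.ClosedChannelException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a channel's close path; not the real JDK classes.
final class CloseOrderSketch {
    private final List<String> trace = new ArrayList<>();
    private volatile boolean open = true;

    boolean isOpen() { return open; }

    List<String> trace() { return trace; }

    // Stand-in for NativeThread.signal(th): wake a thread blocked in I/O.
    private void signal() { trace.add("signal"); }

    // Stand-in for nd.preClose(fd): the dup2-based step reported to hang.
    private void preClose() { trace.add("preClose"); }

    // Proposed order: the channel is already closed, then signal, then preClose.
    void close() {
        open = false; // assumption: closed state is set before this sequence
        signal();
        preClose();
    }

    // A woken thread that retries its operation sees the closed state and fails.
    int retryRead() throws ClosedChannelException {
        if (!isOpen()) {
            throw new ClosedChannelException();
        }
        return 0;
    }

    public static void main(String[] args) {
        CloseOrderSketch ch = new CloseOrderSketch();
        ch.close();
        System.out.println("call order: " + ch.trace());
        try {
            ch.retryRead();
        } catch (ClosedChannelException e) {
            System.out.println("retry failed: " + e.getClass().getSimpleName());
        }
    }
}
```

The point of the model is only the ordering argument: because the closed flag is set first, a thread woken by the signal cannot re-enter a blocking read or write before preClose runs.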
@keithc-ca The changes are applicable to the files you have listed. I have made the changes to those files.
If a read or write happens during the fcntl or dup2 system calls, the preClose method hangs. The hang is not seen if we signal/kill the thread first and then call the preClose method.
Signed-off-by: Shruthi.Shruthi1 <[email protected]>
bbc3c5d to 20217ef
@keithc-ca This is a known issue on AIX (and z/OS) since Java 6 and the solution mentioned above has been adopted in IBM Java since then. We wanted to take this solution to OpenJDK, but there we require a Java reproducer test. Unfortunately, we haven't been able to reproduce the issue using a simple Java test case. This has been reported by an IBM customer who is able to reproduce the issue consistently on an AIX machine, but only with the IBM Integration Server running with the Terracotta server. Without a simple Java reproducer, OpenJDK is unlikely to accept a bug report and hence a PR. So, we want to propose this solution to the extensions repo.
@keithc-ca Can you please review the changes made to the files you have listed?
I think we need to tread carefully here. I think we need to consider whether the change here will make the problem reported in eclipse-openj9/openj9#12582 more likely. @pshipton, do you have any thoughts or opinions on that?
We can run grinders on the test with this change. Rather than assuming we won't get anything from OpenJDK, we should take this to OpenJDK even without a testcase. Even if we can't get the change made at OpenJDK, we should try, and we may get some useful information. I'm also thinking that if the problem only affects AIX and z/OS, the changes should be restricted to those platforms rather than modifying other platforms and potentially causing a regression or change in behavior there.
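One way the platform restriction suggested here could look is a check on the os.name system property. This is a minimal sketch: the class PlatformCloseOrder and the method signalBeforePreClose are hypothetical names, and the matching assumes os.name reports "AIX" and "z/OS" on those platforms. In the JDK source tree a platform-specific source directory might be the more natural mechanism; the runtime check is only for illustration.

```java
import java.util.Locale;

// Hypothetical helper; not part of the PR or of the JDK.
final class PlatformCloseOrder {
    // Returns true when the signal-then-preClose order should be used,
    // assuming os.name contains "AIX" or "z/OS" on the affected platforms.
    static boolean signalBeforePreClose(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        return os.contains("aix") || os.contains("z/os");
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name", "");
        System.out.println(os + ": signal before preClose = " + signalBeforePreClose(os));
    }
}
```

With a guard like this, other platforms would keep the existing preClose-then-signal order, which addresses the regression concern raised in the comment.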