Conversation
write() unconditionally called netWriteData.clear() after a non-blocking flushData() that may not have flushed all encrypted bytes; discarding unflushed TLS records, corrupting the encrypted stream and causing 'Broken pipe' or 'Connection reset' errors on subsequent writes (most commonly seen during 'gem push' over https of large artifacts) additionally, sysread needs to also flush pending writes before reading since after write_nonblock, encrypted bytes could remain unsent; without flushing first the server would never receive the complete request body (e.g. net/http POST), causing it to time out or reset
`readAndUnwrap()` called doHandshake(blocking) assuming `exception = true` When a post-handshake TLS event (e.g. TLS 1.3 NewSessionTicket) triggered handshake during a non-blocking read, waitSelect raised SSLErrorWaitReadable instead of returning :wait_readable.
- writeNonblockDataIntegrity: approximates the gem push scenario (#242) large payload via write_nonblock loop, then read server's byte count response, assert data integrity (no bytes lost) - writeNonblockNetWriteDataState: saturates TCP buffer, then accesses the package-private netWriteData field directly to verify buffer consistency after the compact() fix
headius
left a comment
There was a problem hiding this comment.
The batch of fixes looks pretty good after a first review pass. The tests are heinous but you know that. I suppose in the interest of getting fixes landed we can go with them for now but they shouldn't stay like this for long. It's very hard to follow what they're testing, or if they're actually testing what they should be.
| private void doWrap(final boolean blocking) throws IOException { | ||
| netWriteData.clear(); | ||
| SSLEngineResult result = engine.wrap(dummy, netWriteData); | ||
| SSLEngineResult result = engine.wrap(EMPTY_DATA.duplicate(), netWriteData); |
There was a problem hiding this comment.
Is it necessary to duplicate EMPTY_DATA? It is already zero-length and marked as read-only.
There was a problem hiding this comment.
shouldn't be, except engine.wrap in theory does set limit although likely not on an empty buffer.
| netWriteData.clear(); | ||
| try { | ||
| engine.wrap(dummy, netWriteData); // send close (after sslEngine.closeOutbound) | ||
| engine.wrap(EMPTY_DATA.duplicate(), netWriteData); // send close (after sslEngine.closeOutbound) |
| totalSent += ((RubyInteger) written).getLongValue(); | ||
| } catch (RaiseException e) { | ||
| if ("OpenSSL::SSL::SSLErrorWaitWritable".equals(e.getException().getMetaClass().getName())) { | ||
| System.out.println("syswrite_nonblock expected: " + e.getMessage()); |
There was a problem hiding this comment.
Shouldn't these be asserts or fails, so they show up as the cause of failure? stdout printlns can get lost in test harnesses.
There was a problem hiding this comment.
this part is mostly debug messages, the other branch fails.
they're useful esp. since a lot of time these exceptions do not have a trace.
| } catch (RaiseException e) { | ||
| String errorName = e.getException().getMetaClass().getName(); | ||
| if ("EOFError".equals(errorName) || "IOError".equals(errorName)) { // client closes connection | ||
| System.out.println("server-reader expected: " + e.getMessage()); |
There was a problem hiding this comment.
Again seems weird to have printlns in a test, even if it's running in a separate thread.
| if ("EOFError".equals(errorName) || "IOError".equals(errorName)) { // client closes connection | ||
| System.out.println("server-reader expected: " + e.getMessage()); | ||
| } else { | ||
| System.err.println("server-reader unexpected: " + e.getMessage()); |
another attempt at (minimal) fixing non-blocking write + read.
similar to #351 but wout test madness (tests here are still not great but we can revisit those later)
includes #347