Some client http2 streams are closed with error code NGHTTP2_INTERNAL_ERROR after receiving GOAWAY frame
Version
v23.2.0 (also reproduces on v18.20.4, v20.17.0)
Platform
Darwin UA-KJP26G976P 23.4.0 Darwin Kernel Version 23.4.0: Wed Feb 21 21:44:54 PST 2024; root:xnu-10063.101.15~2/RELEASE_ARM64_T6030 arm64
Subsystem
http2
What steps will reproduce the bug?
Start a simple server that imitates a graceful shutdown by sending GOAWAY every N requests.
const http2 = require('http2');

const server = http2.createServer({
  peerMaxConcurrentStreams: 100000,
  settings: {
    maxConcurrentStreams: 100000,
  },
  maxSessionMemory: 200,
});

const N = 97;
let count = 0;
const maxLastStreamId = 2 ** 31 - 1;

server.on('stream', (stream) => {
  count++;
  stream.respond({
    'content-type': 'text/html',
    ':status': 200,
  });
  stream.end('Hello World ' + count);
  // Imitate a graceful shutdown: every N requests, send GOAWAY with
  // lastStreamID = 2^31 - 1 so in-flight streams may still complete.
  if (count % N === 0) {
    console.log('sending goaway frame at count:', count);
    stream.session.goaway(0, maxLastStreamId);
  }
  stream.on('error', (err) => {
    console.log('error:', err);
  });
});

server.listen(8000);
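For context, the single GOAWAY with lastStreamID = 2^31 - 1 mirrors the first step of the graceful-shutdown sequence suggested by RFC 7540 section 6.8; a server would normally follow up with a final GOAWAY carrying the id of the last stream it actually processed. A minimal sketch of the full two-step sequence, assuming a one-second grace period and session.state.lastProcStreamID as the final id (both assumptions, not part of the repro):

const { constants } = require('http2');

// Sketch of the two-step graceful shutdown from RFC 7540 section 6.8.
// The repro server above performs only the first step.
function gracefulShutdown(session) {
  // Step 1: GOAWAY with the maximum stream id, so no in-flight client
  // stream is refused and existing requests may complete.
  session.goaway(constants.NGHTTP2_NO_ERROR, 2 ** 31 - 1);
  // Step 2 (the 1s grace period is an assumption): a final GOAWAY with
  // the id of the last stream actually processed, then close the session.
  setTimeout(() => {
    if (!session.destroyed) {
      session.goaway(constants.NGHTTP2_NO_ERROR, session.state.lastProcStreamID);
      session.close();
    }
  }, 1000);
}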
Then start a client:
const http2 = require('http2');

const payload = Buffer.alloc(1024 * 10, 'a');

let client;

function connect() {
  client = http2.connect('http://localhost:8000', {
    maxSessionMemory: 200,
    peerMaxConcurrentStreams: 100000,
    settings: {
      maxConcurrentStreams: 100000,
    },
  });
  // Immediately reconnect once the server announces a graceful shutdown.
  client.on('goaway', () => {
    connect();
  });
  client.on('error', (err) => {
    console.log('client received error:', err);
    connect();
  });
}

const errorStats = new Map();

function addError(err) {
  const key = err.toString();
  const count = errorStats.get(key) || 0;
  errorStats.set(key, count + 1);
}

function dumpErrorStatistics() {
  for (const [key, value] of errorStats) {
    console.log('error:', key, 'count:', value);
  }
  errorStats.clear();
}

const MAX_IN_FLIGHT = 67;

// Fire a batch of POST requests and tally any stream errors.
function sendRequests() {
  let inFlight = 0;
  while (inFlight < MAX_IN_FLIGHT) {
    try {
      const stream = client.request({
        ':path': '/',
        ':method': 'POST',
        'content-type': 'text/html',
      });
      inFlight++;
      stream.on('error', (err) => {
        addError(err);
      });
      stream.on('data', () => {});
      stream.on('close', () => {});
      stream.write(payload);
      stream.end();
    } catch (err) {
      addError(err);
    }
  }
  dumpErrorStatistics();
}

connect();
setInterval(sendRequests, 7);
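A client-side guard one might attempt (hypothetical, not part of the repro) is to track GOAWAY per session and stop issuing new requests on a session that has seen one. A minimal sketch, assuming the same server address as above; note it only narrows the race, since a batch already written when the GOAWAY frame arrives still observes the error:

const http2 = require('http2');

// Hypothetical guard: remember which session has seen GOAWAY and refuse
// to start new requests on it.
let client;
let sawGoaway = false;

function connect() {
  client = http2.connect('http://localhost:8000');
  sawGoaway = false;
  client.on('goaway', () => {
    sawGoaway = true;
    connect(); // route new requests to a fresh session
  });
  client.on('error', () => connect());
}

function guardedRequest(headers) {
  if (sawGoaway || client.destroyed || client.closed) {
    throw new Error('session is draining; retry on the new session');
  }
  return client.request(headers);
}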
How often does it reproduce? Is there a required condition?
Reproduces every few seconds on my machine. The specific values of parameters (the number of requests between GOAWAY frames, the request batch size, interval between batches) may require adjustment on another machine.
What is the expected behavior? Why is that the expected behavior?
The server is sending GOAWAY with last stream id = 2^31 - 1. Since that is the maximum possible stream id, no client-initiated stream can exceed it, so all existing requests should be allowed to complete.
This comment states that the existing streams should complete successfully, while the pending streams should be cancelled.
I would expect to never see NGHTTP2_INTERNAL_ERROR on the client in this situation.
Also note that the client immediately reconnects after getting a goaway event, so it seems that the client doesn't break the http2 module contract.
What do you see instead?
Some client streams are closed with code NGHTTP2_INTERNAL_ERROR, as seen in the client.js output. If we run the client with NODE_DEBUG=http2, we can also see an attempt to send headers for a new stream after the GOAWAY was already processed.
The main issue is the client code receiving an internal error instead of the cancelled error, making it impossible to retry the request in the general case.
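To make the impact concrete, here is a minimal sketch of the retry classification a client would perform, keyed on the closed stream's rstCode and the http2.constants error codes (treating CANCEL, alongside REFUSED_STREAM, as retryable is my assumption):

const { constants } = require('http2');

// Sketch of generic retry classification keyed on the RST_STREAM code.
// REFUSED_STREAM (and, we assume, CANCEL) implies the server did not fully
// process the request, so replaying it is safe; INTERNAL_ERROR carries no
// such guarantee, which is why receiving it here instead of CANCEL breaks
// generic retry logic.
const RETRYABLE_RST_CODES = new Set([
  constants.NGHTTP2_REFUSED_STREAM,
  constants.NGHTTP2_CANCEL,
]);

function isSafeToRetry(stream) {
  // stream.rstCode is set when the stream is destroyed by an RST_STREAM.
  return RETRYABLE_RST_CODES.has(stream.rstCode);
}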
Another artifact I'm seeing is that the client attempts to send a RST_STREAM with code 7 (REFUSED_STREAM) to the server, although the stream was initiated on the client and the server has never seen it. We verified this with a traffic dump: the client really does send RST_STREAM for a new(!) client stream to the server after receiving a GOAWAY frame.
Additional information
This is a synthetic minimal reproduction of a real bug we're seeing in production, where the gRPC client cannot handle a graceful shutdown of a connection to an Envoy proxy.
According to a grpc-node maintainer, the error code returned by Node is too general for the gRPC client to handle it gracefully: grpc/grpc-node#2625 (comment)