rpc: reset writeConn when conn is closed on readErr #20414
I'm not sure setting writeConn is safe in this context. This might introduce a race. Will check tomorrow.
Code docs say:
// writeConn is used for writing to the connection on the caller's goroutine. It should
// only be accessed outside of dispatch, with the write lock held. The write lock is
// taken by sending on requestOp and released by sending on sendDone.
This comment is slightly out of date because
Hi @fjl, I have tried another fix adding a
fjl
left a comment
Sorry, this took a long time to get reviewed because I was busy with other things. I have tested the change and it solves the issue.
Damn, I didn't change the commit message title. The title is a bit misleading because the fix is now based on the write error.
This change makes the client attempt to reconnect when a write fails. We already had reconnect support, but the reconnect would previously happen on the next call after an error. Being more eager leads to a smoother experience overall.
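As a rough illustration of the eager reconnect described above, here is a minimal, self-contained sketch. It is not the actual code in rpc/client.go: the `conn` and `client` types, the `send` and `reconnect` helpers, the `dials` counter, and the `example_call` message are all hypothetical stand-ins. It only assumes that a failed write drops the connection and that the same call is retried once on a fresh one.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical stand-in for the client's write connection.
type conn struct{ closed bool }

func (c *conn) write(msg string) error {
	if c.closed {
		return errors.New("write on closed connection")
	}
	return nil
}

// client is a simplified model for illustration; it is not the real rpc.Client.
type client struct {
	writeConn *conn
	dials     int // counts reconnects so the behavior can be observed
}

// reconnect dials a fresh connection. In this sketch dialing always succeeds.
func (c *client) reconnect() error {
	c.dials++
	c.writeConn = &conn{}
	return nil
}

// send writes msg, reconnecting first if there is no usable connection.
// On a write error it drops the connection and retries exactly once.
func (c *client) send(msg string, retry bool) error {
	if c.writeConn == nil {
		if err := c.reconnect(); err != nil {
			return err
		}
	}
	err := c.writeConn.write(msg)
	if err != nil {
		c.writeConn = nil
		if !retry {
			return c.send(msg, true)
		}
	}
	return err
}

func main() {
	// The peer has gone away, but the client still holds the old connection.
	c := &client{writeConn: &conn{closed: true}}
	err := c.send("example_call", false)
	fmt.Println(err, c.dials) // prints "<nil> 1"
}
```

Limiting the retry to a single attempt keeps a persistently failing endpoint from causing unbounded recursion; the second error is returned to the caller as before.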
Behavior
When running geth with an external signer (Clef), if Clef is restarted, the first call to Clef always fails, but subsequent calls work fine.
Analysis & Fix
In rpc/client.go, when any error such as EOF is sent to the readErr channel, conn is closed but c.writeConn is not set to nil. If a new request arrives after conn is closed, it must fail once before the client tries to reconnect. This fix sets c.writeConn = nil when readErr is received and conn is closed.
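The failure mode in this analysis can be modelled with a small, self-contained sketch. Everything here is a hypothetical simplification rather than the real client: the `conn` and `client` types, `handleReadErr`, `send`, and the `resetOnErr` switch are invented for illustration. It assumes only what the analysis states: the client reconnects lazily when writeConn is nil, and a read error closes the connection.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// conn is a hypothetical stand-in for the underlying connection.
type conn struct{ closed bool }

func (c *conn) write(msg string) error {
	if c.closed {
		return errors.New("write on closed connection")
	}
	return nil
}

// client is a simplified model for illustration; it is not the real rpc.Client.
type client struct {
	writeConn  *conn
	resetOnErr bool // true models the fix proposed in this description
}

// handleReadErr models the read loop reporting an error such as io.EOF:
// the connection is closed, and with the fix writeConn is also cleared.
func (c *client) handleReadErr(err error) {
	c.writeConn.closed = true
	if c.resetOnErr {
		c.writeConn = nil
	}
}

// send reconnects lazily, only when writeConn is nil.
func (c *client) send(msg string) error {
	if c.writeConn == nil {
		c.writeConn = &conn{} // pretend dialing always succeeds
	}
	if err := c.writeConn.write(msg); err != nil {
		c.writeConn = nil // the next call will reconnect
		return err
	}
	return nil
}

func main() {
	for _, fix := range []bool{false, true} {
		c := &client{writeConn: &conn{}, resetOnErr: fix}
		c.handleReadErr(io.EOF)
		first := c.send("example_call")
		second := c.send("example_call")
		fmt.Printf("fix=%v first=%v second=%v\n", fix, first, second)
	}
	// Output:
	// fix=false first=write on closed connection second=<nil>
	// fix=true first=<nil> second=<nil>
}
```

Without the reset, the stale connection is only discovered by a failing write, which matches the observed "first call fails, later calls work" behavior.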