rpc: reset writeConn when conn is closed on readErr #20414

zzy96 · 2019-12-01T01:36:58Z

Behavior

When running geth with an external signer (Clef), if Clef is restarted, the first call to Clef always fails, but subsequent calls work fine.

Analysis & Fix

In rpc/client.go, when an error such as EOF is sent to the readErr channel, conn is closed but c.writeConn is not set to nil. A new request that arrives after conn is closed therefore fails once before the client tries to reconnect. This fix sets c.writeConn = nil when readErr is received and conn is closed.

fjl · 2019-12-01T10:50:55Z

I'm not sure setting writeConn is safe in this context. This might introduce a race. Will check tomorrow.

fjl · 2019-12-01T12:00:27Z

Code docs say:

// writeConn is used for writing to the connection on the caller's goroutine. It should
// only be accessed outside of dispatch, with the write lock held. The write lock is
// taken by sending on requestOp and released by sending on sendDone.

This comment is slightly out of date because requestOp has since been renamed to reqInit.
It still means we cannot set this field from inside the dispatch loop.

writeConn is closed when a read error happens. I think the best we can do to avoid the extra error on the next call is to handle closed connections better on the write code path.

zzy96 · 2019-12-02T12:39:58Z

Hi @fjl

I have tried another fix that adds a retry flag to c.write.

fjl

Sorry, this took a long time to get reviewed because I was busy with other things. I have tested the change and it solves the issue.

fjl · 2020-01-27T13:04:17Z

Damn, I didn't change the commit message title. The title is a bit misleading because the fix is now based on the write error.

This change makes the client attempt to reconnect when a write fails. We already had reconnect support, but the reconnect would previously happen on the next call after an error. Being more eager leads to a smoother experience overall.

set writeConn to nil when conn is closed on readErr

f0febb6

zzy96 requested review from fjl and holiman as code owners December 1, 2019 01:36

fjl changed the title ~~Set writeConn to nil when conn is closed on readErr~~ rpc: reset writeConn when conn is closed on readErr Dec 1, 2019

reset writeConn in write instead of dispatch

13f0606

adamschmideg added the status:triage label Dec 17, 2019

adamschmideg assigned fjl Jan 14, 2020

adamschmideg added rpc and removed status:triage labels Jan 14, 2020

fjl approved these changes Jan 27, 2020


fjl merged commit 44c365c into ethereum:master Jan 27, 2020

zzy96 deleted the fix-rpc-client-call-fail-on-restart branch January 28, 2020 12:52

holiman added this to the 1.9.11 milestone Feb 3, 2020

ricardolyn mentioned this pull request Feb 3, 2021

[Upgrade] Go-Ethereum release v1.9.11 Consensys/quorum#1121

Merged

7 tasks
