Skip to content

Wait for zero in-flight requests before terminating realtime proxy #2402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 2, 2021

Conversation

deliahu
Copy link
Member

@deliahu deliahu commented Dec 1, 2021

No description provided.

@deliahu deliahu requested a review from RobertLucian December 1, 2021 18:46
@deliahu deliahu changed the title Wait for zero in-flight requests before terminating proxy Wait for zero in-flight requests before terminating realtime proxy Dec 1, 2021
Copy link
Collaborator

@miguelvr miguelvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can achieve the same without adding an endpoint. You can simply wait for in-flight requests when the proxy gets the SIGTERM signal and before calling server.Shutdown()

@miguelvr
Copy link
Collaborator

miguelvr commented Dec 1, 2021

I think we can achieve the same without adding an endpoint. You can simply wait for in-flight requests when the proxy gets the SIGTERM signal and before calling server.Shutdown()

Ref to the code:

cortex/cmd/proxy/main.go

Lines 172 to 189 in 16ffd08

select {
case err = <-errCh:
exit(log, errors.Wrap(err, "failed to start proxy server"))
case <-sigint:
// We received an interrupt signal, shut down.
log.Info("Received TERM signal, handling a graceful shutdown...")
for name, server := range servers {
log.Infof("Shutting down %s server", name)
if err = server.Shutdown(context.Background()); err != nil {
// Error from closing listeners, or context timeout:
log.Warnw("HTTP server Shutdown Error", zap.Error(err))
telemetry.Error(errors.Wrap(err, "HTTP server Shutdown Error"))
}
}
log.Info("Shutdown complete, exiting...")
}
}

@deliahu
Copy link
Member Author

deliahu commented Dec 1, 2021

@miguelvr yeah, that would be cleaner!

I just tried adding it there, but it didn't work as we expected. So I dug into it, and realized that we weren't handling the TERM signal (just INT). After listening for TERM too, checking the breaker's in-flight count was no longer necessary, since server.Shutdown() seems to handle this automatically.

What are your thoughts on the diff now?

@tfriedel
Copy link

tfriedel commented Dec 1, 2021

Thanks for the PR! I built the images and tried my test (route which sleeps 1 sec). Contrary to the previous verion, deleting a pod did not result in 503s. Good job!

@deliahu
Copy link
Member Author

deliahu commented Dec 1, 2021

@tfriedel awesome, thanks for bringing this to our attention, and for confirming that the fix is working as intended!

I'm working on a separate PR now to support custom preStop commands.

@deliahu deliahu merged commit d0b29f3 into master Dec 2, 2021
@deliahu deliahu deleted the proxy-pre-stop branch December 2, 2021 19:17
deliahu added a commit that referenced this pull request Dec 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants