process hangs due to default socket_timeout=None #119

@zioproto

This is a recurring issue, previously reported in this GitHub issue.

I encountered the same problem while testing a Valkey Cluster on Kubernetes. When a Valkey Cluster pod is evicted, my client attempts to reconnect immediately, but gets stuck in the SYN_SENT state indefinitely, trying to connect to an IP address that no longer exists in the Kubernetes cluster. In Kubernetes, pods are ephemeral, and each new pod is assigned a new IP address.
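
To illustrate the failure mode outside of any Valkey client, here is a minimal sketch using only Python's standard `socket` module. The helper name `try_connect` is made up for this example. With `timeout_s=None` the connect call blocks until the operating system gives up on the handshake, which is what a client sees when the target IP no longer exists; with a finite timeout it returns promptly.

```python
import socket
from typing import Optional


def try_connect(host: str, port: int, timeout_s: Optional[float]) -> bool:
    """Return True if a TCP connection to (host, port) opens in time.

    timeout_s=None means blocking mode: the call waits for as long as the
    OS keeps retrying the handshake. A float bounds the wait in seconds.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts (socket.timeout is an OSError subclass) as well
        # as refused or unreachable connections.
        return False


if __name__ == "__main__":
    # A freshly created socket has no timeout unless
    # socket.setdefaulttimeout() was called earlier in the process.
    with socket.socket() as s:
        print("default timeout:", s.gettimeout())
```

Calling `try_connect(old_pod_ip, 6379, 2.0)` against the address of an evicted pod would be expected to return `False` after roughly two seconds, whereas passing `None` leaves the caller waiting on the kernel.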

From the ValkeyCluster documentation, it's not immediately clear that developers need to configure the TCP timeout. See the relevant section here. Furthermore, I don’t believe developers building applications that connect to Valkey should have to handle low-level TCP parameters.

Proposed solutions:

  1. Set a default socket_timeout value

    • Pros: This would solve the problem immediately.
    • Cons: It may be difficult to choose a default value that fits all scenarios. For example, what would be a reasonable default value for Valkey traffic? Perhaps 2 seconds?
  2. Make socket_timeout a mandatory parameter

    • Pros: Avoids the need to guess a suitable default value.
    • Cons: This would introduce a breaking change.
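
Until one of these options lands, a workaround on the application side is to pass the timeouts explicitly. The sketch below assumes valkey-py's `ValkeyCluster` accepts `socket_timeout` and `socket_connect_timeout` keyword arguments (as its redis-py ancestry suggests); the helper names and the 2-second values are illustrative only, taken from the guess in option 1, not a recommendation from the maintainers.

```python
from typing import Any, Dict


def cluster_timeout_kwargs(socket_timeout: float = 2.0,
                           socket_connect_timeout: float = 2.0) -> Dict[str, Any]:
    """Build explicit timeout keyword arguments for a cluster client.

    Rejects None and non-positive values so that the "block forever"
    default cannot slip back in by accident.
    """
    for name, value in (("socket_timeout", socket_timeout),
                        ("socket_connect_timeout", socket_connect_timeout)):
        if value is None or value <= 0:
            raise ValueError(f"{name} must be a positive number of seconds")
    return {
        "socket_timeout": socket_timeout,
        "socket_connect_timeout": socket_connect_timeout,
    }


def make_cluster_client(host: str, port: int):
    """Create a ValkeyCluster client with explicit timeouts (needs valkey-py)."""
    # Imported lazily so the helper above works without valkey-py installed.
    from valkey.cluster import ValkeyCluster
    return ValkeyCluster(host=host, port=port, **cluster_timeout_kwargs())
```

The validation in `cluster_timeout_kwargs` mirrors option 2 at the application level: the caller has to pick a value, but the library API itself stays unchanged.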
