@@ -284,6 +284,15 @@ Endpoint. The pool has the following properties:
- **Rate-limited:** A Pool MUST limit the number of [Connections](#connection) being
[established](#establishing-a-connection-internal-implementation) concurrently via the **maxConnecting**
[pool option](#connection-pool-options).
- **Backoff-capable:** A pool MUST be able to enter backoff mode. A pool will automatically enter backoff mode when a
  connection checkout fails under conditions that indicate server overload. The rules for entering backoff mode are as
  follows:
  - A network error or network timeout during the TCP handshake or the `hello` message for a new connection
    MUST trigger the backoff state.
  - Other pending connections MUST NOT be canceled.
  - In the case of multiple pending connections, the backoff attempt number MUST only be incremented once. This can
    be done by recording the state prior
Comment on lines +288 to +291
Contributor

I was talking with @ShaneHarvey about this (related comments on drivers ticket). Shane's understanding is that we decided to include all timeout errors, regardless of where it originated, during connection establishment. Does that match your understanding, Steve?

And related; the design says:

> After a connection establishment failure the pool enters the PoolBackoff state.

We should update the design with whatever the outcome of this thread is.

Member Author

Yeah, that's a good callout, but it can't be from auth, since the auth spec explicitly calls out the timeout behavior. I'm assuming all drivers can distinguish between hello and auth since they are separate commands. I'll update to say that if the driver can distinguish between TCP connect/DNS and the TLS handshake, then it MUST do so.

Contributor

Which timeout behavior in the auth spec are you referring to? I searched for timeout and only saw stuff about what timeout values to use, but not how to handle network timeouts. Maybe I'm looking in the wrong place though.

> I'm assuming all drivers can distinguish between hello and auth since they are separate commands.

If we decide to omit network errors during authentication, I think that's a fine assumption.

Member Author

Ah, the specs must have gotten out of sync. I'm referring to:

https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/server-discovery-and-monitoring.md#why-mark-a-server-unknown-after-an-auth-error

"The Authentication spec requires that when authentication fails on a server, the driver MUST clear the server's connection pool"

    to attempting the connection.

  While the Pool is in backoff, it exhibits the following behaviors:
  - **maxConnecting** MUST be set to 1.
  - The Pool MUST wait for the backoff duration before another connection attempt.
  - A successful heartbeat MUST NOT change the state of the pool.
  - A failed heartbeat MUST clear the pool.
  - A subsequent failed connection MUST increase the backoff attempt.
  - A successful connection MUST return the Pool to ready state.

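The entry and exit rules above can be sketched as a small state machine. This is an illustration only and not part of the proposed spec text: `BackoffPoolSketch`, `beginConnect`, `onConnectFailure`, `onConnectSuccess`, `baseMS`, and `capMS` are hypothetical names, and the exponential-with-jitter duration is an assumption (the diff only says the duration involves jitter). Resetting the attempt number and restoring the configured `maxConnecting` on success are assumptions as well. Heartbeat handling is left out.

```typescript
// Illustrative sketch only. The class, its method names, baseMS, capMS, and
// the duration formula are assumptions, not taken from the specification.
type PoolState = "paused" | "ready" | "backoff" | "closed";

class BackoffPoolSketch {
  state: PoolState = "ready";
  attempt = 0; // backoff attempt number; 0 means "not in backoff"
  maxConnecting: number;

  private readonly configuredMaxConnecting: number;
  private readonly baseMS: number;
  private readonly capMS: number;
  private readonly random: () => number;

  constructor(
    maxConnecting: number = 2,
    baseMS: number = 100,
    capMS: number = 10_000,
    random: () => number = Math.random,
  ) {
    this.configuredMaxConnecting = maxConnecting;
    this.maxConnecting = maxConnecting;
    this.baseMS = baseMS;
    this.capMS = capMS;
    this.random = random;
  }

  /** Record the attempt number before starting to establish a connection. */
  beginConnect(): number {
    return this.attempt;
  }

  /**
   * Report a network error or timeout during the TCP handshake or `hello`.
   * Returns the backoff duration in milliseconds, or undefined when another
   * pending connection already incremented the attempt number.
   */
  onConnectFailure(attemptAtStart: number): number | undefined {
    if (this.state === "closed" || attemptAtStart !== this.attempt) {
      return undefined;
    }
    this.attempt += 1;
    this.state = "backoff";
    this.maxConnecting = 1; // only one establishment at a time while in backoff
    // Assumed formula: capped exponential ceiling scaled by a random factor.
    const ceilingMS = Math.min(this.capMS, this.baseMS * 2 ** (this.attempt - 1));
    return ceilingMS * this.random();
  }

  /** A successful connection returns the pool to "ready". */
  onConnectSuccess(): void {
    if (this.state !== "backoff") return;
    this.state = "ready";
    this.attempt = 0; // assumption: the attempt counter resets on success
    this.maxConnecting = this.configuredMaxConnecting;
  }
}
```

In this sketch, two connections that both called `beginConnect()` before either failed carry the same snapshot, so only the first failure increments `attempt`; the second sees a stale snapshot and is ignored. That is one way to read "recording the state prior to attempting the connection".
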
```typescript
interface ConnectionPool {
@@ -314,12 +323,17 @@ interface ConnectionPool {
   * - "ready": The healthy state of the pool. It can service checkOut requests and create
   *   connections in the background. The pool can be set to this state via the
   *   ready() method.
   *
   * - "backoff": The pool is in backoff state. MaxConnecting is set to 1 and the pool backoff period
   *   must be observed before attempting another connection. A subsequent failed connection
   *   attempt increases the backoff duration. The pool can be set to this state via the
   *   backoff() method.
   *
   * - "closed": The pool is destroyed. No more Connections may ever be checked out nor any
   *   created in the background. The pool can be set to this state via the close()
   *   method. The pool cannot transition to any other state after being closed.
   */
  state: "paused" | "ready" | "closed";
  state: "paused" | "ready" | "backoff" | "closed";

  // Any of the following connection counts may be computed rather than
  // actually stored on the pool.
@@ -360,6 +374,11 @@ interface ConnectionPool {
   */
  clear(interruptInUseConnections: Optional<Boolean>): void;

  /**
   * Enter backoff mode or increase backoff amount if already in backoff mode. Mark the pool as "backoff".
   */
  backoff(): void;

  /**
   * Mark the pool as "ready", allowing checkOuts to resume and connections to be created in the background.
   * A pool can only transition from "paused" to "ready". A "closed" pool
@@ -829,6 +848,34 @@ interface PoolClearedEvent {
  interruptInUseConnections: Optional<Boolean>;
}

/**
 * Emitted when a Connection Pool is in backoff
 */
interface PoolBackoffEvent {
  /**
   * The ServerAddress of the Endpoint the pool is attempting to connect to.
   */
  address: string;

  /**
   * The backoff attempt number.
   *
   * The incrementing backoff attempt number. This is included because
   * the backoff duration is non-deterministic due to jitter.
   */
  attempt: int64;

  /**
   * The duration the pool will not allow new connection establishments.
   *
   * A driver MAY choose the type idiomatic to the driver.
   * If the type chosen does not convey units, e.g., `int64`,
   * then the driver MAY include units in the name, e.g., `durationMS`.
   */
  duration: Duration;
}


/**
 * Emitted when a Connection Pool is closed
 */
@@ -1074,6 +1121,21 @@ placeholders as appropriate:

> Connection pool for {{serverHost}}:{{serverPort}} cleared for serviceId {{serviceId}}

#### Pool Backoff Message

In addition to the common fields defined above, this message MUST contain the following key-value pairs:

| Key        | Suggested Type     | Value                                                    |
| ---------- | ------------------ | -------------------------------------------------------- |
| message    | String             | "Connection pool in backoff"                             |
| attempt    | Int                | The backoff attempt number.                              |
| durationMS | Int32/Int64/Double | `PoolBackoffEvent.duration` converted to milliseconds.   |
The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in
placeholders as appropriate:

> Connection pool for {{serverHost}}:{{serverPort}} in backoff. Attempt: {{attempt}}. Duration: {{durationMS}} ms

#### Pool Closed Message

In addition to the common fields defined above, this message MUST contain the following key-value pairs:
@@ -1375,6 +1437,8 @@ to close and remove from its pool a [Connection](#connection) which has unread e

## Changelog

- 2025-XX-YY: Introduce "backoff" state.

- 2025-01-22: Clarify durationMS in logs may be Int32/Int64/Double.

- 2024-11-27: Relaxed the WaitQueue fairness requirement.
@@ -74,6 +74,7 @@ Valid Unit Test Operations are the following:
- `interruptInUseConnections`: Determines whether "in use" connections should be also interrupted
- `pool.close()`: call `close` on Pool
- `pool.ready()`: call `ready` on Pool
- `pool.backoff()`: call `backoff` on Pool

## Integration Test Format

@@ -88,6 +89,8 @@ The integration test format is identical to the unit test format with the additi
- `maxServerVersion` (optional): The maximum server version (inclusive) against which the tests can be run
successfully. If this field is omitted, it should be assumed that there is no upper bound on the required server
version.
- `poolBackoff` (optional): If it is true, tests MUST only run if the driver supports backoff state in connection
pools. If it is false, tests MUST only run if the driver does not support backoff state in connection pools.
- `failPoint`: optional, a document containing a `configureFailPoint` command to run against the endpoint being used for
the test.
- `poolOptions.appName` (optional): appName attribute to be set in connections, which will be affected by the fail

@@ -0,0 +1,39 @@
version: 1
style: integration
description: pool enters backoff on connection close
runOn:
  - minServerVersion: 4.9.0
failPoint:
  configureFailPoint: failCommand
  mode:
    times: 1
  data:
    failCommands:
      - isMaster
      - hello
    closeConnection: true
poolOptions:
  minPoolSize: 0
operations:
  - name: ready
  - name: start
    target: thread1
  - name: checkOut
    thread: thread1
  - name: waitForEvent
    event: ConnectionCreated
    count: 1
  - name: waitForEvent
    event: ConnectionCheckOutFailed
    count: 1
events:
  - type: ConnectionCheckOutStarted
  - type: ConnectionCreated
  - type: ConnectionClosed
  - type: ConnectionPoolBackoff
  - type: ConnectionCheckOutFailed
ignore:
  - ConnectionCheckedIn
  - ConnectionCheckedOut
  - ConnectionPoolCreated
  - ConnectionPoolReady

@@ -0,0 +1,42 @@
version: 1
style: integration
description: backoff closes pending connections
runOn:
  - minServerVersion: 4.9.0
    poolBackoff: true
failPoint:
  configureFailPoint: failCommand
  mode: alwaysOn
  data:
    failCommands:
      - isMaster
      - hello
    closeConnection: false
    blockConnection: true
    blockTimeMS: 10000
poolOptions:
  minPoolSize: 0
operations:
  - name: ready
  - name: start
    target: thread1
  - name: checkOut
    thread: thread1
  - name: waitForEvent
    event: ConnectionCreated
    count: 1
  - name: backoff
  - name: waitForEvent
    event: ConnectionCheckOutFailed
    count: 1
events:
  - type: ConnectionCheckOutStarted
  - type: ConnectionCreated
  - type: ConnectionPoolBackoff
  - type: ConnectionClosed
  - type: ConnectionCheckOutFailed
ignore:
  - ConnectionCheckedIn
  - ConnectionCheckedOut
  - ConnectionPoolCreated
  - ConnectionPoolReady