[segment replication] Add cluster setting for retry timeout of publish checkpoint tx action #17749
❌ Gradle check result for 1edc0ca: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
…eckpointAction use the never give up strategy. Signed-off-by: guojialiang <[email protected]>
1edc0ca to e49aa81
ashking94
left a comment
lgtm
❌ Gradle check result for b744f4b: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: guojialiang <[email protected]>
b744f4b to 68a5e9d
❌ Gradle check result for 68a5e9d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
…Action_use_never_give_up_retry_strategy # Conflicts: # CHANGELOG.md
❌ Gradle check result for 3eb976e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Restarted the PR build.
❌ Gradle check result for 3eb976e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
…Action_use_never_give_up_retry_strategy
❌ Gradle check result for a3a23a7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: guojialiang <[email protected]>
…h checkpoint tx action (opensearch-project#17749)
* TransportReplicationAction support specifying retryTimeout, PublishCheckpointAction use the never give up strategy.
  Signed-off-by: guojialiang <[email protected]>
* support PublishCheckpointAction PUBLISH_CHECK_POINT_RETRY_TIMEOUT to override the default retry timeout
  Signed-off-by: guojialiang <[email protected]>
* add TransportReplicationAction.getRetryTimeoutSetting
  Signed-off-by: guojialiang <[email protected]>
* add entry to CHANGELOG.md
  Signed-off-by: guojialiang <[email protected]>
* rewrite the PR title
  Signed-off-by: guojialiang <[email protected]>
* modify changelog entry
  Signed-off-by: guojialiang <[email protected]>
* add comments
  Signed-off-by: guojialiang <[email protected]>
* update
  Signed-off-by: guojialiang <[email protected]>
---------
Signed-off-by: guojialiang <[email protected]>
Signed-off-by: Harsh Kothari <[email protected]>
Description
Added a test. Currently, if the primary shard fails to publish a checkpoint, the replica shard falls out of sync with the primary shard.
TransportReplicationAction supports specifying retryTimeout.
PublishCheckpointAction uses the never-give-up retry strategy.
Related Issues
Resolves #17595
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
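As a rough illustration of the retry behaviour discussed under Description, here is a minimal sketch in plain Java. It is not the code from this PR and does not use any OpenSearch classes. The class `PublishRetrySketch`, the methods `retryWithTimeout` and `failingFirst`, and the timeout and backoff values are all hypothetical, chosen only to show how a bounded retry timeout can give up before a transient failure clears, while an effectively unbounded timeout ("never give up") keeps retrying until the action succeeds. Time is simulated rather than slept so the example stays deterministic.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: none of these names come from OpenSearch.
final class PublishRetrySketch {

    /** Result of a retry loop: whether the action succeeded and how many attempts were made. */
    static final class Outcome {
        final boolean succeeded;
        final int attempts;

        Outcome(boolean succeeded, int attempts) {
            this.succeeded = succeeded;
            this.attempts = attempts;
        }
    }

    /**
     * Retries the action until it succeeds or the retry timeout budget is spent.
     * Time is simulated: every failed attempt costs backoffMillis of the budget.
     * Passing Long.MAX_VALUE as the timeout models a "never give up" strategy.
     */
    static Outcome retryWithTimeout(BooleanSupplier action, long retryTimeoutMillis, long backoffMillis) {
        long elapsedMillis = 0;
        int attempts = 0;
        while (true) {
            attempts++;
            if (action.getAsBoolean()) {
                return new Outcome(true, attempts);
            }
            elapsedMillis += backoffMillis;
            if (elapsedMillis >= retryTimeoutMillis) {
                return new Outcome(false, attempts);
            }
        }
    }

    /** Returns an action that fails the given number of times and then succeeds. */
    static BooleanSupplier failingFirst(int failures) {
        int[] remaining = {failures};
        return () -> remaining[0]-- <= 0;
    }

    public static void main(String[] args) {
        // Assumed values: a 60 s budget with 20 s between attempts allows only three tries.
        Outcome bounded = retryWithTimeout(failingFirst(5), 60_000L, 20_000L);
        System.out.println("bounded: succeeded=" + bounded.succeeded + " attempts=" + bounded.attempts);

        // With an effectively unbounded budget the sixth attempt succeeds.
        Outcome unbounded = retryWithTimeout(failingFirst(5), Long.MAX_VALUE, 20_000L);
        System.out.println("unbounded: succeeded=" + unbounded.succeeded + " attempts=" + unbounded.attempts);
    }
}
```

In this sketch the bounded run stops after three attempts without succeeding, while the unbounded run succeeds on the sixth. An unbounded budget trades a guaranteed upper bound on retry time for eventual delivery, which is why making the timeout configurable can be useful.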