Fix flaky test RemotePrimaryLocalRecoveryIT #18230
base: main
...rc/internalClusterTest/java/org/opensearch/remotemigration/RemotePrimaryLocalRecoveryIT.java
❌ Gradle check result for a19b834: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: Sandeep Kumawat <[email protected]>
a19b834 to 2491e5d
❌ Gradle check result for 2491e5d: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
GitHub check is failing.
@skumawat2025 Any update here?
This PR is stalled because it has been open for 30 days with no activity.
Description
Fix flaky RemotePrimaryLocalRecoveryIT by limiting rolling restarts to data nodes
Problem:
RemotePrimaryLocalRecoveryIT failed intermittently with ClusterManagerNotDiscoveredException when the rolling restart included master nodes. Before this change, the test failed within 100 iterations because of cluster manager discovery issues. The test only needs to verify remote migration local recovery after data node restarts, so restarting master nodes is unnecessary.
Solution:
The rolling restart logic now restarts only data nodes and excludes master nodes from the restart sequence. This change has been stable across 800+ test iterations.
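The selection step can be illustrated with a minimal, self-contained sketch. It is not the code in this PR and does not use the OpenSearch test framework: `Node`, `dataOnlyNodeNames`, and `rollingRestartDataNodes` are hypothetical names, and treating any cluster-manager-eligible node as excluded (even if it also holds data) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class DataNodeRollingRestart {

    /** Hypothetical stand-in for a test-cluster node: a name plus two role flags. */
    public static final class Node {
        final String name;
        final boolean data;
        final boolean clusterManager;

        public Node(String name, boolean data, boolean clusterManager) {
            this.name = name;
            this.data = data;
            this.clusterManager = clusterManager;
        }
    }

    /** Returns the names of nodes a data-only rolling restart would touch, in cluster order. */
    public static List<String> dataOnlyNodeNames(List<Node> nodes) {
        List<String> names = new ArrayList<>();
        for (Node node : nodes) {
            // Assumption: skip every cluster-manager-eligible node so the
            // cluster never loses its elected cluster manager mid-test.
            if (node.data && !node.clusterManager) {
                names.add(node.name);
            }
        }
        return names;
    }

    /** Restarts the selected nodes one at a time via the supplied callback. */
    public static List<String> rollingRestartDataNodes(List<Node> nodes, Consumer<String> restartNode) {
        List<String> selected = dataOnlyNodeNames(nodes);
        for (String name : selected) {
            restartNode.accept(name); // hypothetical hook; a real test would restart the node here
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Node> cluster = List.of(
            new Node("cm-0", false, true),
            new Node("data-0", true, false),
            new Node("data-1", true, false)
        );
        rollingRestartDataNodes(cluster, name -> System.out.println("restarting " + name));
    }
}
```

In a real integration test the callback would delegate to whatever per-node restart helper the test cluster provides; the point of the sketch is only that the cluster manager stays up for the whole sequence.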
Related Issues
Resolves #14314
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.