[Zen2] Introduce vote withdrawal #35446
If shutting down half or more of the master-eligible nodes, their votes must first be explicitly withdrawn to ensure that the cluster doesn't lose its quorum. This works via _voting tombstones_, stored in the cluster state, which tell the reconfigurator to remove nodes from the voting configuration. This change introduces voting tombstones to the cluster state, together with transport APIs for adding and removing them, and makes use of these APIs in `InternalTestCluster` to support tests which remove at least half of the master-eligible nodes at once (e.g. shrinking from two master-eligible nodes to one).
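The quorum arithmetic behind this can be illustrated with a minimal, self-contained sketch. It is a toy model, not the Elasticsearch implementation: the class `VoteWithdrawalSketch`, the methods `hasQuorum` and `reconfigure`, and the node names are all hypothetical, and the real reconfigurator is more involved than a plain set difference.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: names and logic are illustrative assumptions,
// not taken from the Elasticsearch code base.
public class VoteWithdrawalSketch {

    // A quorum is a strict majority of the nodes in the voting configuration.
    static boolean hasQuorum(Set<String> votingConfiguration, Set<String> liveNodes) {
        Set<String> liveVoters = new HashSet<>(votingConfiguration);
        liveVoters.retainAll(liveNodes);
        return liveVoters.size() * 2 > votingConfiguration.size();
    }

    // Simplified reconfiguration step: nodes with a tombstone lose their vote.
    // This must happen while the old configuration still has a quorum,
    // i.e. before the tombstoned nodes are actually shut down.
    static Set<String> reconfigure(Set<String> votingConfiguration, Set<String> tombstones) {
        Set<String> next = new HashSet<>(votingConfiguration);
        next.removeAll(tombstones);
        return next;
    }

    public static void main(String[] args) {
        Set<String> config = new HashSet<>(Arrays.asList("node-1", "node-2"));
        Set<String> liveAfterShutdown = new HashSet<>(Arrays.asList("node-1"));

        // Shrinking from two master-eligible nodes to one without withdrawing
        // the vote: one live voter out of two is not a strict majority.
        System.out.println(hasQuorum(config, liveAfterShutdown));

        // Withdrawing node-2's vote first shrinks the configuration to {node-1},
        // so the remaining node forms a quorum on its own.
        Set<String> shrunk = reconfigure(config, new HashSet<>(Arrays.asList("node-2")));
        System.out.println(hasQuorum(shrunk, liveAfterShutdown));
    }
}
```

In this model the first check fails and the second succeeds, which is why the votes have to be withdrawn before the nodes are stopped rather than after.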
Pinging @elastic/es-distributed
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
I've left some smaller comments. Looks very good already.
...n/java/org/elasticsearch/action/admin/cluster/configuration/AddVotingTombstonesResponse.java
.../java/org/elasticsearch/action/admin/cluster/configuration/ClearVotingTombstonesRequest.java
...org/elasticsearch/action/admin/cluster/configuration/TransportAddVotingTombstonesAction.java
...org/elasticsearch/action/admin/cluster/configuration/TransportAddVotingTombstonesAction.java
...org/elasticsearch/action/admin/cluster/configuration/TransportAddVotingTombstonesAction.java
server/src/main/java/org/elasticsearch/action/support/ActionFilters.java
server/src/main/java/org/elasticsearch/cluster/ClusterState.java
server/src/main/java/org/elasticsearch/cluster/ClusterState.java
server/src/test/java/org/elasticsearch/cluster/routing/AllocationIdIT.java
@elasticmachine test this please
...org/elasticsearch/action/admin/cluster/configuration/TransportAddVotingTombstonesAction.java
test/framework/src/main/java/org/elasticsearch/test/InternalTestCluster.java
LGTM. I think you need to merge the latest zen2 branch here.
    final int nodesToRemain = internalCluster().size() - 1;
    logger.info("--> reducing to [{}] nodes", nodesToRemain);
    internalCluster().ensureAtMostNumDataNodes(nodesToRemain);
    assertThat(internalCluster().size(), lessThanOrEqualTo(nodesToRemain));
I'm not sure what you're testing here with this assertion. Why not just repeatedly call `stopRandomNode()`, and maybe check that the cluster is alive and healthy before shutting down the last node?
The CI failure was due to the build running on a CentOS worker (fixed by #35453), so I merged master into zen2 and thence into this branch.