-
Notifications
You must be signed in to change notification settings - Fork 39
batch node removal #314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
batch node removal #314
Conversation
pkg/controller/update_cluster.go
Outdated
| } | ||
|
|
||
| // findPodInstanceToRemove returns the pod (and associated placement instace) | ||
| // findPodInstancesToRemove returns the pod (and associated placement instace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: s/pod/pods
schallert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
pkg/controller/update_cluster.go
Outdated
| } | ||
|
|
||
| // findPodInstanceToRemove returns the pod (and associated placement instace) | ||
| // findPodInstancesToRemove returns the pod (and associated placement instace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can you update the comment to reflect that this will return multiple pods / instances now?
| } | ||
|
|
||
| return nil, nil, errNoPodsInPlacement | ||
| return podsToRemove, instancesToRemove, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To ensure this is doing what the user intends, do we want to return an error if len(podsToRemove) != removeCount? Not sure if there's a case where that could happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only case where this can happen is if we somehow want to remove more than exists. There is a test case when removeCount is for 4 but we have only 3 instances in placement.
In practice this should not happen because we remove only if placement contains more than desired:
if inPlacement > desired {
setLogger.Info("remove instance from placement for set")
return c.shrinkPlacementForSet(cluster, set, placement, int(inPlacement-desired))
}
. Maybe if we spam expand/shrink we could enter some weird state here. But I think it's hard to reason what will happen anyways.
Another approach that potentially can be safer is to instead of having "removeCount" we could pass in "desired" so that we try to reach target state, we could than assert if target state makes sense like "desired" > "currentPodCount" otherwise it looks like expansion.
|
Refactored PR to instead of using |
jeromefroe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTMx2, I like the new approach of just passing the desired number of instances to shrinkPlacementForSet better as well
| podsToRemove []*corev1.Pod | ||
| instancesToRemove []placement.Instance | ||
| ) | ||
| for i := len(podIDs) - 1; i >= desiredInstanceCount; i-- { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe it's worth moving the check that desiredInstanceCount is greater than zero into this function just in case we call it from other places in the future?
| if desiredInstanceCount < 0 { | ||
| msg := fmt.Sprintf("desired instance count is negative: %d", desiredInstanceCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we also validate against desiredInstanceCount = 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 0 should be a valid option. Like we can do with kubectl set scale 0. Haven't tried it though if it will work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can it be valid in terms of placement - where will the shards be relocated to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah maybe it does not make sense after all.
Adds support to shrink cluster by multiple instances