Create and use new internal NodeInfo and PodInfo types to enable tracking DRA resources
#7390
Force-pushed from 466082f to 98b158d.
/assign @MaciekPytel
Force-pushed from 98b158d to 9a42f0d.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: nojnhuh
…eChecker This allows other components to interact with the Framework, which will be needed for DRA support later.
Methods to interact with the new internal types are added to ClusterSnapshot. Cluster Autoscaler code will be migrated to only use these methods and work on the internal types instead of directly using the framework types. The new types are designed so that they can be used exactly like the framework types, which should make the migration manageable. This allows easily adding additional data to the Nodes and Pods tracked in ClusterSnapshot, without having to change the scheduler framework. This will be needed to support DRA, as we'll need to track ResourceSlices and ResourceClaims.
The new types should behave like the direct schedulerframework types for most purposes, so most of the migration is just changing the imported package. Constructors look a bit different, so they have to be adapted - mostly in test code.
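A minimal sketch of how such wrapper types could work, assuming they embed the framework types. The stand-in types `frameworkNodeInfo` and `frameworkPodInfo`, the field names `NeededResourceClaims` and `LocalResourceSlices`, and the helpers `AddPodInfo` and `ClaimsFor` are all hypothetical and only illustrate the idea; they are not taken from this PR.

```go
package main

import "fmt"

// frameworkPodInfo and frameworkNodeInfo are simplified, hypothetical stand-ins
// for the scheduler framework types. They are not the real definitions.
type frameworkPodInfo struct {
	Name string
}

type frameworkNodeInfo struct {
	NodeName string
	Pods     []*frameworkPodInfo
}

// AddPod mimics a framework method that existing callers already rely on.
func (n *frameworkNodeInfo) AddPod(p *frameworkPodInfo) {
	n.Pods = append(n.Pods, p)
}

// PodInfo embeds the framework type, so existing field access keeps working,
// and carries extra DRA data next to it (field name is an assumption).
type PodInfo struct {
	*frameworkPodInfo
	NeededResourceClaims []string
}

// NodeInfo embeds the framework type and tracks extra per-node and per-pod data
// (field names are assumptions).
type NodeInfo struct {
	*frameworkNodeInfo
	LocalResourceSlices []string
	claimsByPod         map[string][]string // pod name -> claim names
}

// NewNodeInfo shows why constructors differ: the extra data is passed up front.
func NewNodeInfo(nodeName string, slices []string, pods ...*PodInfo) *NodeInfo {
	n := &NodeInfo{
		frameworkNodeInfo:   &frameworkNodeInfo{NodeName: nodeName},
		LocalResourceSlices: slices,
		claimsByPod:         map[string][]string{},
	}
	for _, p := range pods {
		n.AddPodInfo(p)
	}
	return n
}

// AddPodInfo updates the embedded framework object and records the extra data.
func (n *NodeInfo) AddPodInfo(p *PodInfo) {
	n.frameworkNodeInfo.AddPod(p.frameworkPodInfo)
	n.claimsByPod[p.Name] = p.NeededResourceClaims
}

// ClaimsFor returns the claim names recorded for the given pod, if any.
func (n *NodeInfo) ClaimsFor(podName string) []string {
	return n.claimsByPod[podName]
}

func main() {
	pod := &PodInfo{
		frameworkPodInfo:     &frameworkPodInfo{Name: "pod-a"},
		NeededResourceClaims: []string{"claim-1"},
	}
	node := NewNodeInfo("node-1", []string{"slice-1"}, pod)
	// NodeName and Pods are promoted from the embedded framework type.
	fmt.Println(node.NodeName, len(node.Pods), node.ClaimsFor("pod-a"))
}
```

With embedding, code that only reads promoted fields such as `NodeName` would compile unchanged against the wrapper, which is consistent with the note above that most of the migration is a package change while constructors need adapting.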
…ddNodeInfo We need AddNodeInfo in order to propagate DRA objects through the snapshot, which makes AddNodeWithPods redundant.
AddNodes() is redundant. It was probably intended for adding nodes in batches, with batch-specific optimizations in mind. However, it has always been implemented as a plain loop over AddNode(), and it is only used in test code. Most of the uses in the test code were initialization; they are replaced with Initialize(), which will later be needed for handling DRA anyway. The other uses are replaced with inline loops over AddNode().
The method is already accessible via StorageInfos(), so it's redundant.
AddNodeInfo already provides the same functionality, and has to be used in production code in order to propagate DRA objects correctly. Uses in production are replaced with Initialize(), which will later take DRA objects into account. Uses in the test code are replaced with AddNodeInfo().
simulator.BuildNodeInfoForNode, core_utils.GetNodeInfoFromTemplate, and scheduler_utils.DeepCopyTemplateNode all had very similar logic for sanitizing and copying NodeInfos. They're all consolidated into one file in simulator, sharing common logic.
MixedTemplateNodeInfoProvider now correctly uses ClusterSnapshot to correlate Nodes to scheduled pods, instead of using a live Pod lister. This means that the snapshot now has to be properly initialized in a number of tests.
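As a rough illustration of what "sanitizing and copying" a template NodeInfo can involve, the sketch below deep-copies a template and gives the copy and its pods unique names. The `nodeInfo` and `pod` types, the `sanitizedCopy` helper, and the suffix-based naming are assumptions for this example; the consolidated logic in the PR may do more or work differently.

```go
package main

import "fmt"

// pod and nodeInfo are simplified, hypothetical stand-ins for the real types.
type pod struct {
	Name     string
	NodeName string
}

type nodeInfo struct {
	Name   string
	Labels map[string]string
	Pods   []*pod
}

// sanitizedCopy returns a deep copy of tmpl whose node name, hostname label,
// and pod names are made unique with the given suffix. The template itself is
// left untouched, so it can be reused to stamp out more nodes.
func sanitizedCopy(tmpl *nodeInfo, suffix string) *nodeInfo {
	newName := fmt.Sprintf("%s-%s", tmpl.Name, suffix)
	out := &nodeInfo{
		Name:   newName,
		Labels: make(map[string]string, len(tmpl.Labels)+1),
	}
	for k, v := range tmpl.Labels {
		out.Labels[k] = v
	}
	// Keep the well-known hostname label in sync with the new node name.
	out.Labels["kubernetes.io/hostname"] = newName
	for _, p := range tmpl.Pods {
		out.Pods = append(out.Pods, &pod{
			Name:     fmt.Sprintf("%s-%s", p.Name, suffix),
			NodeName: newName,
		})
	}
	return out
}

func main() {
	tmpl := &nodeInfo{
		Name:   "template",
		Labels: map[string]string{"zone": "a"},
		Pods:   []*pod{{Name: "ds-pod", NodeName: "template"}},
	}
	c := sanitizedCopy(tmpl, "0")
	fmt.Println(c.Name, c.Pods[0].Name, c.Pods[0].NodeName)
}
```

Keeping one helper of this kind in a single place is the sort of consolidation the commit message describes, since every caller then gets the same copy and rename rules.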
Force-pushed from 9a42f0d to ce9e3fe.
PR needs rebase.
Closing in favor of #7447
/close
@nojnhuh: Closed this PR.
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
This PR adds new internal NodeInfo and PodInfo types independent from the ones from k8s.io/kubernetes/pkg/scheduler. The new types will eventually be used to store information about ResourceSlices and ResourceClaims for Dynamic Resource Allocation (DRA).
Which issue(s) this PR fixes:
Part of kubernetes/kubernetes#118612
Special notes for your reviewer:
These changes are based on the first commits of @towca's #7350 up to and including bb87555. I've made some small changes to avoid importing the k8s.io/api/resource/v1alpha3 package for now, since it doesn't currently exist for the version of k8s.io in CA's go.mod.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: