Commit 0433850

Add numaid and cpus into PodResources interface
This change is necessary for a daemon that exports resources together with their topology, which is used in topology-aware scheduling. Information about CPUs is kept in cpu_ids, since that is enough to represent both the quantity and the NUMA id. The NUMA id can be obtained from cadvisor MachineInfo, since each id in cpu_ids is a thread id. This API doesn't provide the CPU fraction, since it can be obtained from the Pod's requests/limits, and in the case of a non-integer CPU quantity and non-guaranteed QoS the assigned CPU is not exclusive, so the NUMA id is not of interest.

Signed-off-by: Alexey Perevalov <[email protected]>
1 parent 3b6b4eb commit 0433850

1 file changed: +14 -5 lines changed

keps/sig-node/compute-device-assignment.md

Lines changed: 14 additions & 5 deletions
@@ -13,10 +13,10 @@ approvers:
1313
- "@sig-node-leads"
1414
editor: "@dashpole"
1515
creation-date: "2018-07-19"
16-
last-updated: "2019-04-30"
16+
last-updated: "2020-07-06"
1717
status: implementable
1818
---
19-
# Kubelet endpoint for device assignment observation details
19+
# Kubelet endpoint for device assignment observation details
2020

2121
## Table of Contents
2222

@@ -26,6 +26,8 @@ status: implementable
2626
- [Objectives](#objectives)
2727
- [User Journeys](#user-journeys)
2828
- [Device Monitoring Agents](#device-monitoring-agents)
29+
- [Device aware CNI plugin](#device-aware-cni-plugin)
30+
- [Topology aware scheduling](#topology-aware-scheduling)
2931
- [Changes](#changes)
3032
- [Potential Future Improvements](#potential-future-improvements)
3133
- [Alternatives Considered](#alternatives-considered)
@@ -57,10 +59,15 @@ In this document we will discuss the motivation and code changes required for in
5759

5860
![device monitoring architecture](https://user-images.githubusercontent.com/3262098/43926483-44331496-9bdf-11e8-82a0-14b47583b103.png)
5961

62+
### Device aware CNI plugin
63+
After this interface was introduced, it was used by CNI plugins such as [kuryr-kubernetes](https://review.opendev.org/#/c/651580/) together with [intel-sriov-device-plugin](https://github.com/intel/sriov-network-device-plugin) to correctly determine which devices were assigned to the pod.
64+
65+
### Topology aware scheduling
66+
This interface can be used to collect allocated resources with information about the NUMA topology of the worker node. This information can then be used in NUMA aware scheduling.
6067

6168
## Changes
6269

63-
Add a v1alpha1 Kubelet GRPC service, at `/var/lib/kubelet/pod-resources/kubelet.sock`, which returns information about the kubelet's assignment of devices to containers. It obtains this information from the internal state of the kubelet's Device Manager. The GRPC Service returns a single PodResourcesResponse, which is shown in proto below:
70+
Add a v1alpha1 Kubelet GRPC service, at `/var/lib/kubelet/pod-resources/kubelet.sock`, which returns information about the kubelet's assignment of devices and cpus to containers with NUMA id. It obtains this information from the internal state of the kubelet's Device Manager and CPU Manager respectively. The GRPC Service returns a single PodResourcesResponse, which is shown in proto below:
6471
```protobuf
6572
// PodResources is a service provided by the kubelet that provides information about the
6673
// node resources consumed by pods and containers on the node
@@ -87,12 +94,14 @@ message PodResources {
8794
message ContainerResources {
8895
string name = 1;
8996
repeated ContainerDevices devices = 2;
97+
repeated uint32 cpu_ids = 3;
9098
}
9199
92100
// ContainerDevices contains information about the devices assigned to a container
93101
message ContainerDevices {
94102
string resource_name = 1;
95103
repeated string device_ids = 2;
104+
uint32 numaid = 3;
96105
}
97106
```
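
As a rough illustration of how a topology exporting daemon might consume the new `cpu_ids` field, the sketch below groups a container's CPU ids by NUMA node. It is not part of this commit: the Go structs are hand-written stand-ins for the proto messages (a real client would use the generated gRPC types), `cpusByNUMA` is a hypothetical helper, and the thread-id-to-NUMA lookup table is assumed to have been built by the caller, for example from cadvisor MachineInfo. The resource name `example.com/nic` is also made up.

```go
package main

import (
	"fmt"
	"sort"
)

// ContainerDevices is a hand-written stand-in for the proto message of the
// same name; it is not the generated gRPC type.
type ContainerDevices struct {
	ResourceName string
	DeviceIDs    []string
	NUMAID       uint32
}

// ContainerResources is a hand-written stand-in for the proto message of the
// same name, including the new cpu_ids field.
type ContainerResources struct {
	Name    string
	Devices []ContainerDevices
	CPUIDs  []uint32
}

// cpusByNUMA (hypothetical helper) groups a container's cpu_ids by NUMA node.
// threadToNUMA maps a thread id to its NUMA node id and is assumed to be
// supplied by the caller. CPU ids missing from the table are returned
// separately instead of being guessed.
func cpusByNUMA(c ContainerResources, threadToNUMA map[uint32]uint32) (map[uint32][]uint32, []uint32) {
	byNode := make(map[uint32][]uint32)
	var unknown []uint32
	for _, id := range c.CPUIDs {
		node, ok := threadToNUMA[id]
		if !ok {
			unknown = append(unknown, id)
			continue
		}
		byNode[node] = append(byNode[node], id)
	}
	return byNode, unknown
}

func main() {
	c := ContainerResources{
		Name: "app",
		Devices: []ContainerDevices{
			{ResourceName: "example.com/nic", DeviceIDs: []string{"dev0"}, NUMAID: 1},
		},
		CPUIDs: []uint32{2, 3, 10},
	}
	// Made-up topology: threads 2 and 3 on node 0, thread 10 on node 1.
	topology := map[uint32]uint32{2: 0, 3: 0, 10: 1}

	byNode, unknown := cpusByNUMA(c, topology)

	// Sort node ids so the printed order is stable.
	nodes := make([]uint32, 0, len(byNode))
	for n := range byNode {
		nodes = append(nodes, n)
	}
	sort.Slice(nodes, func(i, j int) bool { return nodes[i] < nodes[j] })

	for _, n := range nodes {
		fmt.Printf("NUMA node %d: cpus %v\n", n, byNode[n])
	}
	for _, d := range c.Devices {
		fmt.Printf("device resource %s on NUMA node %d: %v\n", d.ResourceName, d.NUMAID, d.DeviceIDs)
	}
	fmt.Printf("unmapped cpus: %v\n", unknown)
}
```

Devices carry their NUMA node directly in `numaid`, whereas CPUs need the external lookup, which is why the helper takes the table as a parameter rather than deriving it.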
98107

@@ -113,7 +122,7 @@ message ContainerDevices {
113122
* Notes:
114123
* Does not include any reference to resource names. Monitoring agents must identify devices by the device or environment variables passed to the pod or container.
115124

116-
### Add a field to Pod Status.
125+
### Add a field to Pod Status.
117126
* Pros:
118127
* Allows for observation of container to device bindings local to the node through the `/pods` endpoint
119128
* Cons:
@@ -148,7 +157,7 @@ type Container struct {
148157
}
149158
```
150159
* During Kubelet pod admission, if `ComputeDevices` is non-empty, the specified devices will be allocated; otherwise, behaviour will remain the same as it is today.
151-
* Before starting the pod, the kubelet writes the assigned `ComputeDevices` back to the pod spec.
160+
* Before starting the pod, the kubelet writes the assigned `ComputeDevices` back to the pod spec.
152161
* Note: Writing to the API server and waiting to observe the updated pod spec in the kubelet's pod watch may add significant latency to pod startup.
153162
* Allows devices to potentially be assigned by a custom scheduler.
154163
* Serves as a permanent record of device assignments for the kubelet, and eliminates the need for the kubelet to maintain this state locally.
