|
| 1 | +# Kube-State-Metrics - Simplify Custom Resource State Metrics API Proposal |
| 2 | + |
| 3 | + |
| 4 | +--- |
| 5 | + |
| 6 | +Author: Catherine Fang (CatherineF-dev@), Han Kang (logicalhan@) |
| 7 | + |
| 8 | +Date: 7. May 2023 |
| 9 | + |
| 10 | +Target release: v |
| 11 | + |
| 12 | +--- |
| 13 | + |
| 14 | + |
| 15 | +## Glossary |
| 16 | +- CR: custom resource, similar to an instance of a class |
| 17 | +- CRD: custom resource definition, similar to a class |
| 18 | + |
| 19 | +## Problem Statement |
| 20 | + |
| 21 | +### Background |
| 22 | +The current [Custom Resource State Metrics](https://github.com/kubernetes/kube-state-metrics/blob/main/docs/customresourcestate-metrics.md#multiple-metricskitchen-sink) API supports 8+ operations for extracting metric values and labels from a custom resource: |
| 23 | +- each |
| 24 | +- path |
| 25 | +- labelFromKey |
| 26 | +- labelsFromPath |
| 27 | +- valueFrom |
| 28 | +- commonLabels |
| 29 | +- labelsFromPath |
| 30 | +- *. |
| 31 | +- ... |
| 32 | + |
| 33 | +### Problem |
| 34 | +1. The custom resource metrics API doesn't scale and is hard to maintain. |
| 35 | +   1.1 Maintenance work grows with the number of operations (currently 8), and there are already several bugs around them. For example, "Crash on nonexistent metric paths in custom resources" (#1992). |
| 36 | +   1.2 More operations might be added to satisfy other needs. |
| 37 | +2. The custom resource metrics API, with its existing 8 operations, is not complete: some cases aren't covered. For example, it can't report the number of CRs under one CRD. |
| 38 | + |
| 39 | +## Goal |
| 40 | + |
| 41 | +- Replace the 8 operations with a single operation to reduce maintenance work. |
| 42 | +- Provide a complete API that supports more cases, for example counting the number of CRs under one CRD. |
| 43 | + |
| 44 | +## Proposal |
| 45 | + |
| 46 | +Use the Common Expression Language ([CEL](https://kubernetes.io/docs/reference/using-api/cel/)) to extract fields from a custom resource as metric labels or metric values. |
| 47 | + |
| 48 | + |
| 49 | +``` |
| 50 | +kind: CustomResourceStateMetricsV2 |
| 51 | +spec: |
| 52 | +  resources: |
| 53 | +    - groupVersionKind: |
| 54 | +        group: myteam.io |
| 55 | +        kind: "Foo" |
| 56 | +        version: "v1" |
| 57 | +      mode: for_loop # or merged |
| 58 | +      metrics: |
| 59 | +        - name: "ready_count" |
| 60 | +          help: "Number Foo Bars ready" |
| 61 | +          values: x.cel_selection_1 # [2, 4] |
| 62 | +          labels: |
| 63 | +            - x.cel_selection_2 # [{"cr_name": "bar"}], a single item is copied so that it applies to both values |
| 64 | +            - x.cel_selection_3 # [{"active": 1}, {"active": 3}] |
| 65 | +            - x.cel_selection_4 # [{"name": "type-a"}, {"name": "type-b"}] |
| 66 | +``` |
| 67 | + |
| 68 | +Mode has two options: |
| 69 | +- for_loop: x is bound to each CR in turn, and the expressions are evaluated once per CR. |
| 70 | +- merged: x is bound to a single map that merges all CRs under one CRD, keyed by CR name: x := {"cr_name_foo": cr1, "cr_name_bar": cr2, ...}. This mode makes it possible to count the number of CRs under one CRD. |
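
The two modes can be sketched in Python as follows. This is only an illustration of the binding described above, not the proposed implementation: the helper name `bind_x` is hypothetical, and keying the merged map by `metadata.name` is an assumption based on the `cr_name_foo` / `cr_name_bar` keys in the text.

```python
def bind_x(crs, mode):
    """Return the values that x takes, one entry per round of CEL evaluation.

    Hypothetical helper; the proposal does not define this function.
    """
    if mode == "for_loop":
        # x is each CR in turn: one evaluation per CR.
        return list(crs)
    if mode == "merged":
        # x is a single map of all CRs, keyed by CR name (assumed: metadata.name).
        return [{cr["metadata"]["name"]: cr for cr in crs}]
    raise ValueError(f"unknown mode: {mode!r}")


crs = [
    {"metadata": {"name": "foo"}, "status": {"phase": "Pending"}},
    {"metadata": {"name": "bar"}, "status": {"phase": "Ready"}},
]

per_cr = bind_x(crs, "for_loop")
merged = bind_x(crs, "merged")

# In merged mode, counting CRs is the size of the map (analogous to CEL's size(x)).
cr_count = len(merged[0])
```

In `merged` mode a single expression sees every CR at once, which is what makes the "number of CRs under one CRD" case expressible.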
| 71 | + |
| 72 | +In this example (mode: for_loop), x is one CR under the CRD (myteam.io/v1 Foo). |
| 73 | +Assuming there are N CRs under this CRD, it generates these metrics: |
| 74 | +- ready_count{cr_name=cr_1, active=1, name=type-a} = 2 |
| 75 | +- ready_count{cr_name=cr_1, active=3, name=type-b} = 4 |
| 76 | +- ... |
| 77 | +- ready_count{cr_name=cr_n, active=2, name=type-c} = 5 |
| 78 | +- ready_count{cr_name=cr_n, active=3, name=type-d} = 6 |
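
The expansion above can be sketched in Python. The proposal only states that a single-item label list "will be copied" to match the values, so the alignment rule below (single-item lists are repeated, other lists must have one entry per value) and the name `expand_metric` are assumptions made for illustration.

```python
def expand_metric(name, values, label_lists):
    """Combine per-series values with label lists into (name, labels, value) tuples.

    Hypothetical helper: a 1-item label list is repeated for every value;
    any other label list must contain exactly one item per value.
    """
    series = []
    for i, value in enumerate(values):
        labels = {}
        for label_list in label_lists:
            if len(label_list) == 1:
                labels.update(label_list[0])
            elif len(label_list) == len(values):
                labels.update(label_list[i])
            else:
                raise ValueError("label list length must be 1 or match the number of values")
        series.append((name, labels, value))
    return series


# Results of the CEL selections for one CR, taken from the example comments.
values = [2, 4]
label_lists = [
    [{"cr_name": "cr_1"}],
    [{"active": 1}, {"active": 3}],
    [{"name": "type-a"}, {"name": "type-b"}],
]

series = expand_metric("ready_count", values, label_lists)
for metric_name, labels, value in series:
    rendered = ", ".join(f"{k}={v}" for k, v in labels.items())
    print(f"{metric_name}{{{rendered}}} = {value}")
```

Under these assumptions, one CR yields one series per entry in `values`, and repeating the call for each of the N CRs gives the full list.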
| 79 | + |
| 80 | +### Mapping between existing operations and CEL |
| 81 | +| Operation | CEL | |
| 82 | +| :--- | :--- | |
| 83 | +| path: [status, sub] <br> labelFromKey: type | x.status.sub.map(y, {"name": y}) | |
| 84 | +| path: [status, sub] <br> valueFrom: [ready] | x.status.sub.map(y, x.status.sub[y].ready) | |
| 85 | +| commonLabels: <br> custom_metric: "yes" | [{ "custom_metric":"yes" }] | |
| 86 | +| labelsFromPath: <br> "*": [metadata, labels] | [x.metadata.labels] | |
| 87 | +| labelsFromPath: <br> foo: [metadata, labels, foo] | [{'foo': x.metadata.labels.foo}] | |
| 88 | + |
| 89 | +## Example |
| 90 | +### CR |
| 91 | +``` |
| 92 | +kind: Foo |
| 93 | +apiVersion: myteam.io/v1 |
| 94 | +metadata: |
| 95 | +  annotations: |
| 96 | +    bar: baz |
| 97 | +    qux: quxx |
| 98 | +  labels: |
| 99 | +    foo: bar |
| 100 | +  name: foo |
| 101 | +spec: |
| 102 | +  version: v1.2.3 |
| 103 | +  order: |
| 104 | +    - id: 1 |
| 105 | +      value: true |
| 106 | +    - id: 3 |
| 107 | +      value: false |
| 108 | +  replicas: 1 |
| 109 | +status: |
| 110 | +  phase: Pending |
| 111 | +  active: |
| 112 | +    type-a: 1 |
| 113 | +    type-b: 3 |
| 114 | +  conditions: |
| 115 | +    - name: a |
| 116 | +      value: 45 |
| 117 | +    - name: b |
| 118 | +      value: 66 |
| 119 | +  sub: |
| 120 | +    type-a: |
| 121 | +      active: 1 |
| 122 | +      ready: 2 |
| 123 | +    type-b: |
| 124 | +      active: 3 |
| 125 | +      ready: 4 |
| 126 | +  uptime: 43.21 |
| 127 | +``` |
| 128 | + |
| 129 | +### Existing API - CustomResourceStateMetrics |
| 130 | +``` |
| 131 | +kind: CustomResourceStateMetrics |
| 132 | +spec: |
| 133 | +  resources: |
| 134 | +    - groupVersionKind: |
| 135 | +        group: myteam.io |
| 136 | +        kind: "Foo" |
| 137 | +        version: "v1" |
| 138 | +      # labels can be added to all metrics from a resource |
| 139 | +      commonLabels: |
| 140 | +        crd_type: "foo" |
| 141 | +      labelsFromPath: |
| 142 | +        name: [metadata, name] |
| 143 | +      metrics: |
| 144 | +        - name: "ready_count" |
| 145 | +          help: "Number Foo Bars ready" |
| 146 | +          each: |
| 147 | +            type: Gauge |
| 148 | +            gauge: |
| 149 | +              # targeting an object or array will produce a metric for each element |
| 150 | +              # labelsFromPath and value are relative to this path |
| 151 | +              path: [status, sub] |
| 152 | + |
| 153 | +              # if path targets an object, the object key will be used as label value |
| 154 | +              # This is not supported for StateSet type as all values will be truthy, which is redundant. |
| 155 | +              labelFromKey: type |
| 156 | +              # label values can be resolved specific to this path |
| 157 | +              labelsFromPath: |
| 158 | +                active: [active] |
| 159 | +              # The actual field to use as metric value. Should be a number, boolean or RFC3339 timestamp string. |
| 160 | +              valueFrom: [ready] |
| 161 | +          commonLabels: |
| 162 | +            custom_metric: "yes" |
| 163 | +          labelsFromPath: |
| 164 | +            # whole objects may be copied into labels by prefixing with "*" |
| 165 | +            # *anything will be copied into labels, with the highest sorted * strings first |
| 166 | +            "*": [metadata, labels] |
| 167 | +            "**": [metadata, annotations] |
| 168 | + |
| 169 | +            # or specific fields may be copied. these fields will always override values from *s |
| 170 | +            name: [metadata, name] |
| 171 | +            foo: [metadata, labels, foo] |
| 172 | +``` |
| 173 | + |
| 174 | +### Proposed API - CustomResourceStateMetricsV2 |
| 175 | +``` |
| 176 | +kind: CustomResourceStateMetricsV2 |
| 177 | +spec: |
| 178 | +  resources: |
| 179 | +    - groupVersionKind: |
| 180 | +        group: myteam.io |
| 181 | +        kind: "Foo" |
| 182 | +        version: "v1" |
| 183 | +      mode: for_loop # or merged |
| 184 | +      metrics: |
| 185 | +        - name: "ready_count" |
| 186 | +          help: "Number Foo Bars ready" |
| 187 | +          values: x.status.sub.map(y, x.status.sub[y].ready) # a CEL query. jq '[.status.sub[].ready]', valueFrom: [ready] // [2, 4] |
| 188 | +          labels: |
| 189 | +            - x.status.sub.map(y, {"name": y}) # a CEL query. jq '[ .status.sub | keys | .[] | {name: .}]', labelFromKey: type // [{"name": "type-a"}, {"name": "type-b"}] |
| 190 | +            - [{ "custom_metric":"yes" }] # a CEL query. jq '[{ custom_metric:"yes" }]', custom_metric: "yes" // [{"custom_metric": "yes"}] |
| 191 | +            - [x.metadata.labels] # a CEL query. jq '[.metadata.labels]', "*": [metadata, labels] // [{"foo": "bar"}] |
| 192 | +            - [x.metadata.annotations] # a CEL query. jq '[.metadata.annotations]', "**": [metadata, annotations] // [{"bar": "baz", "qux": "quxx"}] |
| 193 | +            - [{'name': x.metadata.name}] # a CEL query. jq '[{ name: .metadata.name }]', name: [metadata, name] // [{"name": "foo"}] |
| 194 | +            - [{'foo': x.metadata.labels.foo}] # a CEL query. jq '[{ foo: .metadata.labels.foo }]', foo: [metadata, labels, foo] // [{"foo": "bar"}] |
| 195 | +            - x.status.sub.map(y, {"active": x.status.sub[y].active}) # a CEL query. jq '[.status.sub[].active | {active: .}]', labelsFromPath: active: [active] // [{"active": 1}, {"active": 3}] |
| 196 | +``` |
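
To make the expected results in the comments easier to check, here is a rough Python analogue of the CEL expressions above, evaluated against the relevant parts of the example CR. It is a sketch rather than a CEL evaluation: list comprehensions over dictionary keys stand in for CEL's `map` over map keys, and it relies on Python dictionaries preserving insertion order. A real implementation would need values and labels to iterate the keys of `status.sub` in the same order to stay aligned.

```python
# Subset of the example CR that the expressions touch.
x = {
    "metadata": {
        "annotations": {"bar": "baz", "qux": "quxx"},
        "labels": {"foo": "bar"},
        "name": "foo",
    },
    "status": {
        "sub": {
            "type-a": {"active": 1, "ready": 2},
            "type-b": {"active": 3, "ready": 4},
        },
    },
}

sub = x["status"]["sub"]

# x.status.sub.map(y, x.status.sub[y].ready)
values = [sub[y]["ready"] for y in sub]

# x.status.sub.map(y, {"name": y})
name_labels = [{"name": y} for y in sub]

# x.status.sub.map(y, {"active": x.status.sub[y].active})
active_labels = [{"active": sub[y]["active"]} for y in sub]

# [{ "custom_metric": "yes" }], [x.metadata.labels], [{'foo': x.metadata.labels.foo}]
static_labels = [{"custom_metric": "yes"}]
copied_labels = [x["metadata"]["labels"]]
foo_label = [{"foo": x["metadata"]["labels"]["foo"]}]
```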
| 197 | + |