Commit 46aa1c4

Author: Miguel Varela Ramos
Log relevant kubernetes events with each deployed API (#1906)
1 parent ed433a2 commit 46aa1c4

21 files changed, with 422 additions and 107 deletions


CONTRIBUTING.md

Lines changed: 2 additions & 0 deletions
@@ -194,6 +194,7 @@ image_prometheus_config_reloader: <account_id>.dkr.ecr.<region>.amazonaws.com/co
 image_prometheus_operator: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/prometheus-operator:master
 image_prometheus_statsd_exporter: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/prometheus-statsd-exporter:master
 image_grafana: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/grafana:master
+image_event_exporter: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/event-exporter:master
 ```

 Create `dev/config/cluster-gcp.yaml`. Paste the following config, and update `project`, `zone`, and all registry URLs (replace `<project_id>` with your project ID, and update `gcr.io` if you are using a different host):
@@ -222,6 +223,7 @@ image_prometheus_config_reloader: gcr.io/<project_id>/cortexlabs/prometheus-conf
 image_prometheus_operator: gcr.io/<project_id>/cortexlabs/prometheus-operator:master
 image_prometheus_statsd_exporter: gcr.io/<project_id>/cortexlabs/prometheus-statsd-exporter:master
 image_grafana: gcr.io/<project_id>/cortexlabs/grafana:master
+image_event_exporter: gcr.io/<project_id>/cortexlabs/event-exporter:master
 ```

 ### Building

build/images.sh

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@ non_dev_images_cluster=(
   "prometheus-operator"
   "prometheus-statsd-exporter"
   "grafana"
+  "event-exporter"
 )
 non_dev_images_aws=(
   # includes non_dev_images_cluster
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  namespace: {{ .Release.Namespace }}
+  name: event-exporter
+
+---
+
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: event-exporter
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: view
+subjects:
+  - kind: ServiceAccount
+    namespace: {{ .Release.Namespace }}
+    name: event-exporter
+
+---
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: event-exporter-config
+  namespace: {{ .Release.Namespace }}
+data:
+  config.yaml: |
+    logLevel: error
+    logFormat: json
+    route:
+      routes:
+        - match:
+            - receiver: "stdout"
+              labels:
+                cortex.dev/api: true
+    receivers:
+      - name: "stdout"
+        file:
+          path: "/dev/stdout"
+
+---
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: event-exporter
+  namespace: {{ .Release.Namespace }}
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: event-exporter
+  template:
+    metadata:
+      labels:
+        app: event-exporter
+    spec:
+      serviceAccountName: event-exporter
+      containers:
+        - name: event-exporter
+          image: {{ .Values.cortex.image_event_exporter }}
+          imagePullPolicy: IfNotPresent
+          args:
+            - -conf=/data/config.yaml
+          volumeMounts:
+            - mountPath: /data
+              name: event-exporter-config
+      volumes:
+        - name: event-exporter-config
+          configMap:
+            name: event-exporter-config

charts/templates/fluentbit.yaml

Lines changed: 45 additions & 0 deletions
@@ -52,6 +52,8 @@ data:

     @INCLUDE input-kubernetes.conf
     @INCLUDE filter-kubernetes.conf
+    @INCLUDE filter-k8s-events.conf
+    @INCLUDE filter-stackdriver-format.conf
     @INCLUDE output.conf

   input-kubernetes.conf: |
@@ -104,6 +106,47 @@ data:
         Match k8s_container.*
         Remove_wildcard k8s.

+  filter-k8s-events.conf: |
+    [FILTER]
+        Name nest
+        Match k8s_container.*.event-exporter-*
+        Operation lift
+        Nested_under involvedObject
+        Add_prefix involvedObject.
+
+    [FILTER]
+        Name modify
+        Match k8s_container.*.event-exporter-*
+        Condition Key_exists labels
+        Rename labels k8s.labels
+
+    [FILTER]
+        Name modify
+        Match k8s_container.*.event-exporter-*
+        Condition Key_exists involvedObject.labels
+        Hard_copy involvedObject.labels labels
+
+    [FILTER]
+        Name nest
+        Match k8s_container.*.event-exporter-*
+        Operation nest
+        Wildcard involvedObject.*
+        Nest_under involvedObject
+        Remove_prefix involvedObject.
+
+  filter-stackdriver-format.conf: |
+    [FILTER]
+        Name modify
+        Match k8s_container.*
+        Condition Key_exists log
+        Rename log message
+
+    [FILTER]
+        Name modify
+        Match k8s_container.*
+        Condition Key_exists levelname
+        Rename levelname level
+
   output.conf: |
     {{- if eq .Values.global.provider "aws" }}
     [OUTPUT]
@@ -122,6 +165,8 @@ data:
         resource k8s_container
         k8s_cluster_name {{ .Values.cortex.cluster_name }}
         k8s_cluster_location {{ .Values.cortex.zone }}
+        severity_key level
+        labels_key labels
     {{- end }}

   parsers.conf: |

charts/values.yaml

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ cortex:
   image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:master
   image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:master
   image_grafana: quay.io/cortexlabs/grafana:master
+  image_event_exporter: quay.io/cortexlabs/event-exporter:master

 networking:
   istio-discovery:

cli/cmd/lib_cluster_config_aws.go

Lines changed: 4 additions & 0 deletions
@@ -428,6 +428,10 @@ func setConfigFieldsFromCached(userClusterConfig *clusterconfig.Config, cachedCl
 		return clusterconfig.ErrorConfigCannotBeChangedOnUpdate(clusterconfig.ImageGrafanaKey, cachedClusterConfig.ImageGrafana)
 	}

+	if s.Obj(cachedClusterConfig.ImageEventExporter) != s.Obj(userClusterConfig.ImageEventExporter) {
+		return clusterconfig.ErrorConfigCannotBeChangedOnUpdate(clusterconfig.ImageEventExporterKey, cachedClusterConfig.ImageEventExporter)
+	}
+
 	if userClusterConfig.Spot != nil && *userClusterConfig.Spot != *cachedClusterConfig.Spot {
 		return clusterconfig.ErrorConfigCannotBeChangedOnUpdate(clusterconfig.SpotKey, *cachedClusterConfig.Spot)
 	}
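
The hunk above adds the event exporter image to the fields that cannot change on a cluster update: the cached value is compared with the user-supplied one, and a mismatch returns an error. A minimal, self-contained sketch of that pattern follows. The names `clusterConfig`, `errCannotBeChangedOnUpdate`, `checkImmutableImages`, and the key strings are hypothetical stand-ins for illustration, not the actual `clusterconfig` API.

```go
package main

import "fmt"

// clusterConfig is a hypothetical, trimmed-down stand-in for clusterconfig.Config.
type clusterConfig struct {
	ImageGrafana       string
	ImageEventExporter string
}

// errCannotBeChangedOnUpdate is a hypothetical stand-in for
// clusterconfig.ErrorConfigCannotBeChangedOnUpdate.
func errCannotBeChangedOnUpdate(key, prev string) error {
	return fmt.Errorf("%s cannot be changed on update (previous value: %s)", key, prev)
}

// checkImmutableImages returns an error for the first image field whose
// user-supplied value differs from the cached value, or nil if all match.
func checkImmutableImages(cached, user clusterConfig) error {
	fields := []struct{ key, cached, user string }{
		// Key strings are assumed to mirror the YAML config keys.
		{"image_grafana", cached.ImageGrafana, user.ImageGrafana},
		{"image_event_exporter", cached.ImageEventExporter, user.ImageEventExporter},
	}
	for _, f := range fields {
		if f.cached != f.user {
			return errCannotBeChangedOnUpdate(f.key, f.cached)
		}
	}
	return nil
}

func main() {
	cached := clusterConfig{
		ImageGrafana:       "quay.io/cortexlabs/grafana:master",
		ImageEventExporter: "quay.io/cortexlabs/event-exporter:master",
	}
	user := cached
	user.ImageEventExporter = "example.com/event-exporter:custom" // hypothetical changed value
	fmt.Println(checkImmutableImages(cached, user))
}
```

Driving the checks from a table keeps each new image field to a single added line, whereas the commit itself repeats the explicit `if` block per field.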

dev/versions.md

Lines changed: 7 additions & 0 deletions
@@ -330,6 +330,13 @@ supported (<https://github.com/awslabs/amazon-eks-ami/issues/176>)
 1. Update the base image version in `images/grafana/Dockerfile`.
 1. Update `grafana.yaml` as necessary, if that's the case.

+## Event Exporter
+
+1. Find the latest release
+   on [GitHub](https://github.com/opsgenie/kubernetes-event-exporter).
+1. Update the base image version in `images/event-exporter/Dockerfile`.
+1. Update `event-exporter.yaml` as necessary, if that's the case.
+
 ## aws-iam-authenticator

 1. Find the latest release [here](https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html)

docs/clusters/aws/install.md

Lines changed: 1 addition & 0 deletions
@@ -109,4 +109,5 @@ image_prometheus_config_reloader: quay.io/cortexlabs/prometheus-config-reloader:
 image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:master
 image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:master
 image_grafana: quay.io/cortexlabs/grafana:master
+image_event_exporter: quay.io/cortexlabs/event-exporter:master
 ```

docs/clusters/gcp/install.md

Lines changed: 1 addition & 0 deletions
@@ -83,4 +83,5 @@ image_prometheus_config_reloader: quay.io/cortexlabs/prometheus-config-reloader:
 image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:master
 image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:master
 image_grafana: quay.io/cortexlabs/grafana:master
+image_event_exporter: quay.io/cortexlabs/event-exporter:master
 ```

docs/clusters/gcp/logging.md

Lines changed: 5 additions & 5 deletions
@@ -7,18 +7,18 @@ RealtimeAPI:
 ```text
 resource.type="k8s_container"
 resource.labels.cluster_name="<INSERT CLUSTER NAME>"
-jsonPayload.labels.apiKind="RealtimeAPI"
-jsonPayload.labels.apiName="<INSERT API NAME>"
+labels.apiKind="RealtimeAPI"
+labels.apiName="<INSERT API NAME>"
 ```

 TaskAPI:

 ```text
 resource.type="k8s_container"
 resource.labels.cluster_name="<INSERT CLUSTER NAME>"
-jsonPayload.labels.apiKind="TaskAPI"
-jsonPayload.labels.apiName="<INSERT API NAME>"
-jsonPayload.labels.jobID="<INSERT JOB ID>"
+labels.apiKind="TaskAPI"
+labels.apiName="<INSERT API NAME>"
+labels.jobID="<INSERT JOB ID>"
 ```

 Please make sure to navigate to the project containing your cluster and adjust the time range accordingly before running queries.
