Log relevant kubernetes events with each deployed API #1906

Merged 12 commits on Feb 26, 2021
2 changes: 2 additions & 0 deletions CONTRIBUTING.md
@@ -194,6 +194,7 @@ image_prometheus_config_reloader: <account_id>.dkr.ecr.<region>.amazonaws.com/co
image_prometheus_operator: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/prometheus-operator:master
image_prometheus_statsd_exporter: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/prometheus-statsd-exporter:master
image_grafana: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/grafana:master
image_event_exporter: <account_id>.dkr.ecr.<region>.amazonaws.com/cortexlabs/event-exporter:master
```

Create `dev/config/cluster-gcp.yaml`. Paste the following config, and update `project`, `zone`, and all registry URLs (replace `<project_id>` with your project ID, and update `gcr.io` if you are using a different host):
@@ -222,6 +223,7 @@ image_prometheus_config_reloader: gcr.io/<project_id>/cortexlabs/prometheus-conf
image_prometheus_operator: gcr.io/<project_id>/cortexlabs/prometheus-operator:master
image_prometheus_statsd_exporter: gcr.io/<project_id>/cortexlabs/prometheus-statsd-exporter:master
image_grafana: gcr.io/<project_id>/cortexlabs/grafana:master
image_event_exporter: gcr.io/<project_id>/cortexlabs/event-exporter:master
```

### Building
1 change: 1 addition & 0 deletions build/images.sh
@@ -59,6 +59,7 @@ non_dev_images_cluster=(
  "prometheus-operator"
  "prometheus-statsd-exporter"
  "grafana"
  "event-exporter"
)
non_dev_images_aws=(
  # includes non_dev_images_cluster
74 changes: 74 additions & 0 deletions charts/templates/event-exporter.yaml
@@ -0,0 +1,74 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: {{ .Release.Namespace }}
  name: event-exporter

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    namespace: {{ .Release.Namespace }}
    name: event-exporter

---

apiVersion: v1
kind: ConfigMap
metadata:
  name: event-exporter-config
  namespace: {{ .Release.Namespace }}
data:
  config.yaml: |
    logLevel: error
    logFormat: json
    route:
      routes:
        - match:
            - receiver: "stdout"
              labels:
                cortex.dev/api: true
    receivers:
      - name: "stdout"
        file:
          path: "/dev/stdout"

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: event-exporter
  namespace: {{ .Release.Namespace }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: event-exporter
  template:
    metadata:
      labels:
        app: event-exporter
    spec:
      serviceAccountName: event-exporter
      containers:
        - name: event-exporter
          image: {{ .Values.cortex.image_event_exporter }}
          imagePullPolicy: IfNotPresent
          args:
            - -conf=/data/config.yaml
          volumeMounts:
            - mountPath: /data
              name: event-exporter-config
      volumes:
        - name: event-exporter-config
          configMap:
            name: event-exporter-config
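
The `route` block in the ConfigMap above sends an event to the `stdout` receiver only when its involved object carries the label `cortex.dev/api: true`. The following is a minimal Python sketch of that selection rule, not the exporter's actual implementation: it assumes plain string equality on labels (the real exporter may match differently), and `matches_rule`, `RULE_LABELS`, and the sample events are hypothetical names introduced for illustration.

```python
def matches_rule(event: dict, rule_labels: dict) -> bool:
    """Return True if the involved object carries every label in the rule."""
    labels = event.get("involvedObject", {}).get("labels") or {}
    # Assumption: plain string equality; the real exporter may match differently.
    return all(labels.get(key) == value for key, value in rule_labels.items())


# Mirrors the `labels` block of the route; YAML `true` is treated as the string "true" here.
RULE_LABELS = {"cortex.dev/api": "true"}

# Hypothetical events, trimmed to the fields the rule inspects.
api_event = {
    "reason": "FailedScheduling",
    "involvedObject": {"kind": "Pod", "labels": {"cortex.dev/api": "true", "apiName": "iris"}},
}
system_event = {
    "reason": "Pulled",
    "involvedObject": {"kind": "Pod", "labels": {"app": "grafana"}},
}

for event in (api_event, system_event):
    target = "stdout" if matches_rule(event, RULE_LABELS) else "dropped"
    print(event["reason"], "->", target)
```

Only events attached to API workloads reach the container's stdout, which is where Fluent Bit picks them up.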
45 changes: 45 additions & 0 deletions charts/templates/fluentbit.yaml
@@ -52,6 +52,8 @@ data:

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE filter-k8s-events.conf
    @INCLUDE filter-stackdriver-format.conf
    @INCLUDE output.conf

  input-kubernetes.conf: |
@@ -104,6 +106,47 @@ data:
        Match k8s_container.*
        Remove_wildcard k8s.

  filter-k8s-events.conf: |
    [FILTER]
        Name nest
        Match k8s_container.*.event-exporter-*
        Operation lift
        Nested_under involvedObject
        Add_prefix involvedObject.

    [FILTER]
        Name modify
        Match k8s_container.*.event-exporter-*
        Condition Key_exists labels
        Rename labels k8s.labels

    [FILTER]
        Name modify
        Match k8s_container.*.event-exporter-*
        Condition Key_exists involvedObject.labels
        Hard_copy involvedObject.labels labels

    [FILTER]
        Name nest
        Match k8s_container.*.event-exporter-*
        Operation nest
        Wildcard involvedObject.*
        Nest_under involvedObject
        Remove_prefix involvedObject.

  filter-stackdriver-format.conf: |
    [FILTER]
        Name modify
        Match k8s_container.*
        Condition Key_exists log
        Rename log message

    [FILTER]
        Name modify
        Match k8s_container.*
        Condition Key_exists levelname
        Rename levelname level

  output.conf: |
    {{- if eq .Values.global.provider "aws" }}
    [OUTPUT]
@@ -122,6 +165,8 @@ data:
        resource k8s_container
        k8s_cluster_name {{ .Values.cortex.cluster_name }}
        k8s_cluster_location {{ .Values.cortex.zone }}
        severity_key level
        labels_key labels
    {{- end }}

  parsers.conf: |
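
The four filters in `filter-k8s-events.conf` appear to reshape each event-exporter record so that the top-level `labels` key holds the labels of the involved object (the API pod) rather than those of the exporter's own pod. A rough Python emulation of that chain on a plain dict follows. It is a sketch of my reading of the `nest` and `modify` semantics, not Fluent Bit code; the helper names and the sample record are hypothetical.

```python
import copy


def lift(record: dict, key: str, prefix: str) -> dict:
    """nest/lift: move the children of record[key] to the top level with a prefix."""
    out = {k: v for k, v in record.items() if k != key}
    nested = record.get(key)
    if isinstance(nested, dict):
        for child, value in nested.items():
            out[prefix + child] = value
    elif key in record:
        out[key] = nested  # not a map: leave the key untouched
    return out


def rename(record: dict, old: str, new: str) -> dict:
    """modify/Rename: applied only if `old` exists and `new` does not (assumed semantics)."""
    out = dict(record)
    if old in out and new not in out:
        out[new] = out.pop(old)
    return out


def hard_copy(record: dict, src: str, dst: str) -> dict:
    """modify/Hard_copy: copy src to dst, overwriting dst if present."""
    out = dict(record)
    if src in out:
        out[dst] = copy.deepcopy(out[src])
    return out


def nest(record: dict, prefix: str, under: str) -> dict:
    """nest/nest: gather keys starting with prefix under one map, stripping the prefix."""
    out, nested = {}, {}
    for key, value in record.items():
        if key.startswith(prefix):
            nested[key[len(prefix):]] = value
        else:
            out[key] = value
    if nested:
        out[under] = nested
    return out


def reshape_event(record: dict) -> dict:
    """Apply the four filters in the order they appear in the config."""
    r = lift(record, "involvedObject", "involvedObject.")
    r = rename(r, "labels", "k8s.labels")
    r = hard_copy(r, "involvedObject.labels", "labels")
    return nest(r, "involvedObject.", "involvedObject")


# Hypothetical record as it might look before the event filters run.
sample = {
    "reason": "FailedScheduling",
    "labels": {"app": "event-exporter"},
    "involvedObject": {
        "kind": "Pod",
        "name": "api-iris-abc",
        "labels": {"apiName": "iris", "apiKind": "RealtimeAPI"},
    },
}
reshaped = reshape_event(sample)
print(reshaped["labels"])
```

After the chain runs, `labels` carries the API's labels, so the same `labels.apiName` and `labels.apiKind` queries used for API logs should also return the related Kubernetes events.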
1 change: 1 addition & 0 deletions charts/values.yaml
@@ -29,6 +29,7 @@ cortex:
  image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:master
  image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:master
  image_grafana: quay.io/cortexlabs/grafana:master
  image_event_exporter: quay.io/cortexlabs/event-exporter:master

networking:
  istio-discovery:
4 changes: 4 additions & 0 deletions cli/cmd/lib_cluster_config_aws.go
@@ -428,6 +428,10 @@ func setConfigFieldsFromCached(userClusterConfig *clusterconfig.Config, cachedCl
		return clusterconfig.ErrorConfigCannotBeChangedOnUpdate(clusterconfig.ImageGrafanaKey, cachedClusterConfig.ImageGrafana)
	}

	if s.Obj(cachedClusterConfig.ImageEventExporter) != s.Obj(userClusterConfig.ImageEventExporter) {
		return clusterconfig.ErrorConfigCannotBeChangedOnUpdate(clusterconfig.ImageEventExporterKey, cachedClusterConfig.ImageEventExporter)
	}

	if userClusterConfig.Spot != nil && *userClusterConfig.Spot != *cachedClusterConfig.Spot {
		return clusterconfig.ErrorConfigCannotBeChangedOnUpdate(clusterconfig.SpotKey, *cachedClusterConfig.Spot)
	}
7 changes: 7 additions & 0 deletions dev/versions.md
@@ -330,6 +330,13 @@ supported (<https://github.com/awslabs/amazon-eks-ami/issues/176>)
1. Update the base image version in `images/grafana/Dockerfile`.
1. Update `grafana.yaml` as necessary, if that's the case.

## Event Exporter

1. Find the latest release
on [GitHub](https://github.com/opsgenie/kubernetes-event-exporter).
1. Update the base image version in `images/event-exporter/Dockerfile`.
1. Update `event-exporter.yaml` as necessary, if that's the case.

## aws-iam-authenticator

1. Find the latest release [here](https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html)
1 change: 1 addition & 0 deletions docs/clusters/aws/install.md
@@ -109,4 +109,5 @@ image_prometheus_config_reloader: quay.io/cortexlabs/prometheus-config-reloader:
image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:master
image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:master
image_grafana: quay.io/cortexlabs/grafana:master
image_event_exporter: quay.io/cortexlabs/event-exporter:master
```
1 change: 1 addition & 0 deletions docs/clusters/gcp/install.md
@@ -83,4 +83,5 @@ image_prometheus_config_reloader: quay.io/cortexlabs/prometheus-config-reloader:
image_prometheus_operator: quay.io/cortexlabs/prometheus-operator:master
image_prometheus_statsd_exporter: quay.io/cortexlabs/prometheus-statsd-exporter:master
image_grafana: quay.io/cortexlabs/grafana:master
image_event_exporter: quay.io/cortexlabs/event-exporter:master
```
10 changes: 5 additions & 5 deletions docs/clusters/gcp/logging.md
@@ -7,18 +7,18 @@ RealtimeAPI:
```text
resource.type="k8s_container"
resource.labels.cluster_name="<INSERT CLUSTER NAME>"
-jsonPayload.labels.apiKind="RealtimeAPI"
-jsonPayload.labels.apiName="<INSERT API NAME>"
+labels.apiKind="RealtimeAPI"
+labels.apiName="<INSERT API NAME>"
```

TaskAPI:

```text
resource.type="k8s_container"
resource.labels.cluster_name="<INSERT CLUSTER NAME>"
-jsonPayload.labels.apiKind="TaskAPI"
-jsonPayload.labels.apiName="<INSERT API NAME>"
-jsonPayload.labels.jobID="<INSERT JOB ID>"
+labels.apiKind="TaskAPI"
+labels.apiName="<INSERT API NAME>"
+labels.jobID="<INSERT JOB ID>"
```

Please make sure to navigate to the project containing your cluster and adjust the time range accordingly before running queries.
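
The updated queries filter on top-level `labels.*` fields instead of `jsonPayload.labels.*`, presumably because the Stackdriver output in `fluentbit.yaml` now sets `labels_key labels`. A small Python sketch for assembling such a filter string is shown here; `build_filter` and its parameters are hypothetical helpers for illustration and are not part of Cortex.

```python
from typing import Optional


def build_filter(cluster_name: str, api_kind: str, api_name: str,
                 job_id: Optional[str] = None) -> str:
    """Assemble a Cloud Logging filter using the top-level `labels.*` fields."""
    clauses = [
        'resource.type="k8s_container"',
        f'resource.labels.cluster_name="{cluster_name}"',
        f'labels.apiKind="{api_kind}"',
        f'labels.apiName="{api_name}"',
    ]
    if job_id is not None:
        # Only job-based kinds such as TaskAPI carry a jobID label.
        clauses.append(f'labels.jobID="{job_id}"')
    return "\n".join(clauses)


# Placeholder values; substitute your own cluster, API, and job names.
print(build_filter("my-cluster", "TaskAPI", "my-api", job_id="my-job-id"))
```

The result can be pasted into the Logs Explorer query box as-is, one clause per line.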
1 change: 1 addition & 0 deletions images/event-exporter/Dockerfile
@@ -0,0 +1 @@
FROM opsgenie/kubernetes-event-exporter:0.9
8 changes: 4 additions & 4 deletions manager/install.sh
@@ -61,8 +61,8 @@ function cluster_up_aws() {
  echo "✓"

  echo -n "○ configuring logging "
-  python render_template.py $CORTEX_CLUSTER_CONFIG_FILE manifests/fluent-bit.yaml.j2 > /workspace/fluent-bit.yaml
-  kubectl apply -f /workspace/fluent-bit.yaml >/dev/null
+  python render_template.py $CORTEX_CLUSTER_CONFIG_FILE manifests/fluent-bit.yaml.j2 | kubectl apply -f - >/dev/null
+  envsubst < manifests/event-exporter.yaml | kubectl apply -f - >/dev/null
  echo "✓"

  echo -n "○ configuring metrics "
@@ -120,8 +120,8 @@ function cluster_up_gcp() {
  echo "✓"

  echo -n "○ configuring logging "
-  python render_template.py $CORTEX_CLUSTER_CONFIG_FILE manifests/fluent-bit.yaml.j2 > /workspace/fluent-bit.yaml
-  kubectl apply -f /workspace/fluent-bit.yaml >/dev/null
+  python render_template.py $CORTEX_CLUSTER_CONFIG_FILE manifests/fluent-bit.yaml.j2 | kubectl apply -f - >/dev/null
+  envsubst < manifests/event-exporter.yaml | kubectl apply -f - >/dev/null
  echo "✓"

  echo -n "○ configuring metrics "
88 changes: 88 additions & 0 deletions manager/manifests/event-exporter.yaml
@@ -0,0 +1,88 @@
# Copyright 2021 Cortex Labs, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: default
  name: event-exporter

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    namespace: default
    name: event-exporter

---

apiVersion: v1
kind: ConfigMap
metadata:
  name: event-exporter-config
  namespace: default
data:
  config.yaml: |
    logLevel: error
    logFormat: json
    route:
      routes:
        - match:
            - receiver: "stdout"
              labels:
                cortex.dev/api: true
    receivers:
      - name: "stdout"
        file:
          path: "/dev/stdout"

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: event-exporter
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: event-exporter
  template:
    metadata:
      labels:
        app: event-exporter
    spec:
      serviceAccountName: event-exporter
      containers:
        - name: event-exporter
          image: $CORTEX_IMAGE_EVENT_EXPORTER
          imagePullPolicy: IfNotPresent
          args:
            - -conf=/data/config.yaml
          volumeMounts:
            - mountPath: /data
              name: event-exporter-config
      volumes:
        - name: event-exporter-config
          configMap:
            name: event-exporter-config
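
`manager/install.sh` pipes this manifest through `envsubst`, which fills in `$CORTEX_IMAGE_EVENT_EXPORTER` from the environment before the result reaches `kubectl apply -f -`. A rough Python equivalent of that substitution step, using the standard library's `string.Template`, is sketched here. `render_manifest` is a hypothetical helper, the image value is the default from `charts/values.yaml`, and unlike `envsubst`, `safe_substitute` leaves unknown variables in place instead of replacing them with empty strings.

```python
from string import Template


def render_manifest(text: str, env: dict) -> str:
    """Replace $VAR / ${VAR} placeholders with values from env, leaving unknown ones as-is."""
    return Template(text).safe_substitute(env)


# Fragment of the Deployment spec above.
snippet = "image: $CORTEX_IMAGE_EVENT_EXPORTER\nimagePullPolicy: IfNotPresent\n"

rendered = render_manifest(
    snippet,
    {"CORTEX_IMAGE_EVENT_EXPORTER": "quay.io/cortexlabs/event-exporter:master"},
)
print(rendered)
```

Because the manifest contains a single placeholder, plain variable substitution is enough here and the Jinja-based `render_template.py` step used for Fluent Bit is not needed.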