Add linkedin/kafka-monitor #97

solsson · 2017-11-10T13:42:00Z

On paper, the best answer I've found to #80 :) What remains is learning how to use it.

Adding the monitoring label because combining readiness alerts for key health indicators, such as under-replicated partitions (https://github.com/Yolean/kubernetes-kafka/pull/95/files#diff-f8da94a0c2daaa5e09e08330d1ed122a), with end-to-end testing like kafka-monitor may pay off better than internal metrics.

In actual troubleshooting scenarios you'll probably still want to connect some JMX tool (allowed since #96) to really dig into the state of things.

solsson · 2017-11-10T14:52:00Z

The UI works; the problem was that I used kubectl port-forward and only forwarded port 8000. Once the UI is loaded you can switch to forwarding 8778, and you'll get fancy graphs :)

Metrics also work:

curl localhost:8778/jolokia/read/kmf.services:type=produce-service,name=*/records-produced-rate | jq '.'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   268  100   268    0     0   2876      0 --:--:-- --:--:-- --:--:--  2913
{
  "request": {
    "mbean": "kmf.services:name=*,type=produce-service",
    "attribute": "records-produced-rate",
    "type": "read"
  },
  "value": {
    "kmf.services:name=single-cluster-monitor,type=produce-service": {
      "records-produced-rate": 54.47272973797498
    }
  },
  "timestamp": 1510324868,
  "status": 200
}
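
For scripted checks, the same response can be parsed without jq. The following is a minimal Python sketch, assuming the response shape shown above; the helper name `extract_attribute` is made up for illustration and is not part of Jolokia or kafka-monitor.

```python
import json

# Sample Jolokia read response, copied from the curl output above.
SAMPLE = """
{
  "request": {
    "mbean": "kmf.services:name=*,type=produce-service",
    "attribute": "records-produced-rate",
    "type": "read"
  },
  "value": {
    "kmf.services:name=single-cluster-monitor,type=produce-service": {
      "records-produced-rate": 54.47272973797498
    }
  },
  "timestamp": 1510324868,
  "status": 200
}
"""


def extract_attribute(response, attribute):
    """Return {mbean_name: value} for one attribute of a read response."""
    if response.get("status") != 200:
        raise ValueError(f"Jolokia returned status {response.get('status')}")
    return {
        mbean: attrs[attribute]
        for mbean, attrs in response["value"].items()
        if attribute in attrs
    }


rates = extract_attribute(json.loads(SAMPLE), "records-produced-rate")
for mbean, rate in rates.items():
    print(f"{mbean}: {rate:.2f} records/s")
```

Against a live pod, the JSON would come from the port-forwarded endpoint on 8778 instead of the embedded sample.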

solsson · 2017-11-10T14:56:31Z

Remaining issues:

- Need a way to throttle the load on test clusters like minikube. It's pretty significant by default.
- Metrics are logged at INFO level, lots and lots of them. With the GUI, curl and export to monitoring tools, that shouldn't be necessary.

solsson · 2017-11-10T15:15:19Z

Some info on Prometheus (non-)compatibility: jolokia/jolokia#206. Mentions https://github.com/fabric8io/agent-bond, and also the importance of a whitelist as we discovered in #49.

solsson · 2017-11-10T15:26:55Z

Rate can probably be reduced using produce.record.delay.ms, see https://github.com/linkedin/kafka-monitor/wiki/Service-Configuration#produce-service-configuration-parameters.
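
As a rough guide for picking a value, here is a back-of-the-envelope sketch in Python. It assumes each producer sends one record per `produce.record.delay.ms` interval, which is not confirmed in this thread; the function name is hypothetical.

```python
def approx_rate_per_producer(delay_ms):
    """Upper bound on records/s if one record is sent every delay_ms.

    Assumption: the delay is the interval between records of one producer.
    """
    if delay_ms <= 0:
        raise ValueError("delay_ms must be positive")
    return 1000.0 / delay_ms


for delay_ms in (100, 1000, 10000):
    rate = approx_rate_per_producer(delay_ms)
    print(f"{delay_ms} ms -> at most {rate:g} records/s per producer")
```

Under that assumption, a delay in the range of seconds would bring the load down to about one record per second or less per producer, which seems more suitable for minikube.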

Here is probably the logging statement. Should be possible to exclude using custom log4j config.

There's also a GraphiteMetricsReporterService so maybe it's trivial to produce a PrometheusMetricsReporterService.
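
To illustrate what such an export would have to produce, here is a hedged Python sketch that turns the `value` part of a Jolokia read response into Prometheus text-format lines, with a whitelist as discussed in #49. It is not a `PrometheusMetricsReporterService`; the function names, the whitelist contents and the `records-produced-total` number are made up for illustration.

```python
import re

# Hypothetical whitelist: only these attributes are exported.
WHITELIST = {"records-produced-rate"}


def parse_mbean(mbean):
    """Split 'domain:k1=v1,k2=v2' into (domain, {k1: v1, k2: v2})."""
    domain, _, props = mbean.partition(":")
    labels = dict(p.split("=", 1) for p in props.split(",") if p)
    return domain, labels


def to_prometheus(value, whitelist=WHITELIST):
    """Render whitelisted attributes as Prometheus text-format samples."""
    lines = []
    for mbean, attrs in sorted(value.items()):
        domain, labels = parse_mbean(mbean)
        # Label values are not escaped here; fine for simple names only.
        label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        for attr, number in sorted(attrs.items()):
            if attr not in whitelist:
                continue
            # Metric names may not contain '.' or '-', so replace them.
            name = re.sub(r"[^a-zA-Z0-9_]", "_", f"{domain}_{attr}")
            lines.append(f"{name}{{{label_text}}} {number}")
    return "\n".join(lines)


value = {
    "kmf.services:name=single-cluster-monitor,type=produce-service": {
        "records-produced-rate": 54.47272973797498,
        "records-produced-total": 1000.0,  # made-up number, shows filtering
    }
}
print(to_prometheus(value))
```

The whitelist keeps the output small: only `records-produced-rate` is emitted, and the non-whitelisted attribute is dropped.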

solsson added the monitoring label Nov 10, 2017

solsson mentioned this pull request Nov 10, 2017

CrashLoopBackOff caused by Exiting because log truncation is not allowed #98

Closed

solsson added 3 commits September 29, 2018 14:47

Adds linkedin/kafka-monitor, no need for a service

b58c59d

Instead you need some kind of metrics export. Currently I only get a lot of `records-produced-total` but no latencies etc.

Adds a service

08c33c1

Bumps kafka-monitor to use the same JRE as the new Kafka 2.0.0 image

e5b1acf

solsson force-pushed the linkedin-kafka-monitor branch from 5f86aa9 to e5b1acf September 29, 2018 12:49

Kafka Monitor on the JDK 11 image, see #197

67fba85

solsson modified the milestones: 5.0 - Java 11, 5.1 Nov 28, 2018

mazrc approved these changes Dec 20, 2019

solsson mentioned this pull request Jan 1, 2020

Zookeeper fails after updating to Kafka 2.4 #298

Closed
