Kafka Monitoring – kafka_exporter

Build && Run

GitHub: https://github.com/danielqsj/kafka_exporter

You can clone the source with git and build it yourself:

git clone https://github.com/danielqsj/kafka_exporter
cd kafka_exporter
go build .

Or download a release (the latest version at the time of writing is v1.2.0):

wget https://github.com/danielqsj/kafka_exporter/releases/download/v1.2.0/kafka_exporter-1.2.0.linux-amd64.tar.gz

tar xvf kafka_exporter-1.2.0.linux-amd64.tar.gz

Preparation

Test cluster information:

# Kafka cluster
10.0.42.61:9092
10.0.42.62:9092
10.0.42.63:9092

# prometheus
10.0.42.61:9090

# kafka_exporter
10.0.42.61:9308

Run the exporter, specifying all of the Kafka servers:

./kafka_exporter --kafka.server=10.0.42.61:9092 --kafka.server=10.0.42.62:9092 --kafka.server=10.0.42.63:9092 --web.listen-address=:9308

The metrics are then available at http://10.0.42.61:9308/metrics
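As a quick sanity check of the endpoint above, the following sketch fetches the metrics page and pulls out the broker count. It assumes the exporter reports the broker count as an unlabeled `kafka_brokers` sample (the same metric the alert rule later in this post relies on); the helper names `parse_gauge` and `fetch_broker_count` are made up for illustration and are not part of kafka_exporter.

```python
import urllib.request
from typing import Optional


def parse_gauge(exposition_text: str, name: str) -> Optional[float]:
    """Return the value of the first unlabeled sample called `name`, or None."""
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


def fetch_broker_count(url: str = "http://10.0.42.61:9308/metrics") -> Optional[float]:
    """Fetch the exporter's metrics page and return kafka_brokers (needs network access)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_gauge(resp.read().decode("utf-8"), "kafka_brokers")
```

With all three test brokers reachable, `fetch_broker_count()` should report 3; a `None` result would suggest the metric name assumption does not hold for your exporter version.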

Configuring Prometheus

Add the corresponding job to the Prometheus configuration:

- job_name: 'kafka_exporter'
  static_configs:
    - targets:
        - 10.0.42.61:9308
      labels:
        region: oss-test
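Once Prometheus has reloaded its configuration, one way to confirm that the new job is being scraped is to query the built-in `up` metric through the Prometheus HTTP API. This is a minimal sketch, assuming Prometheus listens on 10.0.42.61:9090 as in the test setup; `up_by_instance` and `query_up` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict


def up_by_instance(payload: dict) -> Dict[str, bool]:
    """Map each instance in an instant-query response for `up` to True/False."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    result = {}
    for sample in payload["data"]["result"]:
        instance = sample["metric"].get("instance", "<unknown>")
        # An instant-vector sample is [timestamp, "value-as-string"].
        result[instance] = sample["value"][1] == "1"
    return result


def query_up(prometheus: str = "http://10.0.42.61:9090",
             job: str = "kafka_exporter") -> Dict[str, bool]:
    """Ask Prometheus whether the targets of `job` are up (needs network access)."""
    query = urllib.parse.urlencode({"query": f'up{{job="{job}"}}'})
    with urllib.request.urlopen(f"{prometheus}/api/v1/query?{query}", timeout=5) as resp:
        return up_by_instance(json.load(resp))
```

An empty result from `query_up()` would mean Prometheus has no target for the job yet, which usually points to the configuration not having been reloaded.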

Grafana

Grafana -> Create -> Import

Enter dashboard ID 7589.

After importing, select your Prometheus data source.

Alert

groups:
  - name: alert.rules
    rules:
      - alert: kafka_brokers
        # Fires when fewer than the 3 expected brokers are reported for 1 minute.
        expr: kafka_brokers < 3
        for: 1m
        labels:
          # Template values must be quoted to be valid YAML.
          region: '{{ $labels.region }}'
          env: test
          level: emergency
        annotations:
          description: 'cluster: {{ $labels.cluster }}, instance: {{ $labels.instance }}, values: {{ $value }}'
          value: '{{ $value }}'
          summary: One or more kafka brokers are down
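To see whether the rule above is currently firing, one option is to poll the `/api/v1/alerts` endpoint of Prometheus. The sketch below assumes Prometheus at 10.0.42.61:9090 from the test setup and that the rule file has already been loaded; `firing_alerts` and `fetch_firing` are illustrative names, not part of any tool used in this post.

```python
import json
import urllib.request
from typing import List


def firing_alerts(payload: dict, alert_name: str) -> List[dict]:
    """Return the alerts named `alert_name` whose state is "firing"."""
    if payload.get("status") != "success":
        raise ValueError(f"request failed: {payload}")
    return [
        alert
        for alert in payload["data"]["alerts"]
        if alert["labels"].get("alertname") == alert_name
        and alert.get("state") == "firing"
    ]


def fetch_firing(prometheus: str = "http://10.0.42.61:9090",
                 alert_name: str = "kafka_brokers") -> List[dict]:
    """Fetch active alerts from Prometheus and filter them (needs network access)."""
    with urllib.request.urlopen(f"{prometheus}/api/v1/alerts", timeout=5) as resp:
        return firing_alerts(json.load(resp), alert_name)
```

Because the rule uses `for: 1m`, an alert stays in the `pending` state for the first minute after the broker count drops, so it is filtered out here until it transitions to `firing`.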