Monitoring component: kafka-exporter
GitHub repository: imduffy15/kafka_exporter (Kafka exporter for Prometheus)
Start the exporter:
docker run -d \
  --restart=always \
  --name kafka_exporter \
  -v /etc/localtime:/etc/localtime \
  -p 9308:9308 \
  danielqsj/kafka-exporter:v1.2.0 \
  --kafka.server=172.30.0.11:9092
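Once the container is up, it can help to confirm that the exporter is actually serving metrics before wiring it into Prometheus. The sketch below is one way to do that, assuming the exporter is reachable on localhost:9308, that curl and awk are installed, and that samples carry no trailing timestamp. metric_value is a hypothetical helper name, not part of kafka_exporter.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print the value of every sample whose metric name is exactly $1.
# Assumes samples have no trailing timestamp (the last field is the value).
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                      # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)          # drop the label set, if any
      if (metric == name) print $NF    # last field is the sample value
    }'
}

# Usage (assumes the exporter listens on localhost:9308):
#   curl -s http://localhost:9308/metrics | metric_value kafka_brokers
```

If the command prints nothing, the exporter is either not reachable or has not yet collected that metric.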
The --kafka.server flag can be repeated to point the exporter at several Kafka brokers:
  --kafka.server=172.30.0.11:9092 \
  --kafka.server=172.30.0.12:9092 \
  --kafka.server=172.30.0.13:9092
This walkthrough monitors a single node.
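When the broker list is longer, repeating the flag by hand gets error-prone. As a rough sketch, the repeated flags can be generated from a comma-separated list. kafka_server_flags is a hypothetical helper introduced here for illustration; only the --kafka.server flag itself comes from the text above.

```shell
# Hypothetical helper: turn a comma-separated broker list into repeated
# --kafka.server flags, printed on one line. The body runs in a subshell
# (parentheses) so the IFS and globbing changes do not leak to the caller.
kafka_server_flags() (
  IFS=','
  set -f                               # no filename expansion on the list
  flags=""
  for broker in $1; do                 # unquoted on purpose: split on commas
    flags="${flags:+$flags }--kafka.server=${broker}"
  done
  printf '%s\n' "$flags"
)

# Usage with the docker run command shown earlier:
#   docker run -d --name kafka_exporter -p 9308:9308 \
#     danielqsj/kafka-exporter:v1.2.0 \
#     $(kafka_server_flags "172.30.0.11:9092,172.30.0.12:9092,172.30.0.13:9092")
```

The command substitution is left unquoted at the call site so that each flag becomes its own argument.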
Integrating kafka_exporter with Prometheus
vim prometheus.yml
  # Kafka monitoring (add under scrape_configs)
  - job_name: 'kafka-172.30.0.11'
    scrape_interval: 10s
    static_configs:
      - targets: ['192.168.0.39:9308']
        labels:
          kafka_ip: 'kafka-172.30.0.11'
Restart the Prometheus container for the change to take effect.
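Before restarting, the edited file can be validated so that a YAML mistake does not keep Prometheus from coming back up. A minimal sketch, assuming promtool is on the PATH and the container is named prometheus (both are assumptions; the original does not name the container). reload_prometheus and PROM_CONTAINER are hypothetical names.

```shell
# Validate the config, then restart the container only if validation passes.
# The default config path and container name are assumptions; adjust them.
reload_prometheus() {
  config="${1:-prometheus.yml}"
  container="${PROM_CONTAINER:-prometheus}"
  if promtool check config "$config"; then
    docker restart "$container"
  else
    echo "config check failed; not restarting $container" >&2
    return 1
  fi
}

# Usage:
#   reload_prometheus prometheus.yml
```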
Grafana dashboard ID: 7589
https://grafana.com/grafana/dashboards/7589
Alert rules:
# cat rules/kafka-export-alert-rules.yaml
groups:
  - name: kafka consumer lag alerts
    rules:
      - alert: kafka consumer lag
        expr: sum(kafka_consumergroup_lag{topic!="sop_free_study_fix-student_wechat_detail"}) by (consumergroup, topic) > 1000
        for: 3m
        labels:
          severity: warning
          status: severe
        annotations:
          summary: "Kafka consumer lag"
          description: "{{ $labels.consumergroup }}##{{ $labels.topic }}: consumer lag has stayed above 1000 for 3 minutes (current: {{ $value }})"
      - alert: kafka-exporter down
        # up is 0 when Prometheus cannot scrape the exporter; the job names
        # are assumed to start with "kafka-" as in the scrape config.
        expr: up{job=~"kafka-.*"} == 0
        for: 3m
        labels:
          severity: warning
          status: severe
        annotations:
          summary: "kafka-exporter down"
          description: "kafka-exporter down {{ $labels.instance }}"
      - alert: kafka server down
        expr: kafka_brokers < 1
        for: 3m
        labels:
          severity: warning
          status: severe
        annotations:
          summary: "kafka server down"
          description: "kafka server down {{ $labels.job }}"
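To judge whether the 1000-message threshold suits a given cluster, the lag expression can be run by hand as an instant query against the Prometheus HTTP API before the rule is enabled. The sketch below assumes Prometheus listens on localhost:9090 and drops the topic exclusion for brevity. lag_query and PROM_URL are hypothetical names.

```shell
# Run the lag expression as an instant query via /api/v1/query.
# PROM_URL is an assumption and defaults to http://localhost:9090.
lag_query() {
  threshold="${1:-1000}"
  expr="sum(kafka_consumergroup_lag) by (consumergroup, topic) > ${threshold}"
  # -G sends the URL-encoded data as a GET query string
  curl -sG "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=${expr}"
}

# Usage:
#   lag_query 1000
```

An empty result list in the JSON response means no consumer group currently exceeds the threshold. The rule file itself can be syntax-checked with promtool check rules rules/kafka-export-alert-rules.yaml.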
For multi-node monitoring, see this article:
prometheus监控kafka (Monitoring Kafka with Prometheus), 蝎's blog on CSDN
Feel free to share; when reposting, please credit the source: 内存溢出