[Alerting] [AlertManager] From Beginner to Mastery
2018-12-25
炼狱腾蛇Eric
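1. Introduction:
- Alertmanager is closely tied to Prometheus: it is one of the Prometheus components and handles the alerts that Prometheus sends, but it has to be installed separately
- This article installs it from an rpm package, version: 0.15.3-1.el7.centos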
2. Links:
2.1. References
- Official website: https://prometheus.io/
- Official documentation: https://prometheus.io/docs/alerting/overview/
- Downloads: https://prometheus.io/download/
- Source code: https://github.com/prometheus/alertmanager
3. Architecture diagram:
[Figure: Alertmanager architecture diagram]
4. Deployment:
4.1. rpm install: (choose either 4.1 or 4.2)
- Uses a package built by a community volunteer; add its repository in
/etc/yum.repos.d/prometheus.repo
[prometheus]
name=prometheus
baseurl=https://packagecloud.io/prometheus-rpm/release/el/7/$basearch
repo_gpgcheck=1
enabled=1
gpgkey=https://packagecloud.io/prometheus-rpm/release/gpgkey
       https://raw.githubusercontent.com/lest/prometheus-rpm/master/RPM-GPG-KEY-prometheus-rpm
gpgcheck=1
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
- Run
yum install -y alertmanager
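- Files installed by the package:
/etc/default/alertmanager # environment variables read by the systemd unit
/etc/prometheus/alertmanager.yml # Alertmanager's main configuration file
/usr/bin/alertmanager # the Alertmanager binary
/usr/bin/amtool # command-line tool for viewing alerts
/usr/lib/systemd/system/alertmanager.service # systemd unit file
/var/lib/prometheus # data directory
- Start the service and enable it at boot:
systemctl start alertmanager && systemctl enable alertmanager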
4.2. Binary tarball install: (choose either 4.1 or 4.2)
- Download the binary tarball
wget https://github.com/prometheus/alertmanager/releases/download/v0.15.3/alertmanager-0.15.3.linux-amd64.tar.gz
- Extract it
tar xf alertmanager-0.15.3.linux-amd64.tar.gz -C /opt
~]# ll /opt/alertmanager-0.15.3.linux-amd64/
total 31200
-rwxr-xr-x 1 3434 3434 19998160 Nov 9 16:41 alertmanager # the Alertmanager binary
-rw-r--r-- 1 3434 3434 380 Nov 9 17:00 alertmanager.yml # Alertmanager's main configuration file
-rwxr-xr-x 1 3434 3434 11923635 Nov 9 16:41 amtool # command-line tool for viewing alerts
-rw-r--r-- 1 3434 3434 11357 Nov 9 17:00 LICENSE
-rw-r--r-- 1 3434 3434 457 Nov 9 17:00 NOTICE
- Start it by running the binary from inside the extracted directory:
./alertmanager
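Whichever install method is used, it helps to confirm that the process is actually accepting connections before going on to configuration. The following is a minimal sketch, not part of Alertmanager itself: the helper names `is_port_open` and `wait_for_port` are made up for this illustration, and it only checks that a TCP port answers (9093 is the web port used throughout this article).

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 10, delay: float = 0.5) -> bool:
    """Poll host:port until it accepts connections or the attempts run out."""
    for _ in range(attempts):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_for_port("127.0.0.1", 9093)` can be called right after starting the service. A successful TCP connection only shows that something is listening on the port, not that the configuration is valid.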
5. Configuration files
5.1. /usr/lib/systemd/system/alertmanager.service
# -*- mode: conf -*-
[Unit]
Description=Prometheus Alertmanager.
Documentation=https://github.com/prometheus/alertmanager
After=network.target
[Service]
EnvironmentFile=-/etc/default/alertmanager
User=prometheus
ExecStart=/usr/bin/alertmanager \
--config.file=/etc/prometheus/alertmanager.yml \
--storage.path=/var/lib/prometheus/alertmanager \
$ALERTMANAGER_OPTS
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
[Install]
WantedBy=multi-user.target
5.2. /etc/default/alertmanager
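# --web.external-url: the address at which this Alertmanager is reachable from outside.
#   10.41.91.91 is this host's address; remember to change it in the configuration of the other servers.
# --cluster.listen-address: the address this host listens on for cluster traffic.
# --cluster.peer: the addresses of the cluster members this host connects to (one flag per peer).
ALERTMANAGER_OPTS='\
--web.external-url=http://10.41.91.91:9093 \
--cluster.listen-address=10.41.91.91:9094 \
--cluster.peer=10.41.91.91:9094 \
--cluster.peer=10.210.149.26:9094 \
--cluster.peer=10.210.149.27:9094'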
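Because the external URL and the listen address differ on every cluster member while the peer list stays the same, the per-host file is easy to get wrong when copied by hand. As an illustration only, the sketch below generates the assignment for one member. The function name `build_alertmanager_opts` is hypothetical, and it writes the value on a single line rather than with backslash continuations; the flags themselves are the ones shown above.

```python
def build_alertmanager_opts(local_ip, cluster_ips, web_port=9093, cluster_port=9094):
    """Build the ALERTMANAGER_OPTS assignment for one cluster member."""
    if local_ip not in cluster_ips:
        raise ValueError(f"{local_ip} is not in the cluster member list")
    flags = [
        # These two flags are host-specific.
        f"--web.external-url=http://{local_ip}:{web_port}",
        f"--cluster.listen-address={local_ip}:{cluster_port}",
    ]
    # Every member, this host included, is listed as a peer, as in the file above.
    flags += [f"--cluster.peer={ip}:{cluster_port}" for ip in cluster_ips]
    return "ALERTMANAGER_OPTS='" + " ".join(flags) + "'"
```

Calling `build_alertmanager_opts("10.210.149.26", ["10.41.91.91", "10.210.149.26", "10.210.149.27"])` returns the line for the second server; the result would be written to /etc/default/alertmanager on that host.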
5.3. /etc/prometheus/alertmanager.yml
global: # global settings
  resolve_timeout: 5m # time after which an alert is declared resolved if it has not been updated
route: # routing rules for dispatching alerts
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers: # receivers; these can be email, WeChat, a webhook endpoint, and so on
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
inhibit_rules: # inhibition rules
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
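The 'web.hook' receiver above posts notifications to http://127.0.0.1:5001/, so something has to be listening there. The following is a minimal sketch of such a listener using only the Python standard library. It assumes the JSON body carries an "alerts" list whose items have "status", "labels" and "annotations" fields, as in Alertmanager's documented webhook format, and that alerts carry a "summary" annotation; the names `summarize_notification`, `WebhookHandler` and `serve` are made up for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_notification(payload: dict) -> list:
    """Turn one webhook payload into one text line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        line = "[{status}] {name} severity={severity} {summary}".format(
            status=alert.get("status", "unknown"),
            name=labels.get("alertname", "<no alertname>"),
            severity=labels.get("severity", "-"),
            summary=annotations.get("summary", ""),  # assumed annotation name
        )
        lines.append(line.rstrip())
    return lines


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        for line in summarize_notification(payload):
            print(line)
        self.send_response(200)
        self.end_headers()


def serve(host="127.0.0.1", port=5001):
    """Listen on the address configured for 'web.hook'; blocks until interrupted."""
    HTTPServer((host, port), WebhookHandler).serve_forever()
```

Running `serve()` in a terminal and then triggering an alert should print one line per alert in each notification. A real receiver would forward the alerts somewhere instead of printing them.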
6. Management tools
6.1. amtool
amtool alert --alertmanager.url=http://localhost:9093
Alertname Starts At Summary
RootfsUsage 2019-01-10 07:13:32 CET Not enough space for root fs on 10.210.54.227:9100
RootfsUsage 2019-01-11 14:36:17 CET Not enough space for root fs on 10.210.54.226:9100
MemoryUsage 2019-01-17 00:44:17 CET Memory of instance 150.132.195.26:9100 is not enough
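The same list can also be fetched over HTTP when amtool is not at hand. The sketch below is an assumption-laden illustration: it assumes the v1 HTTP API of this Alertmanager version, with `GET /api/v1/alerts` returning a JSON object whose "data" key holds the alerts, and that each alert has a "summary" annotation. Newer releases expose a different API path, so check the version in use. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_alerts(base_url="http://localhost:9093", timeout=5.0):
    """Fetch current alerts; assumes the v1 API and its {"status", "data"} envelope."""
    with urlopen(base_url.rstrip("/") + "/api/v1/alerts", timeout=timeout) as resp:
        body = json.load(resp)
    if body.get("status") != "success":
        raise RuntimeError(f"unexpected API response: {body}")
    return body.get("data", [])


def format_alerts(alerts):
    """Render alerts as a three-column text table similar to amtool's output."""
    rows = [("Alertname", "Starts At", "Summary")]
    for alert in alerts:
        rows.append((
            alert.get("labels", {}).get("alertname", ""),
            alert.get("startsAt", ""),
            alert.get("annotations", {}).get("summary", ""),
        ))
    # Pad the first two columns to the widest entry; the last column is free-form.
    widths = [max(len(row[i]) for row in rows) for i in range(2)]
    return "\n".join(
        f"{row[0]:<{widths[0]}}  {row[1]:<{widths[1]}}  {row[2]}".rstrip()
        for row in rows
    )
```

With Alertmanager running locally, `print(format_alerts(fetch_alerts()))` prints the table; `format_alerts` can also be used on its own with any list of alert dictionaries.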
6.2. web UI
http://<your-server-IP>:9093
[Figure: screenshot of the Alertmanager web UI]