一、Prometheus监控MySQL示意图
1、环境信息
192.168.201.165 Prometheus Grafana
192.168.201.161 MySQL mysqld_exporter node_exporter
192.168.201.162 MySQL mysqld_exporter node_exporter
192.168.201.163 MySQL mysqld_exporter node_exporter
2、查看数据库进程
ps -ef | grep mysql
二、部署mysqld_exporter
在161、162、163节点部署mysqld_exporter
1、获取mysqld_exporter的下载地址
https://prometheus.io/download/#mysqld_exporter
2、进行下载mysqld_exporter
cd /usr/src/
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.17.2/mysqld_exporter-0.17.2.linux-amd64.tar.gz
ll
如果下载慢的话,可到网盘进行下载:
链接: https://pan.baidu.com/s/1EiVi7hr2OHDuJVESAu93sw?pwd=r9md
提取码: r9md
3、解压mysqld_exporter安装包
cd /usr/src/
tar zxvf mysqld_exporter-0.17.2.linux-amd64.tar.gz -C /usr/local/
mv /usr/local/mysqld_exporter-0.17.2.linux-amd64 /usr/local/mysqld_exporter
4、新增mysqld_exporter启动脚本
vim /etc/systemd/system/mysqld_exporter.service
添加如下内容:
[Unit]
Description=Prometheus
After=network-online.target
[Service]
User=prometheus
Restart=on-failure
ExecStart=/usr/local/mysqld_exporter/mysqld_exporter --config.my-cnf=/usr/local/mysqld_exporter/mysqld_exporter.cnf --collect.perf_schema.replication_group_members --collect.perf_schema.replication_group_member_stats
[Install]
WantedBy=multi-user.target
5、创建监控用户
这个用户是用来取mysql监控的数据的
(1)查看MGR集群的主节点
mysqlsh -umgr_user -p'admin' -h192.168.201.161
这里的Primary节点是node01节点也就是161节点
var cluster = dba.getCluster('Cluster01')
cluster.status();
(2)创建监控用户
在161节点创建用户
create user 'exporter'@'127.0.0.1' identified with mysql_native_password BY 'admin';
GRANT SELECT, PROCESS, REPLICATION CLIENT, RELOAD ON *.* TO 'exporter'@'127.0.0.1';
6、配置mysqld_exporter的配置文件
vim /usr/local/mysqld_exporter/mysqld_exporter.cnf
配置MySQL监控用户信息
[client]
user=exporter
password=admin
host = 127.0.0.1
7、新建prometheus用户
这里的prometheus用户应该在前面新建好了,如果没有的话就建下。
useradd prometheus
给文件夹添加属组
chown -R prometheus.prometheus /usr/local/mysqld_exporter/
8、重新加载系统服务的配置文件
systemctl daemon-reload
9、启动mysqld_exporter
(1)重新加载服务
systemctl start mysqld_exporter
(2)启动mysqld_exporter
systemctl start mysqld_exporter
systemctl status mysqld_exporter
(3)页面查看
http://192.168.201.161:9104/metrics
http://192.168.201.162:9104/metrics
http://192.168.201.163:9104/metrics
三、Prometheus添加MySQL监控的配置
1、修改prometheus配置
Prometheus部署在165上,在165进行新增mysql机器的配置文件地址。
vim /usr/local/prometheus/prometheus.yml
新增如下内容:
static_configs:
- targets: ["localhost:9090"]
- job_name: 'MySQL'
file_sd_configs:
- files: ["/usr/local/prometheus/conf.d/mysql.json"]
refresh_interval: 15s
2、增加MySQL机器配置文件
vim /usr/local/prometheus/conf.d/mysql.json
添加如下内容:
[
{
"targets": [
"192.168.201.161:9104"
],
"labels": {
"servicename": "MySQL",
"env": "Prod",
"hostname": "node01"
}
},
{
"targets": [
"192.168.201.162:9104"
],
"labels": {
"servicename": "MySQL",
"env": "Prod",
"hostname": "node02"
}
},
{
"targets": [
"192.168.201.163:9104"
],
"labels": {
"servicename": "MySQL",
"env": "Prod",
"hostname": "node03"
}
}
]
3、检查prometheus配置文件
/usr/local/prometheus/promtool check config /usr/local/prometheus/prometheus.yml
4、重启Prometheus
systemctl restart prometheus
systemctl status prometheus
5、登录Prometheus查看MySQL的监控
(1)页面查看
(2)搜索MySQL相关参数
例如:mysql_global_variables_innodb_buffer_pool_size
mysql_global_variables_innodb_buffer_pool_size/1024/1024/1024
四、在Grafana配置MySQL监控图
1、导入模板
http://192.168.201.165:3000/
输入模板ID,MySQL的模板我常用的是7362
也可以自己选择
点击Load
稍等一会
选择数据源,点击Import
2、查看监控数据
五、配置MySQL异常项的告警
1、配置rule文件
在165节点,也就是Prometheus的机器上进行如下操作
vim /usr/local/prometheus/rules.d/mysql_rules.yml
内容如下:
groups:
- name: mysql_exporter_alerts
rules:
- alert: MysqlDown
expr: mysql_up == 0
for: 0m
labels:
severity: critical
annotations:
summary: MySQL down (instance {{ $labels.instance }})
description: "MySQL instance is down on {{ $labels.instance }}
VALUE = {{ $value }}
LABELS = {{ $labels }}"
- alert: MysqlTooManyConnections(>80%)
expr: max_over_time(mysql_global_status_threads_connected[1m]) / mysql_global_variables_max_connections * 100 > 80
for: 2m
labels:
severity: warning
annotations:
summary: MySQL too many connections (> 80%) (instance {{ $labels.instance }})
description: "More than 80% of MySQL connections are in use on {{ $labels.instance }}
VALUE = {{ $value }}
LABELS = {{ $labels }}"
- alert: MysqlHighThreadsRunning
expr: max_over_time(mysql_global_status_threads_running[1m]) / mysql_global_variables_max_connections * 100 > 60
for: 2m
labels:
severity: warning
annotations:
summary: MySQL high threads running (instance {{ $labels.instance }})
description: "More than 60% of MySQL connections are in running state on {{ $labels.instance }}
VALUE = {{ $value }}
LABELS = {{ $labels }}"
- alert: MysqlMGRMemberOffline
expr: mysql_perf_schema_replication_group_member_info{member_state!="ONLINE"} >0
for: 0m
labels:
severity: critical
annotations:
summary: MySQL MGR Status is Offline (instance {{ $labels.instance }})
description: "MGR is Offline on {{ $labels.instance }}
VALUE = {{ $value }}
LABELS = {{ $labels }}"
- alert: MysqlSlowQueries
expr: increase(mysql_global_status_slow_queries[1m]) > 0
for: 2m
labels:
severity: warning
annotations:
summary: MySQL slow queries (instance {{ $labels.instance }})
description: "MySQL server mysql has some new slow query.
VALUE = {{ $value }}
LABELS = {{ $labels }}"
- alert: MysqlInnodbLogWaits
expr: rate(mysql_global_status_innodb_log_waits[15m]) > 10
for: 0m
labels:
severity: warning
annotations:
summary: MySQL InnoDB log waits (instance {{ $labels.instance }})
description: "MySQL innodb log writes stalling
VALUE = {{ $value }}
LABELS = {{ $labels }}"
- alert: MysqlRestarted
expr: mysql_global_status_uptime < 60
for: 0m
labels:
severity: info
annotations:
summary: MySQL restarted (instance {{ $labels.instance }})
description: "MySQL has just been restarted, less than one minute ago on {{ $labels.instance }}.
VALUE = {{ $value }}
LABELS = {{ $labels }}"
2、校验Prometheus配置文件并重启
/usr/local/prometheus/promtool check config /usr/local/prometheus/prometheus.yml
重启Prometheus
systemctl restart prometheus
systemctl status prometheus
3、测试告警
当前mysql的节点是161、162、163,其中161是主节点,来关闭163节点的数据库。
(1)关闭163节点数据库
/etc/init.d/mysql.server stop