Setting Up and Using Prometheus + Grafana
Downloading Prometheus
Download from the official site:
https://prometheus.io/download/
The version used here is:
prometheus-2.53.3.linux-amd64.tar.gz
Installing Prometheus
Upload the file to /home/software/ and extract it:
cd /home/software/
tar -zxvf prometheus-2.53.3.linux-amd64.tar.gz
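Starting Prometheus
Create a user and grant permissions
# Add the user
useradd prometheus
# Set a password
passwd prometheus
# Give the prometheus user ownership of the installation directory
chown -R prometheus /home/software/prometheus-2.53.3.linux-amd64
Switch user
su prometheus
Start the service
cd prometheus-2.53.3.linux-amd64
./prometheus --config.file=prometheus.yml
Prometheus should now be reachable on port 9090.
Changing the default port:
1. Change it in the start command
./prometheus --config.file=prometheus.yml --web.listen-address=":8080"
Run in the background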
nohup ./prometheus --config.file=prometheus.yml > prometheus.log 2>&1 &
Walkthrough of the default configuration file
# my global config
global:
  # Scrape targets every 15 seconds. The default is every 1 minute.
  scrape_interval: 15s
  # Evaluate rules every 15 seconds. The default is every 1 minute.
  evaluation_interval: 15s
  # scrape_timeout is the maximum time Prometheus waits for a target to respond.
  # It is not set explicitly here, so the global default (10s) applies.
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
# rule_files lists the paths of the alerting-rule and recording-rule files that Prometheus loads.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself, scraped at localhost:9090/metrics by default.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
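One constraint is worth keeping in mind when tuning the global block: Prometheus rejects a configuration whose scrape_timeout is greater than its scrape_interval. The sketch below illustrates that check in Python. It is not how Prometheus parses its configuration; it assumes only simple single-unit durations such as 15s or 1m, and parse_duration and check_scrape_timing are hypothetical helper names introduced for illustration.

```python
import re

# Seconds per unit for the simple single-unit durations used in this article.
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> float:
    """Convert a single-unit duration such as '15s' or '1m' to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def check_scrape_timing(scrape_interval: str = "1m", scrape_timeout: str = "10s") -> bool:
    """Return True when the timeout does not exceed the interval.

    The defaults mirror the ones named in the config comments (1m and 10s).
    """
    return parse_duration(scrape_timeout) <= parse_duration(scrape_interval)


if __name__ == "__main__":
    # 15s interval with the default 10s timeout, as in the file above.
    print(check_scrape_timing("15s"))
    # A 5s interval would conflict with the default 10s timeout.
    print(check_scrape_timing("5s"))
```

If the interval is lowered below 10s, scrape_timeout has to be set explicitly to a value no larger than the new interval.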
Starting automatically at boot
Register Prometheus as a systemd service on CentOS (as root)
vi /etc/systemd/system/prometheus.service
File contents:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
# Change web.listen-address to the port you want
ExecStart=/home/software/prometheus-2.53.3.linux-amd64/prometheus \
--config.file=/home/software/prometheus-2.53.3.linux-amd64/prometheus.yml \
--storage.tsdb.path=/home/software/prometheus-2.53.3.linux-amd64/data \
--web.console.templates=/home/software/prometheus-2.53.3.linux-amd64/consoles \
--web.console.libraries=/home/software/prometheus-2.53.3.linux-amd64/console_libraries \
--web.listen-address=":9090"
[Install]
WantedBy=multi-user.target
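Load the Prometheus service (as root)
systemctl daemon-reload
# Enable start at boot
systemctl enable prometheus
Restart the Prometheus service (as root)
systemctl restart prometheus
Allow the prometheus user to manage the service through sudo (as root)
sudo visudo
Append the following at the end of the file: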
prometheus ALL=(ALL) NOPASSWD: /bin/systemctl restart prometheus, /bin/systemctl status prometheus, /bin/systemctl start prometheus, /bin/systemctl stop prometheus
After the prometheus user logs in again over SSH, it can manage the service with the following commands:
# Restart
sudo /bin/systemctl restart prometheus
# Status
sudo /bin/systemctl status prometheus
# Start
sudo /bin/systemctl start prometheus
# Stop
sudo /bin/systemctl stop prometheus
Monitoring a CentOS server
Download
node_exporter-1.8.2.linux-amd64.tar.gz
https://prometheus.io/download/#node_exporter
Upload the file to /home/software/ and extract it:
cd /home/software/
tar -zxvf node_exporter-1.8.2.linux-amd64.tar.gz
Starting node_exporter
Starting automatically at boot
Register node_exporter as a systemd service on CentOS (as root)
vi /etc/systemd/system/node_exporter.service
File contents:
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/home/software/node_exporter-1.8.2.linux-amd64/node_exporter
[Install]
WantedBy=multi-user.target
Reload systemd, start the service, enable it at boot, and check its status
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
sudo systemctl status node_exporter
Verify that it works:
curl http://localhost:9100/metrics
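The curl command returns plain text in the Prometheus exposition format: comment lines starting with # (HELP and TYPE) followed by sample lines of the form name{labels} value. As a rough sketch of how that output can be read programmatically, the Python below parses a small hand-written sample. The metric values are made up, fetch_metrics and parse_metrics are hypothetical helper names, and the parser assumes sample lines carry no trailing timestamp.

```python
import urllib.request

# Hand-written sample in the exposition format; the values are illustrative only.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""


def fetch_metrics(url: str = "http://localhost:9100/metrics") -> str:
    """Download the raw metrics text (needs a running exporter; not called below)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def parse_metrics(text: str) -> dict:
    """Map each sample (metric name plus label set) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumes no timestamp, so the value is the last space-separated field.
        key, _, value = line.rpartition(" ")
        samples[key.strip()] = float(value)
    return samples


if __name__ == "__main__":
    for name, value in parse_metrics(SAMPLE).items():
        print(name, value)
```

Against a live exporter, parse_metrics(fetch_metrics()) would return the same kind of mapping for every metric node_exporter exposes.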
Integration with Prometheus
1. Edit prometheus.yml
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  # Local node monitoring
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
2. Restart the Prometheus service
systemctl restart prometheus
3. Check Node Exporter in Prometheus
Open the Prometheus web UI (by default at http://<your-server-ip>:9090) and go to "Status" > "Targets", which lists every configured scrape target and its state. Find the Node Exporter entry and confirm that it is healthy.
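The information shown on the Targets page is also served by the Prometheus HTTP API at /api/v1/targets, which is convenient for scripted checks. The sketch below summarizes such a response in Python. The payload is a hand-written, heavily trimmed sample (a real response carries many more fields), and summarize_targets and unhealthy_targets are hypothetical helper names.

```python
import json

# Hand-written sample shaped like a /api/v1/targets response; values are illustrative.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"labels": {"job": "prometheus", "instance": "localhost:9090"}, "health": "up"},
      {"labels": {"job": "node", "instance": "localhost:9100"}, "health": "down"}
    ]
  }
}
""")


def summarize_targets(payload: dict) -> dict:
    """Group active targets by job and record each instance's health."""
    summary = {}
    for target in payload["data"]["activeTargets"]:
        labels = target["labels"]
        summary.setdefault(labels["job"], {})[labels["instance"]] = target["health"]
    return summary


def unhealthy_targets(payload: dict) -> list:
    """Return sorted (job, instance) pairs whose health is not 'up'."""
    return sorted(
        (job, instance)
        for job, instances in summarize_targets(payload).items()
        for instance, health in instances.items()
        if health != "up"
    )


if __name__ == "__main__":
    print(summarize_targets(SAMPLE_RESPONSE))
    print(unhealthy_targets(SAMPLE_RESPONSE))
```

An empty list from unhealthy_targets corresponds to every entry on the Targets page showing as UP.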
Monitoring a Spring Boot application
Add the dependencies
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
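Configuration file
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: "*"
  endpoint:
    prometheus:
      enabled: true
Security
Enabling the Actuator endpoints on a Spring Boot service exposes them on the application's port. With include: "*" this covers sensitive endpoints such as env and heapdump, which creates serious security risks.
Solutions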
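Method 1: Spring Security configuration
If you already use Spring Security, you can configure it to restrict access to specific endpoints. The example below exposes only the prometheus endpoint, allows /actuator/prometheus to be requested only from localhost, and requires authentication for every other request.
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: "prometheus"
  endpoint:
    prometheus:
      enabled: true
server:
  port: 8080
spring:
  security:
    user:
      name: prometheus
      password: ${PROMETHEUS_PASSWORD}
Then add the following to your security configuration class: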
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;
// Targets Spring Security 5.x; WebSecurityConfigurerAdapter was removed in Spring Security 6.
@Configuration
@EnableWebSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http
            .authorizeRequests()
                // Only requests from localhost may reach the Prometheus endpoint
                .antMatchers("/actuator/prometheus").hasIpAddress("127.0.0.1")
                // Every other request requires authentication
                .anyRequest().authenticated()
                .and()
            .httpBasic(); // Enable HTTP basic authentication
    }
}
This approach is not limited to IP address filtering; it can be combined with other authentication methods such as API keys or OAuth2 to harden the service further.
Method 2: NGINX or another reverse proxy
server {
    listen 80; # Listen for HTTP requests on the default port 80
    server_name yourdomain.com; # Replace with your domain name or IP address
    location / {
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Optional: longer timeouts for slow clients or large responses
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
    # Added: restrict access to /actuator
    location ^~ /actuator/ {
        allow 127.0.0.1; # Allow requests from localhost
        deny all; # Deny requests from every other source
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
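The allow and deny directives in the /actuator/ location are evaluated in order, and the first rule that matches the client address decides the outcome; if no rule matches, the request is allowed. The Python sketch below models only that evaluation order for the two rules above. It is a simplified illustration, not NGINX's implementation, and RULES and is_allowed are hypothetical names.

```python
import ipaddress

# The two access rules from the /actuator/ location, in the order they appear.
RULES = [("allow", "127.0.0.1"), ("deny", "all")]


def is_allowed(client_ip: str, rules=RULES) -> bool:
    """Apply allow/deny rules in order; the first matching rule wins."""
    address = ipaddress.ip_address(client_ip)
    for action, spec in rules:
        # 'all' matches every client; otherwise compare against the address or CIDR block.
        if spec == "all" or address in ipaddress.ip_network(spec, strict=False):
            return action == "allow"
    # No rule matched: access is permitted by default.
    return True


if __name__ == "__main__":
    print(is_allowed("127.0.0.1"))     # matched by the allow rule
    print(is_allowed("192.168.1.10"))  # falls through to deny all
```

Because the order matters, swapping the two rules would deny every client, localhost included.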
Other approaches
Firewall rules, the Connector configuration of the embedded Tomcat, and so on.
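Spring Boot Actuator endpoints
Spring Boot Actuator provides a number of built-in endpoints, each with a specific purpose. Some commonly used ones:
- health: reports application health-check information.
- info: shows custom application information such as the version number.
- metrics: provides detailed metrics, including memory usage and thread counts.
- prometheus: exposes metrics in the Prometheus format for Prometheus to scrape.
- env: shows environment variables and configuration properties.
- logfile: gives access to the application log file.
- heapdump: generates and downloads a Java heap dump.
- threaddump: generates and downloads a thread dump.
- mappings: lists all @RequestMapping paths.
- conditions: shows the results of auto-configuration condition evaluation.
- scheduledtasks: shows the scheduled tasks.
- httptrace: records recent HTTP trace information.
- beans: lists all Spring beans in the application context.
- loggers: allows viewing and changing log levels.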
Monitoring MySQL
Download
mysqld_exporter-0.16.0.linux-amd64.tar.gz
https://prometheus.io/download/#mysqld_exporter
Starting automatically at boot
Register mysqld_exporter as a systemd service on CentOS (as root)
vi /etc/systemd/system/mysqld_exporter.service
File contents:
[Unit]
Description=mysqld_exporter Service
After=network.target mysql.service
[Service]
User=prometheus
Group=prometheus
ExecStart=/home/software/mysqld_exporter-0.16.0.linux-amd64/mysqld_exporter \
--web.listen-address=:9104 \
--config.my-cnf=/home/software/mysqld_exporter-0.16.0.linux-amd64/my.cnf
Restart=on-failure
# Allow extra time for slow initialization
TimeoutStartSec=120
[Install]
WantedBy=multi-user.target
Reload systemd, start the service, enable it at boot, and check its status
sudo systemctl daemon-reload
sudo systemctl start mysqld_exporter
sudo systemctl enable mysqld_exporter
sudo systemctl status mysqld_exporter
Verify that it works:
curl http://localhost:9104/metrics
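Integration with Prometheus
1. Edit prometheus.yml
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  # Local node monitoring
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
  # Local MySQL monitoring (mysqld_exporter listens on port 9104 as configured above)
  - job_name: 'mysqld'
    static_configs:
      - targets: ['localhost:9104']
2. Restart the Prometheus service
systemctl restart prometheus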
Enabling basic authentication for Prometheus
Based on:
https://prometheus.io/docs/guides/basic-auth/#securing-prometheus-api-and-ui-endpoints-using-basic-auth
yum install epel-release
yum install python3-bcrypt
yum install python3-pip
yum install -y rust cargo
pip3 install setuptools-rust
pip3 install bcrypt
Create a script named gen-pass.py:
import getpass
import bcrypt
password = getpass.getpass("password: ")
hashed_password = bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt())
print(hashed_password.decode())
Run it:
python3 gen-pass.py
For example, for the password test the output looks like this:
password:
$2b$12$hNf2lSsxfm0.i4a.1kVpSOVyBCfIB51VRjgBUyv6kdnyTlgWj81Ay
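The printed string packs several fields together: a version tag (2b), the cost factor (12, the default used by bcrypt.gensalt()), a 22-character salt, and a 31-character digest. Before pasting a hash into a config file it can help to confirm it was copied intact. The sketch below checks that shape using only the standard library. It does not verify a password against the hash, and describe_bcrypt_hash is a hypothetical helper name.

```python
import re

# Shape of a bcrypt hash: $<version>$<2-digit cost>$<22-char salt><31-char digest>
BCRYPT_PATTERN = re.compile(
    r"^\$(2[aby])\$(\d{2})\$([./A-Za-z0-9]{22})([./A-Za-z0-9]{31})$"
)


def describe_bcrypt_hash(hashed: str) -> dict:
    """Split a bcrypt hash into its fields, or raise ValueError if malformed."""
    match = BCRYPT_PATTERN.match(hashed)
    if match is None:
        raise ValueError("not a well-formed bcrypt hash")
    version, cost, salt, digest = match.groups()
    return {"version": version, "cost": int(cost), "salt": salt, "digest": digest}


if __name__ == "__main__":
    # The example hash printed above.
    example = "$2b$12$hNf2lSsxfm0.i4a.1kVpSOVyBCfIB51VRjgBUyv6kdnyTlgWj81Ay"
    fields = describe_bcrypt_hash(example)
    print(fields["version"], fields["cost"], len(fields["salt"]), len(fields["digest"]))
```

A hash that lost characters while being copied fails the pattern and raises ValueError instead of silently producing a login that never works.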
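Create a web.yml file in the Prometheus directory:
basic_auth_users:
  admin: $2b$12$hNf2lSsxfm0.i4a.1kVpSOVyBCfIB51VRjgBUyv6kdnyTlgWj81Ay
Check the file with promtool:
$ promtool check web-config web.yml
web.yml SUCCESS
Restart Prometheus with the web config:
prometheus --web.config.file=web.yml
Or, if Prometheus runs as the systemd service registered earlier in this article, add the flag to the unit file:
vi /etc/systemd/system/prometheus.service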
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/home/software/prometheus-2.53.3.linux-amd64/prometheus \
--config.file=/home/software/prometheus-2.53.3.linux-amd64/prometheus.yml \
--storage.tsdb.path=/home/software/prometheus-2.53.3.linux-amd64/data \
--web.console.templates=/home/software/prometheus-2.53.3.linux-amd64/consoles \
--web.console.libraries=/home/software/prometheus-2.53.3.linux-amd64/console_libraries \
--web.config.file=/home/software/prometheus-2.53.3.linux-amd64/web.yml
[Install]
WantedBy=multi-user.target
Reload systemd and restart:
systemctl daemon-reload
systemctl restart prometheus
Then open ip:port in a browser to check that basic auth login is enabled.
Update Prometheus's own scrape job so that it authenticates against itself:
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
    basic_auth:
      # Must match a user defined in web.yml
      username: 'admin'
      # Replace with your own password
      password: '????????'
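Note that basic_auth in the scrape job takes the plaintext password, not the bcrypt hash stored in the web config file. With HTTP basic auth, the client sends username:password base64-encoded in an Authorization header on every request. The sketch below builds that header in Python, using the admin user and the example password test from earlier. basic_auth_header is a hypothetical helper name.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header that HTTP basic auth sends."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Example credentials from this article; use your own password in practice.
    print(basic_auth_header("admin", "test"))
```

Base64 is an encoding, not encryption, so the password is readable by anyone who can observe the traffic unless the connection is also protected with TLS.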
Grafana
Official site
https://grafana.com/grafana/download
Installation
Official installation instructions
https://grafana.com/grafana/download
Install command
sudo yum install -y https://dl.grafana.com/enterprise/release/grafana-enterprise-11.4.0-1.x86_64.rpm
The installer prints the following:
Running transaction
Installing : grafana-enterprise-11.4.0-1.x86_64 1/1
### NOT starting on installation, please execute the following statements to configure grafana to start automatically using systemd
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable grafana-server.service
### You can start grafana-server by executing
sudo /bin/systemctl start grafana-server.service
POSTTRANS: Running script
Verifying : grafana-enterprise-11.4.0-1.x86_64 1/1
Installed:
grafana-enterprise.x86_64 0:11.4.0-1
Complete!
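Switching the UI to Chinese (parts of the interface are not fully translated)
Edit the configuration file (Grafana's documentation recommends overriding settings in /etc/grafana/grafana.ini rather than editing defaults.ini directly):
vi /usr/share/grafana/conf/defaults.ini
Set the language to zh-Hans:
#default_language = en-US
default_language = zh-Hans
Restart Grafana after editing: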
systemctl restart grafana-server.service