本文将介绍如何在一台 Linux 主机(如 CentOS 7)上,从零开始部署完整的监控系统,包括 Prometheus、Alertmanager、Node Exporter 和 Grafana。我们将使用 Supervisor
管理大部分组件的启动与运行,同时使用 Systemd
管理 Node Exporter 服务。
hostnamectl set-hostname monitor
df -HT # 检查磁盘空间
netstat -nltp # 查看监听端口
systemctl status firewalld # 检查防火墙状态
systemctl disable postfix --now # 禁用邮件服务
注释:生产环境需配置防火墙规则放行9100(Node Exporter)、9090(Prometheus)、9093(Alertmanager)、3000(Grafana)等端口。
mkdir -p /data/app /data/bag
/data/bag
:用于存放安装包/data/app
:用于存放解压后的应用目录yum -y install supervisor
systemctl enable supervisord.service --now
确认状态:
systemctl status supervisord.service
cd /data/bag
wget https://github.com/prometheus/prometheus/releases/download/v2.53.4/prometheus-2.53.4.linux-amd64.tar.gz
tar xf prometheus-2.53.4.linux-amd64.tar.gz
mv prometheus-2.53.4.linux-amd64 /data/app/prometheus
mkdir /data/app/prometheus/logs
/etc/supervisord.d/prometheus.ini
内容如下:
[program:prometheus]
directory=/data/app/prometheus
command=/data/app/prometheus/prometheus --config.file=/data/app/prometheus/prometheus.yml --storage.tsdb.path=/data/app/prometheus/data --storage.tsdb.retention=30d --web.console.libraries=/data/app/prometheus/console_libraries --web.console.templates=/data/app/prometheus/consoles --web.listen-address=0.0.0.0:9090 --web.external-url=http://:9090 --web.read-timeout=5m --web.max-connections=10 --query.max-concurrency=20 --query.timeout=2m --web.enable-lifecycle
autostart=true
startsecs=5
autorestart=true
startretries=10
redirect_stderr=true
stdout_logfile=/data/app/prometheus/logs/prometheus.log
stdout_logfile_maxbytes=20MB
stdout_logfile_backups=5
stderr_logfile=/data/app/prometheus/logs/err.log
stderr_logfile_maxbytes=20MB
stderr_logfile_backups=5
注意:将
替换为本机 IP 地址。
supervisorctl update
supervisorctl status
cd /data/bag
wget https://github.com/prometheus/alertmanager/releases/download/v0.28.1/alertmanager-0.28.1.linux-amd64.tar.gz
tar xf alertmanager-0.28.1.linux-amd64.tar.gz
mv alertmanager-0.28.1.linux-amd64 /data/app/alertmanager
mkdir /data/app/alertmanager/logs
/etc/supervisord.d/alertmanager.ini
内容如下:
[program:alertmanager]
command=/data/app/alertmanager/alertmanager --config.file=/data/app/alertmanager/alertmanager.yml --storage.path=/data/app/alertmanager/data --web.listen-address=:9093
autostart=true
startsecs=5
exitcodes=0
redirect_stderr=true
stdout_logfile=/data/app/alertmanager/logs/alertmanager.log
stdout_logfile_maxbytes=20MB
stdout_logfile_backups=5
stderr_logfile=/data/app/alertmanager/logs/err.log
stderr_logfile_maxbytes=20MB
stderr_logfile_backups=5
supervisorctl update
supervisorctl status
cd /data/bag
wget https://github.com/prometheus/node_exporter/releases/download/v1.9.1/node_exporter-1.9.1.linux-amd64.tar.gz
tar xf node_exporter-1.9.1.linux-amd64.tar.gz
mv node_exporter-1.9.1.linux-amd64 /data/app/node_exporter
/usr/lib/systemd/system/node_exporter.service
内容如下:
[Unit]
Description=node_exporter
After=network.target
[Service]
Type=simple
ExecStart=/data/app/node_exporter/node_exporter \
--web.listen-address=0.0.0.0:9100 \
--web.telemetry-path=/metrics \
--log.level=info \
--log.format=logfmt
Restart=always
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable node_exporter --now
netstat -nltp | grep 9100
cd /data/bag
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-12.0.0.linux-amd64.tar.gz
tar xf grafana-enterprise-12.0.0.linux-amd64.tar.gz
mv grafana-* /data/app/grafana
mkdir /data/app/grafana/logs
/etc/supervisord.d/grafana.ini
内容如下:
[program:grafana]
directory=/data/app/grafana
command=/data/app/grafana/bin/grafana-server -homepath=/data/app/grafana
autostart=true
startsecs=5
autorestart=true
startretries=10
redirect_stderr=true
stdout_logfile=/data/app/grafana/logs/grafana.log
stdout_logfile_maxbytes=20MB
stdout_logfile_backups=5
stderr_logfile=/data/app/grafana/logs/err.log
stderr_logfile_maxbytes=20MB
stderr_logfile_backups=5
supervisorctl update
supervisorctl status
默认访问地址:http://
默认账号密码:admin / admin
/data/app/prometheus/prometheus.yml
:global:
scrape_interval: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node_exporter'
static_configs:
- targets: ['localhost:9100']
alerting:
alertmanagers:
- static_configs:
- targets:
- "localhost:9093"
服务 | 启动命令 | 访问地址 |
---|---|---|
Prometheus | supervisorctl start prometheus |
http://IP:9090 |
Alertmanager | supervisorctl start alertmanager |
http://IP:9093 |
Grafana | supervisorctl start grafana |
http://IP:3000 (默认admin/admin) |
Node Exporter | systemctl start node_exporter |
http://IP:9100/metrics |
至此,一个完整的 Prometheus 监控体系已完成部署:
通过 Supervisor 与 Systemd 结合,可实现进程稳定运行、自动启动与日志记录,适用于线上环境部署。