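Setting up monitoring on macOS
Node Exporter: system metrics collector
Prometheus: time-series database and monitoring server
Grafana: data visualization and dashboard platform
Homebrew for everything: one consistent installation path for all three components
Single-command install: all three components at once
Ready to use: complete configuration templates and dashboards
Managed services: launch at login and status checks through brew services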
Quick start (prerequisite: Homebrew is installed):
# 1. Install all components
brew install grafana prometheus node_exporter
# 2. Start all services
brew services start grafana prometheus node_exporter
# 3. Verify the installation
http://localhost:3000 # Grafana dashboards
http://localhost:9090 # Prometheus UI
http://localhost:9100 # Node Exporter
Homebrew is the most popular package manager on macOS.
# Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Verify the installation
brew --version
# Update Homebrew and the package list
brew update
Node Exporter collects system-level metrics.
# Install node_exporter
brew install node_exporter
# Start the service (this also registers it to launch at login;
# brew services has no separate "enable" subcommand)
brew services start node_exporter
# Check the service status
brew services list | grep node_exporter
# Fetch a sample of the metrics
curl -s http://localhost:9100/metrics | head -20
# Check the process
pgrep -fl node_exporter
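To inspect one metric instead of the first 20 lines, the exposition text can be filtered by metric name. The following is a minimal sketch of that idea; `metric_lines` is a helper name made up for this example, and the sample input is illustrative rather than real output from a host.

```shell
# metric_lines NAME: read Prometheus exposition text on stdin and print only
# the sample lines for metric NAME. The function name is hypothetical.
metric_lines() {
  awk -v name="$1" '
    /^#/ { next }                 # skip HELP/TYPE comment lines
    {
      key = $0
      sub(/[{ ].*/, "", key)      # keep the text before "{" or the first space
      if (key == name) print
    }
  '
}

# Illustrative input (not real output from a host):
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 1.52
node_load15 1.10
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5'

printf '%s\n' "$sample" | metric_lines node_load1
```

Against a running exporter, the same function would be used as `curl -s http://localhost:9100/metrics | metric_lines node_load1`.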
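Prometheus is the core monitoring server: it scrapes targets and stores the resulting time series.
# Install Prometheus
brew install prometheus
# Show installation details
brew info prometheus
# Check the version
prometheus --version
# Locate the configuration file (Apple Silicon)
ls -la /opt/homebrew/etc/prometheus.yml
# Or, on Intel Macs
ls -la /usr/local/etc/prometheus.yml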
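Because the path depends on the CPU architecture, scripts can derive it instead of hard-coding it. The sketch below assumes the default Homebrew prefixes named above; both function names are invented for this example. On a real machine, `brew --prefix` reports the actual prefix and is the safer source.

```shell
# homebrew_prefix_for_arch ARCH: map a `uname -m` value to the default Homebrew
# prefix. Assumes a default installation; the function name is hypothetical.
homebrew_prefix_for_arch() {
  case "$1" in
    arm64) echo "/opt/homebrew" ;;   # Apple Silicon
    *)     echo "/usr/local" ;;      # Intel
  esac
}

# prometheus_config_path ARCH: default location of prometheus.yml for ARCH.
prometheus_config_path() {
  echo "$(homebrew_prefix_for_arch "$1")/etc/prometheus.yml"
}

prometheus_config_path "$(uname -m)"
```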
Grafana provides data visualization and dashboards.
# Install Grafana
brew install grafana
# Start the service (this also registers it to launch at login)
brew services start grafana
# Check the service status
brew services list | grep grafana
# Open the web interface
http://localhost:3000
Default login:
Username: admin
Password: admin
Install all components at once:
# Install all monitoring components in one command
brew install grafana prometheus node_exporter
# Start all services (each one is also registered to launch at login)
brew services start grafana
brew services start prometheus
brew services start node_exporter
# Verify the status of all services
brew services list | grep -E "(grafana|prometheus|node_exporter)"
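The status check can be scripted by reading the Status column of the listing. This is a rough sketch: `service_status` is a hypothetical helper, and the sample listing is illustrative, since the exact column layout may differ between Homebrew versions.

```shell
# service_status NAME: read `brew services list` output on stdin and print the
# Status column for NAME, or "missing" if NAME is not listed.
service_status() {
  awk -v name="$1" '
    $1 == name { print $2; found = 1 }
    END { if (!found) print "missing" }
  '
}

# Illustrative listing (not captured from a real machine):
sample='Name          Status  User File
grafana       started me   ~/Library/LaunchAgents/homebrew.mxcl.grafana.plist
node_exporter none
prometheus    started me   ~/Library/LaunchAgents/homebrew.mxcl.prometheus.plist'

for svc in grafana prometheus node_exporter; do
  printf '%s: %s\n' "$svc" "$(printf '%s\n' "$sample" | service_status "$svc")"
done
```

With live data, the input would come from `brew services list` instead of the sample variable.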
# Back up the original configuration file
cp /opt/homebrew/etc/prometheus.yml /opt/homebrew/etc/prometheus.yml.backup
# Edit the configuration file
nano /opt/homebrew/etc/prometheus.yml
# prometheus.yml
global:
  scrape_interval: 15s      # global scrape interval
  evaluation_interval: 15s  # rule evaluation interval
  scrape_timeout: 10s       # scrape timeout
# Rule files
rule_files:
  - "/opt/homebrew/etc/prometheus/rules/*.yml"
# Scrape configuration
scrape_configs:
  # Prometheus self-monitoring
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
    scrape_interval: 5s
    metrics_path: /metrics
  # Node Exporter
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
    scrape_interval: 15s
    metrics_path: /metrics
  # Optional: monitor additional servers
  - job_name: 'remote-servers'
    static_configs:
      - targets:
          - 'server1.example.com:9100'
          - 'server2.example.com:9100'
    scrape_interval: 30s
# Storage and web settings are command-line flags, not prometheus.yml keys.
# Pass them when starting Prometheus (the Homebrew service typically reads
# its flags from /opt/homebrew/etc/prometheus.args):
#   --storage.tsdb.path=/opt/homebrew/var/prometheus
#   --storage.tsdb.retention.time=15d
#   --storage.tsdb.retention.size=1GB
#   --web.listen-address=0.0.0.0:9090
#   --web.max-connections=512
#   --web.read-timeout=30s
# Create the rules directory
mkdir -p /opt/homebrew/etc/prometheus/rules
# Create basic alerting rules
cat > /opt/homebrew/etc/prometheus/rules/basic_alerts.yml << EOF
groups:
  - name: basic_alerts
    rules:
      # Instance down
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ \$labels.instance }} is down"
          description: "{{ \$labels.instance }} has been down for more than 5 minutes"
      # High CPU usage
      - alert: HighCpuUsage
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ \$labels.instance }}"
          description: "CPU usage has been above 80% for more than 5 minutes"
      # High memory usage
      # Note: MemAvailable and MemTotal are Linux metric names. Node Exporter on
      # macOS reports memory under other names (for example node_memory_total_bytes
      # and node_memory_free_bytes), so adapt this expression for Mac targets.
      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ \$labels.instance }}"
          description: "Memory usage has been above 90% for more than 5 minutes"
      # Low disk space
      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"}) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on {{ \$labels.instance }}"
          description: "Less than 10% of disk {{ \$labels.device }} is available"
EOF
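The arithmetic behind the CPU and disk expressions can be checked by hand. The per-second rate of idle CPU seconds is the fraction of time a core sat idle, so usage is 100 minus that fraction times 100. The sketch below mirrors both formulas; the function names and input values are invented for illustration.

```shell
# cpu_usage_percent IDLE_FRACTION: mirrors 100 - (idle rate * 100).
cpu_usage_percent() {
  awk -v idle="$1" 'BEGIN { printf "%.1f\n", 100 - idle * 100 }'
}

# disk_free_percent AVAIL_BYTES SIZE_BYTES: mirrors (avail / size) * 100.
disk_free_percent() {
  awk -v avail="$1" -v size="$2" 'BEGIN { printf "%.1f\n", avail / size * 100 }'
}

cpu_usage_percent 0.125     # prints 87.5, above the 80% alert threshold
disk_free_percent 40 500    # prints 8.0, below the 10% alert threshold
```

Both conditions must hold for the full 5 minutes of the `for` clause before Prometheus fires the alert.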
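# Validate the configuration file and the rule files it references
# (Prometheus has no --dry-run flag; promtool ships with it)
promtool check config /opt/homebrew/etc/prometheus.yml
# Start the Prometheus service
brew services start prometheus
# Restart the service (if it is already running)
brew services restart prometheus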
Log in to Grafana with admin / admin, then add a data source with these settings:
Type: Prometheus
URL: http://localhost:9090
Access: Server (default)
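# Configure email notifications
1. Go to "Alerting" → "Notification channels" (in recent Grafana versions the equivalent page is "Alerting" → "Contact points")
2. Click "New Channel"
3. Choose the notification type (Email, Slack, Webhook, etc.)
4. Fill in the required settings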
# Classic Node Exporter dashboards
1860 # Node Exporter Full (the most popular)
11074 # Node Exporter for Prometheus Dashboard
405 # Node Exporter Server Metrics
# macOS-specific dashboards
15797 # Node Exporter Mac OSX
12486 # Node Exporter macOS Dashboard
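Create a simple macOS monitoring dashboard. The memory panel queries Linux metric names (node_memory_MemAvailable_bytes, node_memory_MemTotal_bytes); Node Exporter on macOS reports memory under other names such as node_memory_total_bytes and node_memory_free_bytes, so adjust that query if the panel shows no data: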
{
  "dashboard": {
    "id": null,
    "title": "macOS System Monitoring",
    "tags": ["macos", "node-exporter"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "CPU Usage",
        "type": "stat",
        "targets": [
          {
            "expr": "100 - (avg by(instance) (irate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
            "legendFormat": "CPU usage %"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "unit": "percent",
            "min": 0,
            "max": 100,
            "thresholds": {
              "steps": [
                {"color": "green", "value": null},
                {"color": "yellow", "value": 70},
                {"color": "red", "value": 90}
              ]
            }
          }
        },
        "gridPos": {"h": 8, "w": 6, "x": 0, "y": 0}
      },
      {
        "id": 2,
        "title": "Memory Usage",
        "type": "stat",
        "targets": [
          {
            "expr": "(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100",
            "legendFormat": "Memory usage %"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "unit": "percent",
            "min": 0,
            "max": 100,
            "thresholds": {
              "steps": [
                {"color": "green", "value": null},
                {"color": "yellow", "value": 70},
                {"color": "red", "value": 90}
              ]
            }
          }
        },
        "gridPos": {"h": 8, "w": 6, "x": 6, "y": 0}
      }
    ],
    "time": {
      "from": "now-1h",
      "to": "now"
    },
    "refresh": "30s"
  }
}
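The threshold steps in both panels work as a lookup: a value takes the color of the last step whose value it reaches, and the step with a null value is the base color. A small sketch of that rule, with the 70 and 90 boundaries taken from the JSON and a function name invented for this example:

```shell
# threshold_color VALUE: return the color of the last step that VALUE reaches
# (green base, yellow from 70, red from 90).
threshold_color() {
  awk -v v="$1" 'BEGIN {
    color = "green"
    if (v >= 70) color = "yellow"
    if (v >= 90) color = "red"
    print color
  }'
}

threshold_color 42      # prints green
threshold_color 70      # prints yellow
threshold_color 93.5    # prints red
```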
# Check all related services
brew services list | grep -E "(prometheus|grafana|node_exporter)"
# Check which process is listening on each port
lsof -i :9090 # Prometheus
lsof -i :3000 # Grafana
lsof -i :9100 # Node Exporter
Open http://localhost:9090/targets and confirm that every target is in the "UP" state.
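The same check can be made from the command line, because the targets page is backed by the `/api/v1/targets` endpoint, which reports a `health` field per target. The sketch below counts targets by health using plain text matching; it assumes the compact JSON form shown in the sample, is not a real JSON parser, and uses a made-up function name and sample payload.

```shell
# count_targets_with_health HEALTH: read /api/v1/targets JSON on stdin and count
# occurrences of "health":"HEALTH". Text matching only, assuming compact JSON.
count_targets_with_health() {
  grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}

# Illustrative, heavily trimmed response (not captured from a real server):
sample='{"status":"success","data":{"activeTargets":[{"labels":{"job":"prometheus"},"health":"up"},{"labels":{"job":"node-exporter"},"health":"down"}]}}'

printf '%s\n' "$sample" | count_targets_with_health down
```

Against a running server, the input would come from `curl -s http://localhost:9090/api/v1/targets`; a result other than 0 for `down` means at least one target needs attention.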
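# Check whether the port is already in use
lsof -i :9100
# Follow the log (Homebrew services log under the Homebrew prefix, not /var/log)
tail -f /opt/homebrew/var/log/node_exporter.log
# Restart the service
brew services restart node_exporter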
# Validate the configuration file syntax
promtool check config /opt/homebrew/etc/prometheus.yml
# Check the rule files
promtool check rules /opt/homebrew/etc/prometheus/rules/*.yml
# Follow the log
tail -f /opt/homebrew/var/log/prometheus.log
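# Test the Prometheus connection (quote the URL so the shell does not treat "?" as a glob)
curl 'http://localhost:9090/api/v1/query?query=up'
# Check the Grafana data source configuration
# Make sure the URL is correct: http://localhost:9090
# Check network connectivity and firewall settings
# Check the query expression
# Confirm the selected time range
# Verify that the correct data source is selected
# Check that label matchers match existing series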
# Fix configuration file permissions
sudo chown $(whoami) /opt/homebrew/etc/prometheus.yml
sudo chmod 644 /opt/homebrew/etc/prometheus.yml
# Fix data directory permissions
sudo chown -R $(whoami) /opt/homebrew/var/prometheus
sudo chown -R $(whoami) /opt/homebrew/var/grafana
# Install Alertmanager
brew install alertmanager
# Start the service
brew services start alertmanager
# Add to prometheus.yml
remote_write:
  - url: "http://remote-storage:9201/write"
    queue_config:
      max_samples_per_send: 1000
      max_shards: 200
      capacity: 2500
# Set the retention limits when starting Prometheus
prometheus \
--storage.tsdb.retention.time=30d \
--storage.tsdb.retention.size=10GB
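When choosing retention values, a rough disk estimate helps. The Prometheus storage documentation gives the rule of thumb: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample on average. A small sketch of that calculation, where the function name and the input figures are assumptions made for illustration:

```shell
# estimate_tsdb_bytes RETENTION_DAYS SAMPLES_PER_SECOND BYTES_PER_SAMPLE
# Applies: retention seconds * samples per second * bytes per sample.
estimate_tsdb_bytes() {
  echo $(( $1 * 86400 * $2 * $3 ))
}

# 30 days at an assumed 1000 samples/s and 2 bytes per sample:
estimate_tsdb_bytes 30 1000 2   # prints 5184000000 (about 5.2 GB)
```

Under these assumed inputs, the 10GB size limit would leave headroom over a 30-day window; a single Mac running Node Exporter usually ingests far fewer samples per second.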
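# TLS is not configured in prometheus.yml. Put it in a separate web
# configuration file (for example web.yml) and start Prometheus with
# --web.config.file=web.yml
tls_server_config:
  cert_file: server.crt
  key_file: server.key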
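# Tune the scrape parameters
global:
  scrape_interval: 30s # scrape less frequently
  scrape_timeout: 10s # must not exceed the scrape interval
# Storage tuning: block durations are command-line flags, not prometheus.yml keys
#   --storage.tsdb.min-block-duration=2h
#   --storage.tsdb.max-block-duration=24h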
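# Enable the firewall and load its rules
sudo pfctl -e
sudo pfctl -f /etc/pf.conf
# Restrict access by IP address
# Do not use the default password in production
# Configure HTTPS access
# Keep the software up to date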
Monitoring stack on macOS:
Node Exporter (port 9100)
Prometheus (port 9090)
Grafana (port 3000)
Installing everything through Homebrew means:
brew upgrade: update all components easily
brew uninstall: clean removal
brew services: one interface for managing all services
Node Exporter: http://localhost:9100/metrics
Prometheus: http://localhost:9090
Grafana: http://localhost:3000
# Show the status of all services
brew services list | grep -E "(prometheus|grafana|node_exporter)"
# Start all services
brew services start grafana prometheus node_exporter
# Restart all services
brew services restart grafana prometheus node_exporter
# Stop all services
brew services stop grafana prometheus node_exporter
# Upgrade all components
brew upgrade grafana prometheus node_exporter
# Follow the logs
tail -f /opt/homebrew/var/log/prometheus.log
tail -f /opt/homebrew/var/log/grafana/grafana.log
Add more exporters
# Database monitoring
brew install mysqld_exporter
# Web server monitoring
brew install nginx-prometheus-exporter
# Network probing
brew install blackbox_exporter
Integrate other services
Advanced features
# Install all components
brew install grafana prometheus node_exporter
# Start all services (this also registers them to launch at login)
brew services start grafana prometheus node_exporter
# Show service status
brew services list | grep -E "(grafana|prometheus|node_exporter)"
# Start a service
brew services start [service_name]
# Stop a service (this also unregisters it from launching at login)
brew services stop [service_name]
# Restart a service
brew services restart [service_name]
# Run a service once without registering it to launch at login
brew services run [service_name]
Service | URL | Purpose |
---|---|---|
Grafana | http://localhost:3000 | Dashboards and visualization |
Prometheus | http://localhost:9090 | Monitoring server UI |
Node Exporter | http://localhost:9100/metrics | System metrics |
Prometheus Targets | http://localhost:9090/targets | Scrape target status |
Prometheus Config | http://localhost:9090/config | Loaded configuration |
# Apple Silicon (M1/M2)
/opt/homebrew/etc/prometheus.yml
/opt/homebrew/etc/grafana/grafana.ini
# Intel Macs
/usr/local/etc/prometheus.yml
/usr/local/etc/grafana/grafana.ini
1860 # Node Exporter Full (the most popular)
15797 # Node Exporter Mac OSX (macOS-specific)
11074 # Node Exporter for Prometheus Dashboard
405 # Node Exporter Server Metrics
# Check which process is listening on each port
lsof -i :3000 # Grafana
lsof -i :9090 # Prometheus
lsof -i :9100 # Node Exporter
# Follow the logs
tail -f /opt/homebrew/var/log/grafana/grafana.log
tail -f /opt/homebrew/var/log/prometheus.log
# Validate the configuration
promtool check config /opt/homebrew/etc/prometheus.yml
# Upgrade all components
brew upgrade grafana prometheus node_exporter
# Uninstall completely
brew services stop grafana prometheus node_exporter
brew uninstall grafana prometheus node_exporter