Preface
Ever since buying this VPS I have wanted to set up a monitoring system. Tonight I finally found some spare time, so let's get started.
I looked at SkyWalking, which gets a lot of praise in the industry. It is a fine piece of software, but too heavy to bother with for a single instance.
To save time I went straight to Prometheus: it is simple to use, powerful, and integrates nicely with Grafana.
Prometheus and node_exporter are installed directly on the host, while Grafana is installed with Docker.
Preparation
- Create the user and related directories
groupadd prometheus
useradd -M -s /sbin/nologin -g prometheus prometheus
mkdir /etc/prometheus
mkdir -p /opt/prometheus/data
chown -R prometheus:prometheus /etc/prometheus
chown -R prometheus:prometheus /opt/prometheus
Install Prometheus
Go to https://prometheus.io/download/, find the file "prometheus-2.48.0-rc.0.linux-amd64.tar.gz", and download it with wget.
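For reference, the download command looks roughly like the following; the exact URL comes from the release page, and the GitHub path below is an assumption based on where those download links usually point:

# Download the release tarball (URL pattern assumed; verify against the download page)
wget https://github.com/prometheus/prometheus/releases/download/v2.48.0-rc.0/prometheus-2.48.0-rc.0.linux-amd64.tar.gz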
Installation commands
# Extract
tar -zxvf prometheus-2.48.0-rc.0.linux-amd64.tar.gz
cd prometheus-2.48.0-rc.0.linux-amd64
sudo cp prometheus /usr/local/bin/
chmod +x /usr/local/bin/prometheus
cp ./prometheus.yml /etc/prometheus/
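A quick sanity check that the binary is on the PATH and runs:

prometheus --version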
Edit the configuration file /etc/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9100"]
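Note that the scrape target here is localhost:9100, which is node_exporter's default port (installed below), rather than Prometheus's own 9090. The release tarball also ships a promtool binary that can validate the config before the service is started, for example:

# Run from the extracted release directory (promtool was not copied to /usr/local/bin above)
./promtool check config /etc/prometheus/prometheus.yml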
Set up systemd startup; create the file /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Restart=on-failure
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/opt/prometheus/data \
  --storage.tsdb.retention.time=30d
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
Start the process
systemctl daemon-reload
systemctl start prometheus.service
systemctl status prometheus.service
# If startup fails, check the error messages
journalctl -xe
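Two optional follow-ups: enable the unit so it starts on boot, and hit Prometheus's built-in health endpoint to confirm it is serving (assuming the default listen port 9090):

systemctl enable prometheus.service
# Prometheus listens on :9090 by default
curl -s http://localhost:9090/-/healthy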
Install node_exporter
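The download step is not shown here; assuming the same release page as above, something like the following fetches the tarball (the GitHub URL pattern is an assumption, check https://prometheus.io/download/ for the current link):

wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz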
Set up the application
# Extract
tar -zxvf node_exporter-1.6.1.linux-amd64.tar.gz
cd node_exporter-1.6.1.linux-amd64
sudo cp node_exporter /usr/local/bin
sudo chmod +x /usr/local/bin/node_exporter
Configure systemd
cat /etc/systemd/system/node_exporter.service
# Contents:
[Unit]
Description=node exporter service
Documentation=https://prometheus.io
After=network.target

[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/bin/node_exporter
# Extra command-line flags can be appended here if you have special requirements
Restart=on-failure

[Install]
WantedBy=multi-user.target
Reload and start the process
systemctl daemon-reload
systemctl start node_exporter
systemctl status node_exporter
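As with Prometheus, it is worth enabling the unit at boot and confirming the exporter is serving metrics on its default port 9100; once it is up, the "prometheus" job configured earlier should show this target as UP:

systemctl enable node_exporter
# node_exporter exposes metrics on :9100 by default
curl -s http://localhost:9100/metrics | head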
Install Grafana
Installation with Docker
# Pull the image
docker pull grafana/grafana
# Run in the background
docker run -d --name=grafana -p 3000:3000 grafana/grafana
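A sketch of an alternative run command, if you want dashboards and settings to survive re-creating the container; the volume name grafana-storage is arbitrary, and /var/lib/grafana is where the official image keeps its data:

docker run -d --name=grafana -p 3000:3000 \
  -v grafana-storage:/var/lib/grafana \
  grafana/grafana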
Log in to the system and configure the data source
Because Grafana runs in a Docker container, it has to reach the data source running on the host.
Running ifconfig shows the docker0 bridge address as 172.17.0.1, which is the address prefix to use for Grafana's data source.
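So, assuming Prometheus is listening on its default port 9090, the URL entered for the Prometheus data source in Grafana would be:

http://172.17.0.1:9090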
Configure the Grafana data source
Verify the data source
Upload a template to generate the dashboard automatically
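If memory serves, the community "Node Exporter Full" dashboard (grafana.com dashboard ID 1860) pairs well with node_exporter; it can be imported from Dashboards → Import by entering the ID, with the Prometheus data source configured above selected as its source.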