Preface

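I've wanted to set up a monitoring system ever since I bought this VPS. Tonight I finally found a spare moment, so let's get to it~
I looked at SkyWalking, which is highly regarded in the industry. It's clearly a good tool, but it's too heavyweight to be worth the trouble for a single instance.
To save time I went straight to Prometheus: it's simple to use, powerful, and integrates easily with Grafana.
Prometheus and node_exporter are installed directly on the host, while Grafana runs in Docker.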

Preparation

  • Create the user and the required directories
    groupadd prometheus
    useradd -M -s /sbin/nologin -g prometheus prometheus
    mkdir /etc/prometheus
    mkdir -p /opt/prometheus/data
    chown -R prometheus:prometheus /etc/prometheus
    chown -R prometheus:prometheus /opt/prometheus
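
Since a wrong owner on the data directory is a common reason for the service failing to start later, here is a minimal sketch for double-checking ownership after the chown commands. The helper names owner_of and check_owner are made up for this post, and stat -c assumes GNU coreutils (the usual case on a Linux VPS).

```shell
# owner_of PATH: print "user:group" of PATH (GNU stat syntax, an assumption).
owner_of() {
    stat -c '%U:%G' "$1"
}

# check_owner PATH EXPECTED: succeed only if PATH is owned by EXPECTED,
# where EXPECTED has the form "user:group".
check_owner() {
    local actual
    actual=$(owner_of "$1") || return 1
    if [ "$actual" != "$2" ]; then
        echo "wrong owner: $1 is $actual, expected $2" >&2
        return 1
    fi
    echo "ok: $1 is owned by $2"
}
```

For example, check_owner /opt/prometheus/data prometheus:prometheus should print an "ok" line once both chown commands have run.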

Installing Prometheus

  • Go to https://prometheus.io/download/, find the file "prometheus-2.48.0-rc.0.linux-amd64.tar.gz", and download it with wget.

  • Installation commands

    # Extract the archive
    tar -zxvf prometheus-2.48.0-rc.0.linux-amd64.tar.gz
    cd prometheus-2.48.0-rc.0.linux-amd64
    # Install the binary and the sample config (run as root or via sudo)
    sudo cp prometheus /usr/local/bin/
    sudo chmod +x /usr/local/bin/prometheus
    sudo cp ./prometheus.yml /etc/prometheus/
  • Edit the configuration file /etc/prometheus/prometheus.yml

    # my global config
    global:
      scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
              # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape.
    # The stock file points at Prometheus itself (localhost:9090); here the
    # target is node_exporter on its default port 9100.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: "prometheus"

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
          - targets: ["localhost:9100"]
  • Set up startup via systemd by creating the file /etc/systemd/system/prometheus.service

    [Unit]
    Description=Prometheus Server
    Documentation=https://prometheus.io/docs/introduction/overview/
    After=network-online.target

    [Service]
    User=prometheus
    Group=prometheus
    Restart=on-failure
    ExecStart=/usr/local/bin/prometheus \
      --config.file=/etc/prometheus/prometheus.yml \
      --storage.tsdb.path=/opt/prometheus/data \
      --storage.tsdb.retention.time=30d
    ExecReload=/bin/kill -HUP $MAINPID

    [Install]
    WantedBy=multi-user.target
  • Start the process

    systemctl daemon-reload
    systemctl start prometheus.service
    systemctl status prometheus.service
    # If startup fails, check the error messages for this unit
    journalctl -xe -u prometheus.service
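
systemctl status only tells you the process is alive, so I find it handy to also poll the HTTP health endpoint. Below is a minimal sketch, assuming Prometheus listens on its default port 9090 and that curl is installed; the function names retry and prometheus_healthy are my own, not part of any tool.

```shell
# retry TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 0 on success, 1 otherwise.
retry() {
    local tries=$1 delay=$2 i=1
    shift 2
    while [ "$i" -le "$tries" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Only sleep if another attempt is coming.
        if [ "$i" -le "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# prometheus_healthy: succeed if the health endpoint answers with HTTP 2xx.
# The host and port are assumptions (the default listen address).
prometheus_healthy() {
    curl -fsS -o /dev/null "http://localhost:9090/-/healthy"
}
```

A typical call would be retry 10 2 prometheus_healthy && echo "prometheus is up", which gives the server up to roughly 20 seconds to come online before giving up.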

Installing node_exporter

  • Install the binary

    # Extract the archive
    tar -zxvf node_exporter-1.6.1.linux-amd64.tar.gz
    cd node_exporter-1.6.1.linux-amd64
    sudo cp node_exporter /usr/local/bin
    sudo chmod +x /usr/local/bin/node_exporter
  • Configure systemd

    cat /etc/systemd/system/node_exporter.service

    # The file contains:
    [Unit]
    Description=node exporter service
    Documentation=https://prometheus.io
    After=network.target

    [Service]
    Type=simple
    User=root
    Group=root
    ExecStart=/usr/local/bin/node_exporter
    # If you have special requirements, append flags to the ExecStart line above
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
  • Reload systemd and start the process

    systemctl daemon-reload
    systemctl start node_exporter
    systemctl status node_exporter
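
To confirm that node_exporter is really serving data, you can pull a single metric out of its /metrics page. This is a minimal sketch: metric_value is a helper name I made up, it only handles samples without labels, and the port 9100 is node_exporter's default.

```shell
# metric_value NAME: read Prometheus text exposition format on stdin and
# print the value of the first label-less sample named exactly NAME.
# Exits non-zero if no such sample is found.
metric_value() {
    awk -v name="$1" '
        $1 == name { print $2; found = 1; exit }
        END { exit !found }
    '
}
```

Usage would look like curl -s http://localhost:9100/metrics | metric_value node_load1, which prints the current 1-minute load average. The "# HELP" and "# TYPE" comment lines are skipped automatically because their first field is "#".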

Installing Grafana

  • Installation with Docker

    # Pull the image
    docker pull grafana/grafana
    # Start it in the background
    docker run -d --name=grafana -p 3000:3000 grafana/grafana
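  • Log in and configure the data source
    Grafana runs in a Docker container, so it has to reach the data source on the host.
    Run ifconfig: the docker0 bridge address is 172.17.0.1, which is the host part of the Grafana data source URL (for example http://172.17.0.1:9090 with Prometheus on its default port).
    [Screenshot: finding the data source IP for Grafana in Docker]
    Configure the Grafana data source
    [Screenshot: configuring the Grafana data source]
    Verify the data source
    [Screenshot: validating the Grafana data source]
    Upload a template to generate the dashboard automatically
    [Screenshot: generating the Grafana dashboard]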
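
If you'd rather not read the docker0 address off the ifconfig output by eye, the lookup can be scripted. This is a minimal sketch assuming the iproute2 ip command is available; first_ipv4 and datasource_url are hypothetical helper names, and 9090 is Prometheus's default port.

```shell
# first_ipv4: read `ip -4 -o addr show <iface>` output on stdin and print
# the first IPv4 address without its prefix length.
first_ipv4() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "inet") { split($(i + 1), a, "/"); print a[1]; exit }
    }'
}

# datasource_url HOST [PORT]: build the URL to paste into Grafana.
# PORT defaults to 9090.
datasource_url() {
    printf 'http://%s:%s\n' "$1" "${2:-9090}"
}
```

On the host, datasource_url "$(ip -4 -o addr show docker0 | first_ipv4)" should produce the URL for the data source form; with the bridge address shown above that would be http://172.17.0.1:9090.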