Installing kube-prometheus

Published: 2020-12-25 10:41:06  Editor: admin

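1. Overview

Introduction

kube-prometheus is a complete monitoring stack: it uses Prometheus to collect cluster metrics and Grafana to visualize them. It includes the following components: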

The Prometheus Operator
Highly available Prometheus
Highly available Alertmanager
Prometheus node-exporter
Prometheus Adapter for Kubernetes Metrics APIs (k8s-prometheus-adapter)
kube-state-metrics
Grafana

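2. Installation

Environment

OS: CentOS 7.6

Kubernetes version: 1.18.1

IP address: 10.212.82.63

Hostname: k8s-master

Resources: 2 CPU cores, 2 GB RAM

OS: CentOS 7.6

Kubernetes version: 1.18.1

IP address: 10.212.82.65

Hostname: k8s-node01

Resources: 2 CPU cores, 8 GB RAM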

Download the project

Run all of the following steps on k8s-master.

Install git

yum install -y git

Clone kube-prometheus

git clone https://github.com/coreos/kube-prometheus

Inspect the manifests

[root@k8s-master]# cd kube-prometheus/manifests/
[root@k8s-master manifests]# ll

Output:

total 1696
-rw-r--r-- 1 root root     405 Dec 17 10:25 alertmanager-alertmanager.yaml
-rw-r--r-- 1 root root     964 Dec 17 10:17 alertmanager-secret.yaml
-rw-r--r-- 1 root root      96 Dec 17 10:17 alertmanager-serviceAccount.yaml
-rw-r--r-- 1 root root     254 Dec 17 10:17 alertmanager-serviceMonitor.yaml
-rw-r--r-- 1 root root     326 Dec 17 10:20 alertmanager-service.yaml
-rw-r--r-- 1 root root     550 Dec 17 10:17 grafana-dashboardDatasources.yaml
-rw-r--r-- 1 root root 1403795 Dec 17 10:17 grafana-dashboardDefinitions.yaml
-rw-r--r-- 1 root root     454 Dec 17 10:17 grafana-dashboardSources.yaml
-rw-r--r-- 1 root root    7722 Dec 17 10:17 grafana-deployment.yaml
-rw-r--r-- 1 root root      86 Dec 17 10:17 grafana-serviceAccount.yaml
-rw-r--r-- 1 root root     208 Dec 17 10:17 grafana-serviceMonitor.yaml
-rw-r--r-- 1 root root     273 Dec 17 10:20 grafana-service.yaml
-rw-r--r-- 1 root root     376 Dec 17 10:17 kube-state-metrics-clusterRoleBinding.yaml
-rw-r--r-- 1 root root    1651 Dec 17 10:17 kube-state-metrics-clusterRole.yaml
-rw-r--r-- 1 root root    2127 Dec 17 10:18 kube-state-metrics-deployment.yaml
-rw-r--r-- 1 root root     192 Dec 17 10:17 kube-state-metrics-serviceAccount.yaml
-rw-r--r-- 1 root root     829 Dec 17 10:17 kube-state-metrics-serviceMonitor.yaml
-rw-r--r-- 1 root root     403 Dec 17 10:17 kube-state-metrics-service.yaml
-rw-r--r-- 1 root root     266 Dec 17 10:17 node-exporter-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     283 Dec 17 10:17 node-exporter-clusterRole.yaml
-rw-r--r-- 1 root root    2880 Dec 17 10:18 node-exporter-daemonset.yaml
-rw-r--r-- 1 root root      92 Dec 17 10:17 node-exporter-serviceAccount.yaml
-rw-r--r-- 1 root root     669 Dec 17 10:17 node-exporter-serviceMonitor.yaml
-rw-r--r-- 1 root root     315 Dec 17 10:17 node-exporter-service.yaml
-rw-r--r-- 1 root root     292 Dec 17 10:17 prometheus-adapter-apiService.yaml
-rw-r--r-- 1 root root     396 Dec 17 10:17 prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
-rw-r--r-- 1 root root     304 Dec 17 10:17 prometheus-adapter-clusterRoleBindingDelegator.yaml
-rw-r--r-- 1 root root     281 Dec 17 10:17 prometheus-adapter-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     188 Dec 17 10:17 prometheus-adapter-clusterRoleServerResources.yaml
-rw-r--r-- 1 root root     219 Dec 17 10:17 prometheus-adapter-clusterRole.yaml
-rw-r--r-- 1 root root    1378 Dec 17 10:17 prometheus-adapter-configMap.yaml
-rw-r--r-- 1 root root    1333 Dec 17 10:18 prometheus-adapter-deployment.yaml
-rw-r--r-- 1 root root     325 Dec 17 10:17 prometheus-adapter-roleBindingAuthReader.yaml
-rw-r--r-- 1 root root      97 Dec 17 10:17 prometheus-adapter-serviceAccount.yaml
-rw-r--r-- 1 root root     408 Dec 17 10:17 prometheus-adapter-serviceMonitor.yaml
-rw-r--r-- 1 root root     236 Dec 17 10:17 prometheus-adapter-service.yaml
-rw-r--r-- 1 root root     269 Dec 17 10:17 prometheus-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     216 Dec 17 10:17 prometheus-clusterRole.yaml
-rw-r--r-- 1 root root     621 Dec 17 10:17 prometheus-operator-serviceMonitor.yaml
-rw-r--r-- 1 root root     800 Dec 17 10:25 prometheus-prometheus.yaml
-rw-r--r-- 1 root root     293 Dec 17 10:17 prometheus-roleBindingConfig.yaml
-rw-r--r-- 1 root root     983 Dec 17 10:17 prometheus-roleBindingSpecificNamespaces.yaml
-rw-r--r-- 1 root root     188 Dec 17 10:17 prometheus-roleConfig.yaml
-rw-r--r-- 1 root root    1141 Dec 17 10:17 prometheus-roleSpecificNamespaces.yaml
-rw-r--r-- 1 root root   99490 Dec 17 10:17 prometheus-rules.yaml
-rw-r--r-- 1 root root      93 Dec 17 10:17 prometheus-serviceAccount.yaml
-rw-r--r-- 1 root root    6821 Dec 17 10:17 prometheus-serviceMonitorApiserver.yaml
-rw-r--r-- 1 root root     395 Dec 17 10:17 prometheus-serviceMonitorCoreDNS.yaml
-rw-r--r-- 1 root root    6310 Dec 17 10:17 prometheus-serviceMonitorKubeControllerManager.yaml
-rw-r--r-- 1 root root    7126 Dec 17 10:17 prometheus-serviceMonitorKubelet.yaml
-rw-r--r-- 1 root root     485 Dec 17 10:17 prometheus-serviceMonitorKubeScheduler.yaml
-rw-r--r-- 1 root root     247 Dec 17 10:17 prometheus-serviceMonitor.yaml
-rw-r--r-- 1 root root     315 Dec 17 10:19 prometheus-service.yaml
drwxr-xr-x 2 root root    4096 Dec 17 10:18 setup

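Change the image registry

Some images cannot be pulled from the upstream registry from within China, so here we point the images for prometheus-operator, prometheus, alertmanager, kube-state-metrics, node-exporter, and prometheus-adapter at a domestic mirror. I use the USTC (University of Science and Technology of China) mirror.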

sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' setup/prometheus-operator-deployment.yaml
sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' prometheus-prometheus.yaml
sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' alertmanager-alertmanager.yaml
sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' kube-state-metrics-deployment.yaml
sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' node-exporter-daemonset.yaml
sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' prometheus-adapter-deployment.yaml
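Before touching the real manifests, it can help to try the substitution on a throwaway file and confirm that nothing still points at the original registry. The following is a minimal sketch under stated assumptions: the sample image line and its tag are hypothetical stand-ins, not copied from the manifests, and `sed -i` is used in its GNU form, as found on CentOS.

```shell
# Work on a scratch file so the real manifests stay untouched.
scratch=$(mktemp -d)

# Hypothetical sample line; real image names and tags come from the manifests.
printf '        image: quay.io/prometheus/node-exporter:v1.0.1\n' > "$scratch/sample.yaml"

# Escaping the dot makes it match a literal "." instead of any character.
sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' "$scratch/sample.yaml"

# Count lines that still reference the original registry.
# grep -c exits non-zero when the count is 0, hence the "|| true".
remaining=$(grep -c 'quay\.io/' "$scratch/sample.yaml" || true)
rewritten=$(cat "$scratch/sample.yaml")

echo "$rewritten"
echo "lines still pointing at quay.io: $remaining"
rm -rf "$scratch"
```

On the real manifests, running `grep -rn 'image:' .` from the manifests directory afterwards is a quick way to review every image reference in one pass.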

Change the Service type to NodePort

To reach prometheus, alertmanager, and grafana from outside the cluster, change the Service type of each of them to NodePort.

Modify the prometheus Service

cat prometheus-service.yaml

Output after editing:

apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30090 # added
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP

Modify the alertmanager Service

cat alertmanager-service.yaml

Output after editing:

apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30093 # added
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP

Modify the grafana Service

cat grafana-service.yaml

Output after editing:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort # added
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 32000 # added
  selector:
    app: grafana

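Change the replica counts

By default alertmanager runs 3 replicas and prometheus runs 2. Upstream chose these defaults for high availability, but my servers are low-spec and running that many replicas wastes resources.

So set both replica counts to 1.

Modify alertmanager

vi alertmanager-alertmanager.yaml

Change replicas: 3 to replicas: 1

Modify prometheus

vi prometheus-prometheus.yaml

Change replicas: 2 to replicas: 1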
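If you would rather not edit the files by hand in vi, the same change can be scripted with sed. The sketch below runs against a scratch file whose contents are a hypothetical, trimmed-down stand-in for alertmanager-alertmanager.yaml; the real file has more fields.

```shell
scratch=$(mktemp -d)

# Hypothetical, trimmed-down stand-in for alertmanager-alertmanager.yaml.
cat > "$scratch/alertmanager-alertmanager.yaml" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: main
  namespace: monitoring
spec:
  replicas: 3
EOF

# Match the whole "replicas: N" line (indentation preserved via \1)
# and set the count to 1, whatever the previous number was.
sed -i 's/^\( *replicas:\) *[0-9][0-9]*$/\1 1/' "$scratch/alertmanager-alertmanager.yaml"

replicas_line=$(grep 'replicas:' "$scratch/alertmanager-alertmanager.yaml")
echo "$replicas_line"
rm -rf "$scratch"
```

The same sed expression should apply to prometheus-prometheus.yaml, assuming that file also contains exactly one `replicas:` line.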

Install kube-prometheus and verify its status

Install the CRDs and prometheus-operator

kubectl apply -f setup/

Pulling the prometheus-operator image takes a few minutes. Wait until the prometheus-operator pod reaches the Running state.

kubectl get pod -n monitoring

Output:

NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-5d96f4f876-tjgmp   2/2     Running   0          35m
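Instead of re-running `kubectl get pod` by hand, the wait can be delegated to `kubectl wait`. This is only a sketch: the function name `wait_for_pods` and the `KUBECTL` override variable are hypothetical helpers introduced here, and the default timeout of 600s is an arbitrary choice. Note that `kubectl wait` may return an error if no pods exist in the namespace yet, so give the apply a moment before calling it.

```shell
# KUBECTL is a hypothetical override hook; it defaults to the kubectl on PATH.
KUBECTL=${KUBECTL:-kubectl}

# wait_for_pods NAMESPACE [TIMEOUT]
# Blocks until every pod in NAMESPACE reports the Ready condition,
# or until TIMEOUT (default 600s, an arbitrary choice) expires.
wait_for_pods() {
  ns=$1
  timeout=${2:-600s}
  "$KUBECTL" wait --for=condition=Ready pod --all -n "$ns" --timeout="$timeout"
}
```

Usage: `wait_for_pods monitoring` after `kubectl apply -f setup/`, and again with a longer timeout such as `wait_for_pods monitoring 1800s` once the remaining manifests have been applied.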

Install prometheus, alertmanager, grafana, kube-state-metrics, node-exporter, and the remaining resources

kubectl apply -f .

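Pulling the images takes a while, so go make a cup of coffee and come back in about half an hour. Then check the pods in the monitoring namespace; once every pod there is Running, the installation is complete.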

kubectl get pod -n monitoring

Output:

NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   0          30m
grafana-675dbb6748-sg6sk               1/1     Running   0          34m
kube-state-metrics-77bb8444b8-b9svz    3/3     Running   0          34m
node-exporter-pj4x9                    2/2     Running   0          34m
node-exporter-sw9zb                    2/2     Running   0          34m
prometheus-adapter-5dbb4cb95f-wwxb2    1/1     Running   0          34m
prometheus-k8s-0                       2/2     Running   1          34m
prometheus-operator-5d96f4f876-tjgmp   2/2     Running   0          35m
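With all pods running, the three web UIs should be reachable through the NodePorts configured earlier. The sketch below just prints the URLs, using the k8s-master IP and the nodePort values from this article; it assumes the Services were applied with those ports unchanged and that no firewall blocks them.

```shell
# Node IP and ports are taken from the environment and Service definitions above.
node_ip=10.212.82.63

urls=$(
  for entry in prometheus:30090 alertmanager:30093 grafana:32000; do
    name=${entry%%:*}   # text before the colon
    port=${entry##*:}   # text after the colon
    printf '%s\thttp://%s:%s\n' "$name" "$node_ip" "$port"
  done
)
echo "$urls"
```

`kubectl get svc -n monitoring` shows the ports actually assigned, which is worth checking if any of the URLs does not respond.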