上传文件至 kubernetes-MD

This commit is contained in:
wxin 2024-08-11 21:19:49 +08:00
parent 8830676384
commit 6b1bd30fc7
3 changed files with 1852 additions and 0 deletions

View File

@ -0,0 +1,215 @@
<h1><center>基于kubeadm部署kubernetes集群</center></h1>
------
## 一:环境准备
三台服务器一台master两台node,master节点必须是2核cpu
| 节点名称 | IP地址 |
| :------: | :--------: |
| master | 10.0.0.220 |
| node-1 | 10.0.0.221 |
| node-2 | 10.0.0.222 |
| node-3 | 10.0.0.223 |
#### 1.所有服务器关闭防火墙和selinux
```shell
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl disable firewalld
[root@localhost ~]# setenforce 0
[root@localhost ~]# sed -i '/^SELINUX=/c SELINUX=disabled/' /etc/selinux/config
[root@localhost ~]# swapoff -a  临时关闭
[root@localhost ~]# sed -i 's/.*swap.*/#&/' /etc/fstab 永久关闭
注意:
关闭所有服务器的交换分区
所有节点操作
```
#### 2.保证yum仓库可用
```shell
[root@localhost ~]# yum clean all
[root@localhost ~]# yum makecache fast
注意:
使用国内yum源
所有节点操作
```
#### 3.修改主机名
```shell
[root@localhost ~]# hostnamectl set-hostname master
[root@localhost ~]# hostnamectl set-hostname node-1
[root@localhost ~]# hostnamectl set-hostname node-2
[root@localhost ~]# hostnamectl set-hostname node-3
注意:
所有节点操作
```
#### 4.添加本地解析
```shell
[root@master ~]# cat >> /etc/hosts <<eof
10.0.0.220 master
10.0.0.221 node-1
10.0.0.222 node-2
10.0.0.223 node-3
eof
注意:
所有节点操作
```
#### 5.安装容器
```shell
[root@master ~]# yum install -y yum-utils device-mapper-persistent-data lvm2
[root@master ~]# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
[root@master ~]# yum -y install docker-ce
[root@master ~]# systemctl start docker
[root@master ~]# systemctl enable docker
注意:
所有节点操作
```
#### 6.安装kubeadm和kubelet
```shell
[root@master ~]# cat >> /etc/yum.repos.d/kubernetes.repo <<eof
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
eof
[root@master ~]# yum -y install kubeadm kubelet kubectl ipvsadm
注意:
所有节点操作
这里安装的是最新版本也可以指定版本号kubeadm-1.19.4
```
#### 7.配置kubelet的cgroups
```shell
[root@master ~]# cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=cgroupfs --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
EOF
k8s.gcr.io/pause:3.6
```
#### 8.加载内核模块
```shell
[root@master ~]# modprobe br_netfilter
注意:
所有节点操作
```
#### 9.修改内核参数
```shell
[root@master ~]# cat >> /etc/sysctl.conf <<eof
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
eof
[root@master ~]# sysctl -p
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness = 0
注意:
所有节点操作
```
## 二部署Kubernetes
#### 1.镜像下载
```shell
https://www.xingdiancloud.cn/index.php/s/6GyinxZwSRemHPz
注意:
下载后上传到所有节点
```
#### 2.镜像导入
```shell
[root@master ~]# cat image_load.sh
#!/bin/bash
image_path=`pwd`
for i in `ls "${image_path}" | grep tar`
do
docker load < $i
done
[root@master ~]# bash image_load.sh
注意:
所有节点操作
```
#### 3.master节点初始化
```shell
[root@master ~]# kubeadm init --kubernetes-version=1.23.1 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.0.220
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.0.220:6443 --token mzrm3c.u9mpt80rddmjvd3g \
--discovery-token-ca-cert-hash sha256:fec53dfeacc5187d3f0e3998d65bd3e303fa64acd5156192240728567659bf4a
```
#### 4.安装pod插件
```shell
[root@master ~]# wget http://www.xingdiancloud.cn:92/index.php/s/3Ad7aTxqPPja24M/download/flannel.yaml
[root@master ~]# kubectl create -f flannel.yaml
```
#### 5.将node加入工作节点
```shell
[root@node-1 ~]# kubeadm join 10.0.0.220:6443 --token mzrm3c.u9mpt80rddmjvd3g --discovery-token-ca-cert-hash sha256:fec53dfeacc5187d3f0e3998d65bd3e303fa64acd5156192240728567659bf4a
注意:
这里使用的是master初始化产生的token
这里的token时间长了会改变需要使用命令获取见下期内容
没有记录集群 join 命令的可以通过以下方式重新获取:
kubeadm token create --print-join-command --ttl=0
```
#### 6.master节点查看集群状态
```shell
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 26m v1.23.1
node-1 Ready <none> 4m45s v1.23.1
node-2 Ready <none> 4m40s v1.23.1
node-3 Ready <none> 4m46s v1.23.1
```

View File

@ -0,0 +1,953 @@
<h1><center>基于kubeadm部署kubernetes集群</center></h1>
------
## 一:环境准备
三台服务器一台master两台node,master节点必须是2核cpu
| 节点名称 | IP地址 |
| :------: | :--------: |
| master | 10.0.0.220 |
| node-1 | 10.0.0.221 |
| node-2 | 10.0.0.222 |
| node-3 | 10.0.0.223 |
#### 1.所有服务器关闭防火墙和selinux
```shell
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl disable firewalld
[root@localhost ~]# setenforce 0
[root@localhost ~]# sed -i '/^SELINUX=/c SELINUX=disabled/' /etc/selinux/config
[root@localhost ~]# swapoff -a  临时关闭
[root@localhost ~]# sed -i 's/.*swap.*/#&/' /etc/fstab 永久关闭
注意:
关闭所有服务器的交换分区
所有节点操作
```
#### 2.保证yum仓库可用
```shell
[root@localhost ~]# yum clean all
[root@localhost ~]# yum makecache fast
注意:
使用国内yum源
所有节点操作
```
#### 3.修改主机名
```shell
[root@localhost ~]# hostnamectl set-hostname master
[root@localhost ~]# hostnamectl set-hostname node-1
[root@localhost ~]# hostnamectl set-hostname node-2
[root@localhost ~]# hostnamectl set-hostname node-3
注意:
所有节点操作
```
#### 4.添加本地解析
```shell
[root@master ~]# cat >> /etc/hosts <<eof
10.0.0.220 master
10.0.0.221 node-1
10.0.0.222 node-2
10.0.0.223 node-3
eof
注意:
所有节点操作
```
#### 5.安装容器运行时
```shell
第一步Installing containerd
链接地址https://github.com/containerd/containerd/blob/main/docs/getting-started.md
[root@master ~]# wget https://github.com/containerd/containerd/releases/download/v1.6.8/containerd-1.6.8-linux-amd64.tar.gz
[root@master ~]# tar xf containerd-1.6.8-linux-amd64.tar.gz
[root@master ~]# cp bin/* /usr/local/bin
创建systemctl管理服务启动文件
/usr/local/lib/systemd/system/containerd.service
/etc/systemd/system/multi-user.target.wants/containerd.service
[root@master ~]# cat containerd.service
# Copyright The containerd Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
#uncomment to enable the experimental sbservice (sandboxed) version of containerd/cri integration
#Environment="ENABLE_CRI_SANDBOXES=sandboxed"
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
[root@master ~]# systemctl daemon-reload
[root@master ~]# systemctl start containerd
第二步Installing runc
链接地址https://github.com/opencontainers/runc/releases
[root@master ~]# wget https://github.com/opencontainers/runc/releases/download/v1.1.4/runc.amd64
[root@master ~]# install -m 755 runc.amd64 /usr/local/sbin/runc
第三步Installing CNI plugins
链接地址https://github.com/containernetworking/plugins/releases
[root@master ~]# wget https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz
[root@master ~]# mkdir -p /opt/cni/bin
[root@master ~]# tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.1.1.tgz
./
./macvlan
./static
./vlan
./portmap
./host-local
./vrf
./bridge
./tuning
./firewall
./host-device
./sbr
./loopback
./dhcp
./ptp
./ipvlan
./bandwidth
注意:
所有节点操作
```
#### 6.安装kubeadm和kubelet
```shell
国外仓库:
[root@master ~]# cat >> /etc/yum.repos.d/kubernetes.repo <<eof
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
eof
[root@master ~]# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
国内仓库:
[root@master ~]# cat >> /etc/yum.repos.d/kubernetes.repo <<eof
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
eof
[root@master ~]# yum -y install kubeadm kubelet kubectl ipvsadm
注意:
所有节点操作
这里安装的是最新版本也可以指定版本号kubeadm-1.25.0
修改kubelet配置
[root@master ~]# cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.6"
[root@master ~]# cat /etc/systemd/system/multi-user.target.wants/kubelet.service
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=https://kubernetes.io/docs/
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/usr/bin/kubelet \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run//containerd/containerd.sock
Restart=always
StartLimitInterval=0
RestartSec=10
[Install]
WantedBy=multi-user.target
```
#### 7.加载内核模块
```shell
[root@master ~]# modprobe br_netfilter
注意:
所有节点操作
```
#### 8.修改内核参数
```shell
[root@master ~]# cat >> /etc/sysctl.conf <<eof
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
net.ipv4.ip_forward=1
eof
[root@master ~]# sysctl -p
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness = 0
注意:
所有节点操作
```
## 二部署Kubernetes
#### 1.镜像下载
```shell
[root@master ~]# cat image.sh
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.9.3
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.25.0
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.25.0
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.25.0
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.25.0
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.8
#flannel镜像
ctr -n k8s.io i pull docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
ctr -n k8s.io i pull docker.io/rancher/mirrored-flannelcni-flannel:v0.19.2
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0 registry.k8s.io/etcd:3.5.4-0
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.9.3 registry.k8s.io/coredns/coredns:v1.9.3
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.8 k8s.gcr.io/pause:3.6
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.25.0 registry.k8s.io/kube-proxy:v1.25.0
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.25.0 registry.k8s.io/kube-scheduler:v1.25.0
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.25.0 registry.k8s.io/kube-controller-manager:v1.25.0
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.25.0 registry.k8s.io/kube-apiserver:v1.25.0
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.8 registry.k8s.io/pause:3.8
[root@master ~]# bash image.sh
注意:
所有节点操作
注意:
[root@master ~]# kubeadm config images list //获取所需要的镜像
registry.k8s.io/kube-apiserver:v1.25.0
registry.k8s.io/kube-controller-manager:v1.25.0
registry.k8s.io/kube-scheduler:v1.25.0
registry.k8s.io/kube-proxy:v1.25.0
registry.k8s.io/pause:3.8
registry.k8s.io/etcd:3.5.4-0
registry.k8s.io/coredns/coredns:v1.9.3
```
#### 3.master节点初始化
```shell
[root@master ~]# kubeadm init --kubernetes-version=1.23.1 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.0.220
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.0.2.150:6443 --token tlohed.bwgh4tt5erbc2zhg \
--discovery-token-ca-cert-hash sha256:9b28238c55bf5b2e5e0d68c6303e50bf6f12e7e07126897c846a4f5328157e16
```
#### 4.安装pod插件
```shell
[root@master ~]# vi kube-flannel.yaml
---
kind: Namespace
apiVersion: v1
metadata:
name: kube-flannel
labels:
pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-flannel
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds
namespace: kube-flannel
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni-plugin
#image: flannelcni/flannel-cni-plugin:v1.1.0 for ppc64le and mips64le (dockerhub limitations may apply)
image: docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
command:
- cp
args:
- -f
- /flannel
- /opt/cni/bin/flannel
volumeMounts:
- name: cni-plugin
mountPath: /opt/cni/bin
- name: install-cni
#image: flannelcni/flannel:v0.19.2 for ppc64le and mips64le (dockerhub limitations may apply)
image: docker.io/rancher/mirrored-flannelcni-flannel:v0.19.2
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
#image: flannelcni/flannel:v0.19.2 for ppc64le and mips64le (dockerhub limitations may apply)
image: docker.io/rancher/mirrored-flannelcni-flannel:v0.19.2
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: EVENT_QUEUE_DEPTH
value: "5000"
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
- name: xtables-lock
mountPath: /run/xtables.lock
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni-plugin
hostPath:
path: /opt/cni/bin
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
[root@master ~]# kubectl create -f kube-flannel.yaml
```
#### 5.将node加入工作节点
```shell
[root@node-1 ~]# kubeadm join 10.0.0.220:6443 --token mzrm3c.u9mpt80rddmjvd3g --discovery-token-ca-cert-hash sha256:fec53dfeacc5187d3f0e3998d65bd3e303fa64acd5156192240728567659bf4a
注意:
这里使用的是master初始化产生的token
这里的token时间长了会改变需要使用命令获取见下期内容
没有记录集群 join 命令的可以通过以下方式重新获取:
kubeadm token create --print-join-command --ttl=0
```
#### 6.master节点查看集群状态
```shell
[root@master ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready control-plane 7m v1.25.0
node-1 Ready <none> 2m7s v1.25.0
node-2 Ready <none> 50s v1.25.0
node-3 Ready <none> 110s v1.25.0
```
## 三部署Dashboard
#### 1.kube-proxy 开启 ipvs
```shell
[root@master ~]# kubectl get configmap kube-proxy -n kube-system -o yaml > kube-proxy-configmap.yaml
[root@master ~]# sed -i 's/mode: ""/mode: "ipvs"/' kube-proxy-configmap.yaml
[root@master ~]# kubectl apply -f kube-proxy-configmap.yaml
[root@master ~]# rm -f kube-proxy-configmap.yaml
[root@master ~]# kubectl get pod -n kube-system | grep kube-proxy | awk '{system("kubectl delete pod "$1" -n kube-system")}'
```
#### 2.Dashboard安装脚本
```shell
[root@master ~]# cat dashboard.yaml
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: v1
kind: Namespace
metadata:
name: kubernetes-dashboard
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
type: NodePort
ports:
- port: 443
targetPort: 8443
selector:
k8s-app: kubernetes-dashboard
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-certs
namespace: kubernetes-dashboard
type: Opaque
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-csrf
namespace: kubernetes-dashboard
type: Opaque
data:
csrf: ""
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-key-holder
namespace: kubernetes-dashboard
type: Opaque
---
kind: ConfigMap
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-settings
namespace: kubernetes-dashboard
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
rules:
# Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
resources: ["secrets"]
resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
verbs: ["get", "update", "delete"]
# Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kubernetes-dashboard-settings"]
verbs: ["get", "update"]
# Allow Dashboard to get metrics.
- apiGroups: [""]
resources: ["services"]
resourceNames: ["heapster", "dashboard-metrics-scraper"]
verbs: ["proxy"]
- apiGroups: [""]
resources: ["services/proxy"]
resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
verbs: ["get"]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
rules:
# Allow Metrics Scraper to get metrics from the Metrics server
- apiGroups: ["metrics.k8s.io"]
resources: ["pods", "nodes"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: kubernetes-dashboard
image: kubernetesui/dashboard:v2.6.1
imagePullPolicy: Always
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
- --namespace=kubernetes-dashboard
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 1001
runAsGroup: 2001
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
nodeSelector:
"kubernetes.io/os": linux
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
---
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: dashboard-metrics-scraper
name: dashboard-metrics-scraper
namespace: kubernetes-dashboard
spec:
ports:
- port: 8000
targetPort: 8000
selector:
k8s-app: dashboard-metrics-scraper
---
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: dashboard-metrics-scraper
name: dashboard-metrics-scraper
namespace: kubernetes-dashboard
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: dashboard-metrics-scraper
template:
metadata:
labels:
k8s-app: dashboard-metrics-scraper
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: dashboard-metrics-scraper
image: kubernetesui/metrics-scraper:v1.0.8
ports:
- containerPort: 8000
protocol: TCP
livenessProbe:
httpGet:
scheme: HTTP
path: /
port: 8000
initialDelaySeconds: 30
timeoutSeconds: 30
volumeMounts:
- mountPath: /tmp
name: tmp-volume
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 1001
runAsGroup: 2001
serviceAccountName: kubernetes-dashboard
nodeSelector:
"kubernetes.io/os": linux
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
volumes:
- name: tmp-volume
emptyDir: {}
```
#### 3.创建证书
```shell
[root@k8s-master ~]# mkdir dashboard-certs
[root@k8s-master ~]# cd dashboard-certs/
#创建命名空间
[root@k8s-master ~]# kubectl create namespace kubernetes-dashboard
# 创建私钥key文件
[root@k8s-master ~]# openssl genrsa -out dashboard.key 2048
#证书请求
[root@k8s-master ~]# openssl req -days 36000 -new -out dashboard.csr -key dashboard.key -subj '/CN=dashboard-cert'
#自签证书
[root@k8s-master ~]# openssl x509 -req -in dashboard.csr -signkey dashboard.key -out dashboard.crt
#创建kubernetes-dashboard-certs对象
[root@k8s-master ~]# kubectl create secret generic kubernetes-dashboard-certs --from-file=dashboard.key --from-file=dashboard.crt -n kubernetes-dashboard
```
#### 4.创建管理员
```shell
创建账户
[root@k8s-master ~]# vim dashboard-admin.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: kubernetes-dashboard
name: dashboard-admin
namespace: kubernetes-dashboard
#保存退出后执行
[root@k8s-master ~]# kubectl create -f dashboard-admin.yaml
为用户分配权限
[root@k8s-master ~]# vim dashboard-admin-bind-cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: dashboard-admin-bind-cluster-role
labels:
k8s-app: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: dashboard-admin
namespace: kubernetes-dashboard
#保存退出后执行
[root@k8s-master ~]# kubectl create -f dashboard-admin-bind-cluster-role.yaml
```
#### 5.安装 Dashboard
```shell
[root@master dashboard-certs]# kubectl create -f dashboard.yaml
```
#### 6.获取token
```shell
[root@master dashboard-certs]# kubectl -n kubernetes-dashboard create token dashboard-admin
eyJhbGciOiJSUzI1NiIsImtpZCI6InlBck13aTFMR2daR3htTmxSdG5XbGJjOVFLWmdMZlgzRU10TmJWRFNEMk0ifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjYyMjE0MTA3LCJpYXQiOjE2NjIyMTA1MDcsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJkYXNoYm9hcmQtYWRtaW4iLCJ1aWQiOiIwOTRhYWI2NC05NTkyLTRjYTctOWI3MS0yNDEwMmI5ODA1YjcifX0sIm5iZiI6MTY2MjIxMDUwNywic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmVybmV0ZXMtZGFzaGJvYXJkOmRhc2hib2FyZC1hZG1pbiJ9.MQfc83l08PsAqmBHRzwgE_TsZGct0Sul1lM7Ks0ssf29DtXt22u9hHisvaLmQ64sNsvb_D7r47kDDTxZPbIJP_A2mBuHFqy_dnryUmqlTj7KFJm4PdObwiMTlnBch-v7HqxJKLuA6XXLxtpNrbLWqqG47Bc2kvvcF4BzSkiDhe-s5L0PS-WY753QjV0C9v63G8KJDxkQEGVC4PjqfXSclLi_jvIe4n3UqhUNHPl85JWgBhJHTTAei3Ztp7IMweztR_P30p6BiXEF0Kmcv8Nb7Xsk2dx5avYyiTRZTpq4pBkvAMKlCbXyKufh78mil_oNdaA8Q_AeFWFwgDx9UrGoFA
```
#### 7.创建kubeconfig
```shell
获取certificate-authority-data
[root@master ~]# CURRENT_CONTEXT=$(kubectl config current-context)
[root@master ~]# CURRENT_CLUSTER=$(kubectl config view --raw -o=go-template='{{range .contexts}}{{if eq .name "'''${CURRENT_CONTEXT}'''"}}{{ index .context "cluster" }}{{end}}{{end}}')
[root@master ~]# kubectl config view --raw -o=go-template='{{range .clusters}}{{if eq .name "'''${CURRENT_CLUSTER}'''"}}"{{with index .cluster "certificate-authority-data" }}{{.}}{{end}}"{{ end }}{{ end }}'
"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1Ea3dNekEzTVRBeE0xb1hEVE15TURnek1UQTNNVEF4TTFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTVNZCnRxdjVwdlVaQVN3VHl0ODkrWGphM1ZuZFZnaWVCbFc1bEZVc2dzSklxa0tSNlV2cjVYcXEvWjNOaUVpUlBqT28KWkh4a1V5SWpqdUFTUXZuYzhrTXhvNjNQY3d2UUNEYzd3V1pQeVBxMDVobFZhUlhYK0hHNjlaRXozYkQrUmlObgpyTU5uSVZqeEI0ck56SWs0cGFUNjBZMU5hdWx0V01NbEFyMFM3ZC9YQ3ZMeVhBK0NCNVFmZ2xSQTFJZnJ3ZjNJCno3YS9iQ0M4Qk9Fak94QmllRCtra3JYWGJtdXlMUHpTZkdKUGNUajI1eGdjK2RvNDJZKzZ4UUVCK0ZTSnN6VWIKVzhyMkx5TkI2YjNaZlBZcjNIMXQ4RkxkeUxtTU9nR1M2RkpPMmpQWVVWR0RObURLUHlPZWJMVit4UXlvMW4rMQpYK0F5NzJ1b2JlbklESE54czhzQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZMVXJSUnZXN2VrRlJhajRFRDFuWmVONXJzNVRNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQ2lucC9OakRCOGhLbFhuMzJCTQorV0hLcTRTOHNCTFBFeFhJSzloUXdWNmVodWgyOEQzSEltOFlFUCtJazEybDUwNi90QlNpYllOYjV1dHYyVnBmCmltUEh2aFAvUzZmTFE4MXVIL2JaQytaMlJ3b3VvLzU4TkJoZVRhR2ozK2VXTzNnMDllaEhaaHVFajE4WWVtaDYKU0xhUU9SZWE2dEpHVjNlVURWUk44Tnc1aXp2T3AxZ2poVHdsNzJTL3JycmxoREo2dGM4VDlPaUFhOWNkeVlPWApLNVZNdlEwRWw3aDlLT3lmN3FsSWgxL0dhSWdqVUl4Z3FNQ1lIallKc0Jvd2g2eDB5d0tSYllJQkl4d1M0NDNECnRmYU5wOTBpUFVGS0Q3c2IvTWxGeDZpK3l3UjVnQUd3NWJWSEdIZTMrY0szNzlRd1R2NS8zdWlYQTlBUnhiVloKS0hNPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg=="
[root@master ~]# cat config
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeE1EVXhNakV3TXpReU5Gb1hEVE14TURVeE1ERXdNelF5TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTFd5CmFZLy85Nlc5R2hxWWVBVER5cXhQV2tmVE1mSUxNMFkydy9RSm9SRDJWZmw1cjFNR25mRWNteG81bXc4Z1NXMmEKdVNmeml6dU41eDRMblBBYlVvbHdubkxMeWx3cENmTkRKMFRUVFFyOThuTmphWWQ0d2RmK09uZmtZQ1VaVG1NTwpYWDZBMEZJblFHTEpWQUdOb0xUUnVIR3F6dU0yNUd1Rkp5aXBoNlhRN0tHcmJFVFo0RXQyVWg3azV2UGxPVDQ3CkNEQjlDVkMra1c5MkdIRmNrTkViUU9kTUpPTkZXalh1K1lsSjlZdzNybzhJYU05QVR5SDFwNmNzaXNoejhybVQKUGZEUkl4cXpsNTVzenJNUHV5Y0JqZkVEZXFhcjQ2OXJyMFFGcWJ5NTdaaEtBcGMybTA0eTZ6ZllUQlB4cDhndwpXZ01ONktkcjU4bEFpOERwN3kwQ0F3RUFBYU5DTUVBd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZERTBCTVRreE5BYlpmMGZ6S2JwRm5ZUU94aXpNQTBHQ1NxR1NJYjMKRFFFQkN3VUFBNElCQVFCdE1yU281K2JlcEF2UGpwOUJ4TDlncnFtTnpOVEczRWFHZnZKUndoUE5qVXZ1bDZFZQpNenlIM3o3SERzYW5kSHJMU2xYMGZXZ2Y3TFRBSVRLV3duTDI3NjVzaTJ4L0prb2pCV0VySytGTEs1dGZXbmdmCnZ4cE13eE83RVNYd0FwVDdKWk9iVVA0eStPc3k2VzhQQXRjeHFGUHdyaVNhM29KYnZwZEFvcHloZXdoNUxNcUwKWkpobU4wK1RhTnlOaVJXaEkwcnZOSGYrNEdtcjhMd2lzMXZPdGZ4b2FtcGhPWE8wS0NheEJ0MWoxcDQxK2pNTwo0UUhlRWtFTkJndERzMUtuMTRhbFR6NWI1cFg0amwrU2tOOFk0ZlppUk9SRXJ4cWVmVkJEZ1Z0aWtQbEp5VFkyCmRQMkJOSE82R1FjalVvZmZBTmwwK1ZkblJTWGEvNVRsZU1PUwotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://10.0.2.150:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: dashboard-admin
name: dashboard-admin@kubernetes
current-context: dashboard-admin@kubernetes
kind: Config
preferences: {}
users:
- name: dashboard-admin
user:
token: eyJhbGciOiJSUzI1NiIsImtpZCI6InlBck13aTFMR2daR3htTmxSdG5XbGJjOVFLWmdMZlgzRU10TmJWRFNEMk0ifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjYyMjE0MjcwLCJpYXQiOjE2NjIyMTA2NzAsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJkYXNoYm9hcmQtYWRtaW4iLCJ1aWQiOiIwOTRhYWI2NC05NTkyLTRjYTctOWI3MS0yNDEwMmI5ODA1YjcifX0sIm5iZiI6MTY2MjIxMDY3MCwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmVybmV0ZXMtZGFzaGJvYXJkOmRhc2hib2FyZC1hZG1pbiJ9.bHK9jgpCAQIAxwurh05zzndo22hWEzoDnfFRS3VDWAfoD0YOsTF6RbHFSshn0Vm-Xv1sEIgmVkjgftP2Pq_saMs-WdgHfTLz2CjxWpkYV4WQcMs4WJq9Lx5SQeNxw9mEh8c085nnx368GWkENHSsldKP-O6YliWQAP8qpOiUWrJqhteVQi0GD7EYmOPlnKFZF2YKaROYFvn9P8JiCL8rRTZ5GUYIty9LRLkh3daFXj67krk4v3pNLqdHcKKwkv8vFN4hl6RbgA3nY
```

View File

@ -0,0 +1,684 @@
<h1><center>基于kubernetes部署Prometheus和Grafana</center></h1>
------
## 一:环境准备
#### 1.kubernetes集群正常
```shell
[root@master ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 36d v1.23.1
node-1 Ready <none> 36d v1.23.1
node-2 Ready <none> 36d v1.23.1
node-3 Ready <none> 36d v1.23.1
```
#### 2.harbor仓库正常
![image-20220602010601512](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602010601512.png)
## 二Prometheus部署
#### 1.node-exporter部署
node-exporter可以采集机器物理机、虚拟机、云主机等的监控指标数据能够采集到的指标包括CPU, 内存,磁盘,网络,文件数等信息
创建监控namespace
```shell
[root@master ~]# kubectl create ns monitor-sa
```
创建node-export.yaml
```shell
[root@master ~]# vim node-export.yaml
apiVersion: apps/v1
kind: DaemonSet # 可以保证k8s集群的每个节点都运行完全一样的pod
metadata:
name: node-exporter
namespace: monitor-sa
labels:
name: node-exporter
spec:
selector:
matchLabels:
name: node-exporter
template:
metadata:
labels:
name: node-exporter
spec:
hostPID: true
hostIPC: true
hostNetwork: true
containers:
- name: node-exporter
image: prom/node-exporter:v0.16.0
#image: 10.0.0.230/xingdian/node-exporter:v0.16.0
ports:
- containerPort: 9100
resources:
requests:
cpu: 0.15 # 这个容器运行至少需要0.15核cpu
securityContext:
privileged: true # 开启特权模式
args:
- --path.procfs
- /host/proc
- --path.sysfs
- /host/sys
- --collector.filesystem.ignored-mount-points
- '"^/(sys|proc|dev|host|etc)($|/)"'
volumeMounts:
- name: dev
mountPath: /host/dev
- name: proc
mountPath: /host/proc
- name: sys
mountPath: /host/sys
- name: rootfs
mountPath: /rootfs
tolerations:
- key: "node-role.kubernetes.io/master"
operator: "Exists"
effect: "NoSchedule"
volumes:
- name: proc
hostPath:
path: /proc
- name: dev
hostPath:
path: /dev
- name: sys
hostPath:
path: /sys
- name: rootfs
hostPath:
path: /
```
注意:
hostNetwork、hostIPC、hostPID都为True时表示这个Pod里的所有容器会直接使用宿主机的网络直接与宿主机进行IPC进程间通信通信可以看到宿主机里正在运行的所有进程。加入了hostNetwork:true会直接将我们的宿主机的9100端口映射出来从而不需要创建service 在我们的宿主机上就会有一个9100的端口
创建:
```shell
[root@master ~]# kubectl apply -f node-export.yaml
```
查看node-exporter是否部署成功
```shell
[root@master ~]# kubectl get pods -n monitor-sa -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
node-exporter-2cbrg 1/1 Running 0 34m 10.0.0.220 master <none> <none>
node-exporter-7rrbh 1/1 Running 0 34m 10.0.0.222 node-2 <none> <none>
node-exporter-96v29 1/1 Running 0 34m 10.0.0.221 node-1 <none> <none>
node-exporter-bf2j8 1/1 Running 0 34m 10.0.0.223 node-3 <none> <none>
```
注意:
node-export默认的监听端口是9100可以看到当前主机获取到的所有监控数据
```shell
[root@master ~]# curl http://10.0.0.220:9100/metrics | grep node_cpu_seconds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 8398.49
node_cpu_seconds_total{cpu="0",mode="iowait"} 1.54
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 0
node_cpu_seconds_total{cpu="0",mode="softirq"} 17.2
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 70.61
node_cpu_seconds_total{cpu="0",mode="user"} 187.04
node_cpu_seconds_total{cpu="1",mode="idle"} 8403.82
node_cpu_seconds_total{cpu="1",mode="iowait"} 4.95
node_cpu_seconds_total{cpu="1",mode="irq"} 0
node_cpu_seconds_total{cpu="1",mode="nice"} 0
node_cpu_seconds_total{cpu="1",mode="softirq"} 16.75
node_cpu_seconds_total{cpu="1",mode="steal"} 0
node_cpu_seconds_total{cpu="1",mode="system"} 71.26
node_cpu_seconds_total{cpu="1",mode="user"} 190.27
100 74016 100 74016 0 0 5878k 0 --:--:-- --:--:-- --:--:-- 6023k
[root@master ~]# curl http://10.0.0.220:9100/metrics | grep node_load
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0# HELP node_loa
1 1m load average.
# TYPE node_load1 gauge
node_load1 0.2
# HELP node_load15 15m load average.
# TYPE node_load15 gauge
node_load15 0.22
# HELP node_load5 5m load average.
# TYPE node_load5 gauge
node_load5 0.2
100 74044 100 74044 0 0 8604k 0 --:--:-- --:--:-- --:--:-- 9038k
```
#### 2.Prometheus安装
创建sa账号对sa做rbac授权
```shell
# 创建一个sa账号monitor
[root@master ~]# kubectl create serviceaccount monitor -n monitor-sa
# 把sa账号monitor通过clusterrolebing绑定到clusterrole上
[root@master ~]# kubectl create clusterrolebinding monitor-clusterrolebinding -n monitor-sa --clusterrole=cluster-admin --serviceaccount=monitor-sa:monitor
```
创建prometheus数据存储目录
```shell
# 将prometheus调度到node-1节点
[root@node-1 ~]# mkdir /data && chmod 777 /data
```
创建一个configmap存储卷用来存放prometheus配置信息
```shell
[root@master ~]# vim prometheus-cfg.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
labels:
app: prometheus
name: prometheus-config
namespace: monitor-sa
data:
prometheus.yml: |
global:
scrape_interval: 15s
scrape_timeout: 10s
evaluation_interval: 1m
scrape_configs:
- job_name: 'kubernetes-node'
kubernetes_sd_configs:
- role: node
relabel_configs:
- source_labels: [__address__]
regex: '(.*):10250'
replacement: '${1}:9100'
target_label: __address__
action: replace
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- job_name: 'kubernetes-node-cadvisor'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
- job_name: 'kubernetes-apiserver'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https
- job_name: 'kubernetes-service-endpoints'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name
```
创建:
```shell
[root@master ~]# kubectl apply -f prometheus-cfg.yaml
configmap/prometheus-config created
```
配置详解:
```shell
---
kind: ConfigMap
apiVersion: v1
metadata:
labels:
app: prometheus
name: prometheus-config
namespace: monitor-sa
data:
prometheus.yml: |
global:
scrape_interval: 15s #采集目标主机监控据的时间间隔
scrape_timeout: 10s # 数据采集超时时间默认10s
evaluation_interval: 1m #触发告警检测的时间默认是1m
scrape_configs: # 配置数据源称为target每个target用job_name命名。又分为静态配置和服务发现
- job_name: 'kubernetes-node'
kubernetes_sd_configs: # 使用的是k8s的服务发现
- role: node # 使用node角色它使用默认的kubelet提供的http端口来发现集群中每个node节点
relabel_configs: # 重新标记
- source_labels: [__address__] # 配置的原始标签,匹配地址
regex: '(.*):10250' #匹配带有10250端口的url
replacement: '${1}:9100' #把匹配到的ip:10250的ip保留
target_label: __address__ #新生成的url是${1}获取到的ip:9100
action: replace # 动作替换
- action: labelmap
regex: __meta_kubernetes_node_label_(.+) #匹配到下面正则表达式的标签会被保留,如果不做regex正则的话默认只是会显示instance标签
- job_name: 'kubernetes-node-cadvisor' # 抓取cAdvisor数据是获取kubelet上/metrics/cadvisor接口数据来获取容器的资源使用情况
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap # 把匹配到的标签保留
regex: __meta_kubernetes_node_label_(.+) #保留匹配到的具有__meta_kubernetes_node_label的标签
- target_label: __address__ # 获取到的地址__address__="192.168.40.180:10250"
replacement: kubernetes.default.svc:443 # 把获取到的地址替换成新的地址kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+) # 把原始标签中__meta_kubernetes_node_name值匹配到
target_label: __metrics_path__ #获取__metrics_path__对应的值
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
# 把metrics替换成新的值api/v1/nodes/k8s-master1/proxy/metrics/cadvisor
# ${1}是__meta_kubernetes_node_name获取到的值
# 新的url就是https://kubernetes.default.svc:443/api/v1/nodes/k8s-master1/proxy/metrics/cadvisor
- job_name: 'kubernetes-apiserver'
kubernetes_sd_configs:
- role: endpoints # 使用k8s中的endpoint服务发现采集apiserver 6443端口获取到的数据
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
# endpoint这个对象的名称空间,endpoint对象的服务名,exnpoint的端口名称
action: keep # 采集满足条件的实例,其他实例不采集
regex: default;kubernetes;https #正则匹配到的默认空间下的service名字是kubernetes协议是https的endpoint类型保留下来
- job_name: 'kubernetes-service-endpoints'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
# 重新打标仅抓取到的具有 "prometheus.io/scrape: true" 的annotation的端点意思是说如果某个service具有prometheus.io/scrape = true annotation声明则抓取annotation本身也是键值结构所以这里的源标签设置为键而regex设置值true当值匹配到regex设定的内容时则执行keep动作也就是保留其余则丢弃。
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
# 重新设置scheme匹配源标签__meta_kubernetes_service_annotation_prometheus_io_scheme也就是prometheus.io/scheme annotation如果源标签的值匹配到regex则把值替换为__scheme__对应的值。
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
# 应用中自定义暴露的指标也许你暴露的API接口不是/metrics这个路径那么你可以在这个POD对应的service中做一个"prometheus.io/path = /mymetrics" 声明上面的意思就是把你声明的这个路径赋值给__metrics_path__其实就是让prometheus来获取自定义应用暴露的metrices的具体路径不过这里写的要和service中做好约定如果service中这样写 prometheus.io/app-metrics-path: '/metrics' 那么你这里就要__meta_kubernetes_service_annotation_prometheus_io_app_metrics_path这样写。
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
# 暴露自定义的应用的端口就是把地址和你在service中定义的 "prometheus.io/port = <port>" 声明做一个拼接然后赋值给__address__这样prometheus就能获取自定义应用的端口然后通过这个端口再结合__metrics_path__来获取指标如果__metrics_path__值不是默认的/metrics那么就要使用上面的标签替换来获取真正暴露的具体路径。
- action: labelmap #保留下面匹配到的标签
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace # 替换__meta_kubernetes_namespace变成kubernetes_namespace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name
```
通过deployment部署prometheus
```shell
[root@master ~]# cat prometheus-deploy.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: prometheus-server
namespace: monitor-sa
labels:
app: prometheus
spec:
replicas: 1
selector:
matchLabels:
app: prometheus
component: server
#matchExpressions:
#- {key: app, operator: In, values: [prometheus]}
#- {key: component, operator: In, values: [server]}
template:
metadata:
labels:
app: prometheus
component: server
annotations:
prometheus.io/scrape: 'false'
spec:
nodeName: node-1 # 指定pod调度到哪个节点上
serviceAccountName: monitor
containers:
- name: prometheus
image: prom/prometheus:v2.2.1
#image: 10.0.0.230/xingdian/prometheus:v2.2.1
imagePullPolicy: IfNotPresent
command:
- prometheus
- --config.file=/etc/prometheus/prometheus.yml
- --storage.tsdb.path=/prometheus # 数据存储目录
- --storage.tsdb.retention=720h # 数据保存时长
- --web.enable-lifecycle # 开启热加载
ports:
- containerPort: 9090
protocol: TCP
volumeMounts:
- mountPath: /etc/prometheus/prometheus.yml
name: prometheus-config
subPath: prometheus.yml
- mountPath: /prometheus/
name: prometheus-storage-volume
volumes:
- name: prometheus-config
configMap:
name: prometheus-config
items:
- key: prometheus.yml
path: prometheus.yml
mode: 0644
- name: prometheus-storage-volume
hostPath:
path: /data
type: Directory
```
创建:
```shell
[root@master ~]# kubectl apply -f prometheus-deploy.yaml
deployment.apps/prometheus-server created
```
查看:
```shell
[root@master ~]# kubectl get pods -o wide -n monitor-sa
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
prometheus-server-59cb5d648-bxwrb 1/1 Running 0 14m 10.244.2.100 node-1 <none> <none>
```
#### 3.prometheus pod创建service
```shell
[root@master ~]# cat prometheus-svc.yaml
apiVersion: v1
kind: Service
metadata:
name: prometheus
namespace: monitor-sa
labels:
app: prometheus
spec:
type: NodePort
ports:
- port: 9090
targetPort: 9090
protocol: TCP
selector:
app: prometheus
component: server
```
创建:
```shell
[root@master ~]# kubectl apply -f prometheus-svc.yaml
service/prometheus created
```
查看service在物理机映射的端口
```shell
[root@master ~]# kubectl get svc -n monitor-sa
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
prometheus NodePort 10.106.61.80 <none> 9090:32169/TCP 32m
```
#### 4.web界面查看
![image-20220602011956600](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602011956600.png)
![image-20220602012012382](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602012012382.png)
#### 5.Prometheus热加载
```shell
# 为了每次修改配置文件可以热加载prometheus也就是不停止prometheus就可以使配置生效想要使配置生效可用如下热加载命令
[root@master ~]# kubectl get pods -n monitor-sa -o wide -l app=prometheus
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
prometheus-server-689fb8cdbc-kcsw2 1/1 Running 0 5m39s 10.244.36.70 k8s-node1 <none> <none>
# 想要使配置生效可用如下命令热加载:
[root@master ~]# curl -X POST http://10.244.36.70:9090/-/reload
# 查看log
[root@master ~]# kubectl logs -n monitor-sa prometheus-server-689fb8cdbc-kcsw2
```
注意:
```shell
# 热加载速度比较慢可以暴力重启prometheus如修改上面的prometheus-cfg.yaml文件之后可执行如下强制删除
[root@master ~]# kubectl delete -f prometheus-cfg.yaml
[root@master ~]# kubectl delete -f prometheus-deploy.yaml
# 然后再通过apply更新
[root@master ~]# kubectl apply -f prometheus-cfg.yaml
[root@master ~]# kubectl apply -f prometheus-deploy.yaml
#注意:线上最好热加载,暴力删除可能造成监控数据的丢失
```
## 三Grafana的部署
#### 1.Grafana介绍
Grafana是一个跨平台的开源的度量分析和可视化工具可以将采集的数据可视化的展示并及时通知给告警接收方
它主要有以下六大特点:
1展示方式快速灵活的客户端图表面板插件有许多不同方式的可视化指标和日志官方库中具有丰富的仪表盘插件比如热图、折线图、图表等多种展示方式
2数据源GraphiteInfluxDBOpenTSDBPrometheusElasticsearchCloudWatch和KairosDB等
3通知提醒以可视方式定义最重要指标的警报规则Grafana将不断计算并发送通知在数据达到阈值时通过Slack、PagerDuty等获得通知
4混合展示在同一图表中混合使用不同的数据源可以基于每个查询指定数据源甚至自定义数据源
5注释使用来自不同数据源的丰富事件注释图表将鼠标悬停在事件上会显示完整的事件元数据和标记
#### 2.Grafana安装
```shell
[root@master prome]# cat grafana.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: monitoring-grafana
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
task: monitoring
k8s-app: grafana
template:
metadata:
labels:
task: monitoring
k8s-app: grafana
spec:
containers:
- name: grafana
image: 10.0.0.230/xingdian/heapster-grafana-amd64:v5.0.4
#heleicool/heapster-grafana-amd64:v5.0.4
ports:
- containerPort: 3000
protocol: TCP
volumeMounts:
- mountPath: /etc/ssl/certs
name: ca-certificates
readOnly: true
- mountPath: /var
name: grafana-storage
env:
- name: INFLUXDB_HOST
value: monitoring-influxdb
- name: GF_SERVER_HTTP_PORT
value: "3000"
# The following env variables are required to make Grafana accessible via
# the kubernetes api-server proxy. On production clusters, we recommend
# removing these env variables, setup auth for grafana, and expose the grafana
# service using a LoadBalancer or a public IP.
- name: GF_AUTH_BASIC_ENABLED
value: "false"
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE
value: Admin
- name: GF_SERVER_ROOT_URL
# If you're only using the API Server proxy, set this value instead:
# value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
value: /
volumes:
- name: ca-certificates
hostPath:
path: /etc/ssl/certs
- name: grafana-storage
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
labels:
# For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
# If you are NOT using this as an addon, you should comment out this line.
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: monitoring-grafana
name: monitoring-grafana
namespace: kube-system
spec:
# In a production setup, we recommend accessing Grafana through an external Loadbalancer
# or through a public IP.
# type: LoadBalancer
# You could also use NodePort to expose the service at a randomly-generated port
# type: NodePort
ports:
- port: 80
targetPort: 3000
selector:
k8s-app: grafana
type: NodePort
```
创建:
```shell
[root@master prome]# kubectl apply -f grafana.yaml
deployment.apps/monitoring-grafana created
service/monitoring-grafana created
```
查看:
```shell
[root@master prome]# kubectl get pods -n kube-system -l task=monitoring -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
monitoring-grafana-7c5c6c7486-rbt62 1/1 Running 0 9s 10.244.1.83 node-3 <none> <none>
```
```shell
[root@master prome]# kubectl get svc -n kube-system | grep grafana
monitoring-grafana NodePort 10.101.77.194 <none> 80:30919/TCP 76s
```
## 四配置Grafana
浏览器访问:
![image-20220602013222284](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602013222284.png)
添加数据源:
![image-20220602013322234](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602013322234.png)
指定Prometheus地址
![image-20220602013441712](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602013441712.png)
导入监控模板:
![image-20220602013943317](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602013943317.png)
![image-20220602014027197](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602014027197.png)
注意:
官方下载监控模板https://grafana.com/dashboards?dataSource=prometheus&search=kubernetes
![image-20220602014152927](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602014152927.png)
![image-20220602014212551](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602014212551.png)
展示:
![image-20220602014306247](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602014306247.png)
![image-20220602014321106](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602014321106.png)
![image-20220602014337431](%E5%9F%BA%E4%BA%8Ekubernetes%E9%83%A8%E7%BD%B2Prometheus%E5%92%8CGrafana.assets/image-20220602014337431.png)