一、服务器集群架构规划
1.1 网络规划
网段 | 192.168.21.0/24
网关 | 192.168.21.2
可用IP地址 | 192.168.21.3-192.168.21.254
1.2 服务器规划
角色 | IP | 安装组件
VIP | 192.168.21.200 | HAProxy、Keepalived
k8s-master01、k8s-master02、k8s-master03 | 192.168.21.201、192.168.21.202、192.168.21.203 | etcd、containerd、kube-apiserver、kube-controller-manager、kube-scheduler、kubectl、kubelet、kube-proxy、calico、coredns、dashboard
k8s-node01、k8s-node02 | 192.168.21.204、192.168.21.205 | containerd、kubelet、kube-proxy、calico、coredns
二、主机环境配置(5台都需要操作)
服务器操作系统: AnolisOS-8.8-x86_64
2.1 修改主机名和字符集
hostnamectl set-hostname k8s-master01 #设置192.168.21.201主机名为k8s-master01
hostnamectl set-hostname k8s-master02 #设置192.168.21.202主机名为k8s-master02
hostnamectl set-hostname k8s-master03 #设置192.168.21.203主机名为k8s-master03
hostnamectl set-hostname k8s-node01 #设置192.168.21.204主机名为k8s-node01
hostnamectl set-hostname k8s-node02 #设置192.168.21.205主机名为k8s-node02
locale #查看默认字符集
echo "export LANG=en_US.UTF-8" >> /etc/profile #设置字符集
cat /etc/profile | grep -i lang #查看字符集
export LANG=en_US.UTF-8
2.2 添加hosts解析
vi /etc/hosts
192.168.21.201 k8s-master01
192.168.21.202 k8s-master02
192.168.21.203 k8s-master03
192.168.21.204 k8s-node01
192.168.21.205 k8s-node02
:wq! #保存退出
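#hosts添加完成后,可以用下面的简单示例逐个ping主机名,确认5台机器的解析都正常(仅为检查示例)
for h in k8s-master01 k8s-master02 k8s-master03 k8s-node01 k8s-node02; do
ping -c 1 -W 1 $h > /dev/null && echo "$h 解析正常" || echo "$h 解析失败"
done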
2.3 保存yum下载的安装包
vi /etc/yum.conf #保存路径为/var/cache/yum/
keepcache=1 #添加这一行
:wq! #保存退出
2.4 防火墙设置
AnolisOS 8.x默认使用firewalld作为防火墙,这里改为iptables防火墙,并清空规则。
2.4.1关闭firewall:
systemctl stop firewalld.service #停止firewall
systemctl disable firewalld.service #禁止firewall开机启动
systemctl mask firewalld #屏蔽firewalld服务,防止被其它服务再次拉起
yum remove firewalld -y #卸载
2.4.2安装iptables防火墙
yum install iptables-services -y #安装
systemctl enable iptables.service #设置防火墙开机启动
iptables -F #清空规则
service iptables save #保存配置规则
systemctl restart iptables.service #启动防火墙使配置生效
cat /etc/sysconfig/iptables #查看防火墙配置文件
2.5 关闭selinux
sestatus #查看状态,显示disabled表示已经禁用
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config #禁用selinux
setenforce 0 #临时禁用
/usr/sbin/sestatus -v #查看selinux状态,disabled表示关闭
2.6 关闭交换分区
如果系统设置了swap交换分区,需要关闭
#显示 Swap 分区的详细信息,如果没有任何输出,表示当前系统没有配置 Swap 分区
swapon --show
swapoff -a #关闭
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab #修改配置,禁用 Swap 分区配置
free -m #查看Swap 分区信息
2.7 同步系统时间
把k8s-master01设置为时间服务器,让其他四台机器与它同步
也可以部署专门的时间服务器,让k8s集群内的所有机器与它同步
#设置服务器时区
rm -rf /etc/localtime #先删除默认的时区设置
ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime #替换上海/北京作为默认
vi /etc/sysconfig/clock #添加时区
Zone=Asia/Shanghai
:wq! #保存退出
yum install chrony -y #安装时间同步工具
systemctl start chronyd
systemctl enable chronyd
vi /etc/chrony.conf #编辑(在k8s-master01操作)
#pool ntp.aliyun.com iburst #注释掉
local stratum 10 #时间层级设置,配置为本地时间服务器
allow 192.168.21.0/24 #允许的ip段
:wq! #保存退出
vi /etc/chrony.conf #编辑(在另外4台上面操作)
#pool ntp.aliyun.com iburst #注释掉
server 192.168.21.201 iburst #添加此行,填上k8s-master01的ip地址
maxdistance 600.0
:wq! #保存退出
systemctl restart chronyd
chronyc sources #查看当前时间源,配置文件/etc/chrony.conf里面可以设置时间服务器地址
chronyc makestep #手动同步时间
chronyc tracking #检查 Chrony 的状态和时间同步情况
hwclock --systohc #把系统时间写入硬件时钟(与hwclock -w作用相同)
date #显示系统时间
#可以自己修改时间
timedatectl set-ntp false #先关闭NTP同步
timedatectl set-time "2024-10-06 15:15:15"
date #可以看到时间已经修改
timedatectl set-ntp true # 打开NTP同步,时间会从服务端自动同步
watch -n 1 date #显示实时时间
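#可以用下面的命令快速确认时间同步是否生效(检查示例,输出因环境而异)
chronyc sources -v #查看时间源及同步状态,^*表示当前正在同步的源
chronyc sourcestats #查看各时间源的统计信息
timedatectl status #查看系统时区和NTP同步开关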
2.8 升级系统内核
AnolisOS 8.8默认内核已经是5.10.134-13.an8.x86_64,不需要升级;安装k8s要求系统内核至少为3.10及以上
grubby --default-kernel
grub2-editenv list
uname -r
#如果需要升级内核可以参考
CentOS 升级系统内核到最新版
https://www.osyunwei.com/archives/11582.html
2.9 调整系统内核参数
2.9.1
#执行以下命令
modprobe br_netfilter
modprobe ip_vs
modprobe ip_conntrack
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack
cat >> /etc/rc.d/rc.local << EOF
modprobe br_netfilter
modprobe ip_vs
modprobe ip_conntrack
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack
EOF
chmod +x /etc/rc.d/rc.local
#高版本的内核nf_conntrack_ipv4被nf_conntrack替换了
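#执行完上面的modprobe后,可以用lsmod确认模块已经加载(检查示例)
lsmod | grep br_netfilter
lsmod | grep -E "^ip_vs|^nf_conntrack"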
2.9.2
vi /etc/security/limits.conf #在文件末尾添加以下内容,nofile的值要小于fs.nr_open的值
* soft nproc unlimited
* hard nproc unlimited
* soft nofile 1000000
* hard nofile 1000000
:wq! #保存退出
2.9.3
vi /etc/sysctl.conf #在最后一行添加以下代码
fs.file-max = 65535000
fs.nr_open = 65535000
kernel.pid_max= 4194303
vm.swappiness = 0
:wq! #保存退出
/sbin/sysctl -p
#查看hard和soft限制数
ulimit -Hn
ulimit -Sn
2.9.4
vi /etc/sysctl.d/kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1 #将桥接的IPv4流量传递到iptables
net.bridge.bridge-nf-call-ip6tables = 1 #将桥接的IPv6流量传递到iptables
net.bridge.bridge-nf-call-arptables=1
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1 #允许服务绑定一个本机不存在的IP地址,haproxy部署高可用使用vip时会用到
vm.swappiness = 0
vm.overcommit_memory = 1
vm.panic_on_oom = 0
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 1048576
fs.file-max = 52706963
fs.nr_open = 52706963
net.ipv6.conf.all.disable_ipv6 = 1
net.netfilter.nf_conntrack_max = 2310720
:wq! #保存退出
sysctl -p /etc/sysctl.d/kubernetes.conf
sysctl --system
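#可以用sysctl读取关键参数,确认上面的配置已经生效(检查示例)
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward net.ipv4.ip_nonlocal_bind vm.swappiness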
2.9.5开启IPVS支持
vi /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
ipvs_modules="ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed ip_vs_ftp nf_conntrack"
for kernel_module in ${ipvs_modules}; do
/sbin/modinfo -F filename ${kernel_module} > /dev/null 2>&1
if [ $? -eq 0 ]; then
/sbin/modprobe ${kernel_module}
fi
done
:wq! #保存退出
#执行以下命令使配置生效
chmod 755 /etc/sysconfig/modules/ipvs.modules
sh /etc/sysconfig/modules/ipvs.modules
lsmod | grep ip_vs
2.10 安装系统依赖包
yum install -y ipset ipvsadm
yum install -y openssl-devel openssl
yum install -y gcc gcc-c++
yum install -y telnet iproute iproute-tc jq bzip2 tar conntrack conntrack-tools sysstat curl iptables libseccomp lrzsz git unzip vim net-tools epel-release nfs-utils wget make yum-utils device-mapper-persistent-data lvm2
2.11 配置免密码登录
只在k8s-master01上操作即可
ssh-keygen #输入命令,按三次回车,会生成私钥和公钥
cd /root/.ssh #进入目录,会看到生成的私钥和公钥
#拷贝公钥
ssh-copy-id root@192.168.21.202 #输入192.168.21.202的root密码
ssh-copy-id root@192.168.21.203 #输入192.168.21.203的root密码
ssh-copy-id root@192.168.21.204 #输入192.168.21.204的root密码
ssh-copy-id root@192.168.21.205 #输入192.168.21.205的root密码
配置完成后,在k8s-master01节点直接执行ssh root@192.168.21.202(或203/204/205)即可免密登录对应的服务器
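#免密配置完成后,可以用一个简单的循环批量验证(示例,BatchMode=yes表示不允许交互输入密码,失败即报错)
for ip in 192.168.21.202 192.168.21.203 192.168.21.204 192.168.21.205; do
ssh -o BatchMode=yes root@$ip hostname
done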
三、安装containerd(5台都需要操作)
从 Kubernetes 1.24 开始,kubelet 移除了 dockershim,官方推荐使用 containerd 等实现了 CRI 接口的容器运行时,这里使用 containerd 作为 Kubernetes 的容器运行时
3.1 使用二进制方式安装containerd
下载安装包
wget https://github.com/containerd/containerd/releases/download/v1.7.23/cri-containerd-1.7.23-linux-amd64.tar.gz
解压到根目录
tar -C / -zxf cri-containerd-1.7.23-linux-amd64.tar.gz
#生成配置文件并修改
mkdir -p /etc/containerd #创建配置文件目录
containerd config default > /etc/containerd/config.toml
vi /etc/containerd/config.toml
disabled_plugins = [] #修改
sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.8" #修改为阿里云的地址
SystemdCgroup = true #修改
:wq! #保存退出
systemctl daemon-reload
systemctl enable containerd
systemctl start containerd
systemctl restart containerd
systemctl status containerd
#查看版本信息
ctr version
#拉取pause镜像
ctr -n k8s.io images pull --platform=amd64 registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.8
#查看默认命名空间的镜像
ctr -n k8s.io images ls
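#cri-containerd安装包中一般还带有crictl命令(如果没有可单独安装),下面是一个可选的配置示例,让crictl默认连接containerd的CRI接口,方便后面排查kubelet相关问题
cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
crictl info #查看CRI运行时信息
crictl images #查看CRI使用的k8s.io命名空间下的镜像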
3.2 安装runC
由于二进制包中自带的runC依赖系统中的libseccomp,不同版本的runC对libseccomp的版本要求不一样
建议单独下载官方发布的runC二进制文件进行替换,该二进制为静态编译,内置了seccomp支持
wget https://github.com/opencontainers/runc/releases/download/v1.1.15/runc.amd64
mv runc.amd64 /usr/sbin/runc
chmod +x /usr/sbin/runc
runc -v #查看版本
3.3 安装containerd的命令行工具nerdctl
nerdctl是一个与docker cli风格兼容的containerd的cli工具
nerdctl 官方发布包含两个安装版本:
最小(nerdctl-1.7.7-linux-amd64.tar.gz):仅限 nerdctl
完整(nerdctl-full-1.7.7-linux-amd64.tar.gz):包括 containerd、runc 和 CNI 等依赖项
我们已经安装了 containerd ,所以选择nerdctl-1.7.7-linux-amd64.tar.gz
wget https://github.com/containerd/nerdctl/releases/download/v1.7.7/nerdctl-1.7.7-linux-amd64.tar.gz
tar -zxf nerdctl-1.7.7-linux-amd64.tar.gz -C /usr/local/bin
nerdctl version #查看版本
#配置nerdctl 命令自动补全
yum install bash-completion -y
nerdctl completion bash > /etc/bash_completion.d/nerdctl
source /etc/bash_completion.d/nerdctl
#添加nerdctl别名为 docker
vi /usr/local/bin/docker
#!/bin/bash
/usr/local/bin/nerdctl "$@"
:wq! #保存退出
chmod +x /usr/local/bin/docker
#测试
nerdctl images
docker images
四、安装cfssl证书工具(在1台master节点操作)
cfssl是一个开源的证书管理工具,使用json文件生成证书,相比openssl更方便使用。
在需要生成证书的服务器上安装即可,这里安装在k8s-master01节点上。
需要下载3个文件
1、cfssl
https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssl_1.6.5_linux_amd64
2、cfssljson
https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssljson_1.6.5_linux_amd64
3、cfssl-certinfo
https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssl-certinfo_1.6.5_linux_amd64
#拷贝这3个文件到/usr/local/bin/目录下
cp cfssl_1.6.5_linux_amd64 /usr/local/bin/cfssl
cp cfssljson_1.6.5_linux_amd64 /usr/local/bin/cfssljson
cp cfssl-certinfo_1.6.5_linux_amd64 /usr/local/bin/cfssl-certinfo
#添加执行权限
chmod +x /usr/local/bin/cfssl
chmod +x /usr/local/bin/cfssljson
chmod +x /usr/local/bin/cfssl-certinfo
cfssl version #查看版本
五、部署Etcd集群(在3台master节点操作)
Etcd是一个分布式键值存储系统,Kubernetes使用Etcd进行数据存储,所以要先准备一个Etcd数据库,为解决Etcd单点故障,应采用集群方式部署,这里使用3台组建集群,可容忍1台机器故障。由于Etcd集群需要选举产生 leader,所以集群节点数目需要为奇数来保证正常进行选举。
说明:
使用5台组建集群,可容忍2台机器故障
使用7台组建集群,可容忍3台机器故障
使用9台组建集群,可容忍4台机器故障
etcd集群也可以与k8s节点机器复用,只要apiserver能连接到就行
这里使用master节点的3台服务器部署etcd集群
先在一台k8s-master01服务器上操作
5.1 生成Etcd证书
5.1.1自签etcd证书颁发机构(CA)
创建工作目录
mkdir -p /opt/tls/etcd
cd /opt/tls/etcd
创建ca配置文件
cat > etcdca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"etcd": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
创建ca证书签名请求文件
cat > etcdca-csr.json << EOF
{
"CN": "etcdca",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing"
}
]
}
EOF
生成证书(etcdca.pem和etcdca-key.pem)命令:
cfssl gencert -initca etcdca-csr.json | cfssljson -bare etcdca -
5.1.2使用自签CA签发Etcd HTTPS证书
创建证书申请文件:
cd /opt/tls/etcd
cat > etcd-csr.json << EOF
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"192.168.21.201",
"192.168.21.202",
"192.168.21.203",
"k8s-master01",
"k8s-master02",
"k8s-master03"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing"
}
]
}
EOF
注:上述文件hosts字段中ip为etcd集群服务器ip地址,一个都不能少,为了方便后期扩容可以多写几个规划的ip。
生成证书命令(etcd.pem和etcd-key.pem):
cfssl gencert -ca=etcdca.pem -ca-key=etcdca-key.pem -config=etcdca-config.json -profile=etcd etcd-csr.json | cfssljson -bare etcd
#证书文件已经生成了,在/opt/tls/etcd目录,后面会用到。
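#证书生成后,可以检查证书中的SAN列表是否包含了所有etcd节点的ip和主机名(检查示例)
openssl x509 -in /opt/tls/etcd/etcd.pem -noout -text | grep -A 1 "Subject Alternative Name"
cfssl-certinfo -cert /opt/tls/etcd/etcd.pem | grep -A 10 '"sans"'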
5.2 安装Etcd
使用二进制文件来安装,先在一台k8s-master01服务器上操作
#下载二进制软件包etcd-v3.5.16-linux-amd64.tar.gz
https://github.com/etcd-io/etcd/releases/download/v3.5.16/etcd-v3.5.16-linux-amd64.tar.gz
#创建工作目录并解压二进制包
mkdir /opt/etcd/{bin,cfg,ssl,data,check} -p
tar zxvf etcd-v3.5.16-linux-amd64.tar.gz
mv etcd-v3.5.16-linux-amd64/etcd* /opt/etcd/bin/
#添加执行权限
chmod +x /opt/etcd/bin/*
vi /etc/profile #把etcd服务加入系统环境变量,在最后添加下面这一行
export PATH=$PATH:/opt/etcd/bin/
:wq! #保存退出
source /etc/profile #使配置立即生效
#查看版本
etcd --version
5.3 创建Etcd配置文件
cat > /opt/etcd/cfg/etcd.yaml << EOF
# [Member]
name: "k8s-master01"
data-dir: "/opt/etcd/data/"
wal-dir: "/opt/etcd/data/"
listen-peer-urls: "https://192.168.21.201:2380"
listen-client-urls: "https://192.168.21.201:2379,https://127.0.0.1:2379"
logger: "zap"
# [Clustering]
initial-advertise-peer-urls: "https://192.168.21.201:2380"
advertise-client-urls: "https://192.168.21.201:2379"
initial-cluster: "k8s-master01=https://192.168.21.201:2380,k8s-master02=https://192.168.21.202:2380,k8s-master03=https://192.168.21.203:2380"
initial-cluster-token: "etcd-cluster"
initial-cluster-state: "new"
# [Security]
client-transport-security:
  cert-file: "/opt/etcd/ssl/etcd.pem"
  key-file: "/opt/etcd/ssl/etcd-key.pem"
  client-cert-auth: true
  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"
  auto-tls: true
peer-transport-security:
  key-file: "/opt/etcd/ssl/etcd-key.pem"
  cert-file: "/opt/etcd/ssl/etcd.pem"
  client-cert-auth: true
  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"
  auto-tls: true
EOF
#特别注意yaml文件的格式缩进
5.4 设置systemd管理Etcd
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
User=root
Type=notify
WorkingDirectory=/opt/etcd/data/
Restart=always
#Restart=on-failure
RestartSec=10s
LimitNOFILE=65536
ExecStart=/opt/etcd/bin/etcd --config-file=/opt/etcd/cfg/etcd.yaml
[Install]
WantedBy=multi-user.target
EOF
5.5 拷贝Etcd证书文件
cp /opt/tls/etcd/etcdca.pem /opt/etcd/ssl/
cp /opt/tls/etcd/etcdca-key.pem /opt/etcd/ssl/
cp /opt/tls/etcd/etcd.pem /opt/etcd/ssl/
cp /opt/tls/etcd/etcd-key.pem /opt/etcd/ssl/
5.6 分发Etcd安装配置文件
在其中一台k8s-master01服务器上操作完成之后,需要把etcd安装配置文件分发到etcd集群内所有节点上。
当然也可以在集群内每一台服务器上重复上面的步骤进行安装。
scp -r /opt/etcd/ root@192.168.21.202:/opt/
scp -r /opt/etcd/ root@192.168.21.203:/opt/
scp /usr/lib/systemd/system/etcd.service root@192.168.21.202:/usr/lib/systemd/system/
scp /usr/lib/systemd/system/etcd.service root@192.168.21.203:/usr/lib/systemd/system/
#然后在两台服务器上分别修改etcd.yaml配置文件中的主机名和当前服务器ip地址
vi /opt/etcd/cfg/etcd.yaml
# [Member]
name: "k8s-master01" #修改为每个节点自己的主机名
data-dir: "/opt/etcd/data/"
wal-dir: "/opt/etcd/data/"
listen-peer-urls: "https://192.168.21.201:2380" #修改为每个节点自己的ip地址
#修改为每个节点自己的ip地址
listen-client-urls: "https://192.168.21.201:2379,https://127.0.0.1:2379"
logger: "zap"
# [Clustering]
initial-advertise-peer-urls: "https://192.168.21.201:2380" #修改为每个节点自己的ip地址
advertise-client-urls: "https://192.168.21.201:2379" #修改为每个节点自己的ip地址
#下面的参数三个节点都一样
initial-cluster: "k8s-master01=https://192.168.21.201:2380,k8s-master02=https://192.168.21.202:2380,k8s-master03=https://192.168.21.203:2380"
initial-cluster-token: "etcd-cluster"
initial-cluster-state: "new"
# [Security]
client-transport-security:
  cert-file: "/opt/etcd/ssl/etcd.pem"
  key-file: "/opt/etcd/ssl/etcd-key.pem"
  client-cert-auth: true
  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"
  auto-tls: true
peer-transport-security:
  key-file: "/opt/etcd/ssl/etcd-key.pem"
  cert-file: "/opt/etcd/ssl/etcd.pem"
  client-cert-auth: true
  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"
  auto-tls: true
:wq! #保存退出
#在两台服务器上操作,把etcd服务加入系统环境变量
vi /etc/profile #在最后添加下面这一行
export PATH=$PATH:/opt/etcd/bin/
:wq! #保存退出
source /etc/profile #使配置立即生效
5.7 启动Etcd并设置开机启动
同时启动三台服务器上的etcd
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
如果启动有问题,先用 journalctl -u etcd 和 systemctl status etcd 查看日志
然后根据日志提示再排查解决问题
5.8 查看集群状态
ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379" --write-out=table endpoint health
+-----------------------------+--------+-------------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+-----------------------------+--------+-------------+-------+
| https://192.168.21.203:2379 | true | 20.816075ms | |
| https://192.168.21.202:2379 | true | 22.193996ms | |
| https://192.168.21.201:2379 | true | 21.069051ms | |
+-----------------------------+--------+-------------+-------+
vi /opt/etcd/check/check_etcd.sh
#!/bin/bash
# 设置基本参数
ETCDCTL="/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --write-out=table --endpoints=https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"
# 检查是否设置环境变量
export ETCDCTL_API=3
# 根据输入参数执行不同的操作
case "$1" in
"health")
echo "Checking cluster endpoint health..."
$ETCDCTL endpoint health
;;
"status")
echo "Listing all endpoint statuses..."
$ETCDCTL endpoint status
;;
"list")
echo "Listing all cluster members..."
$ETCDCTL member list
;;
*)
echo "Usage: $0 {health|status|list}"
echo "Please specify a valid command."
exit 1
;;
esac
:wq! #保存退出
chmod +x /opt/etcd/check/check_etcd.sh #添加执行权限
sh /opt/etcd/check/check_etcd.sh health
sh /opt/etcd/check/check_etcd.sh status
sh /opt/etcd/check/check_etcd.sh list
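#还可以写入并读取一个测试键,进一步确认集群读写正常(示例,测试完可删除该键)
ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379" put /test/hello "world"
ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.202:2379" get /test/hello
ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379" del /test/hello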
5.9 模拟集群节点故障
模拟集群中1个节点192.168.21.203发生故障,故障修复后,必须以新的身份加入集群
systemctl stop etcd #停止etcd服务
5.9.1检查集群健康状态
/opt/etcd/bin/etcdctl endpoint health -w table --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"
/opt/etcd/bin/etcdctl member list -w table --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"
#查询到故障节点的ID
5.9.2移除故障节点
/opt/etcd/bin/etcdctl member remove 83045a3c3a751464 --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"
5.9.3清空故障节点的数据目录(在故障节点操作)
cd /opt/etcd/data
rm -rf *
5.9.4扩容集群
使用member add命令重新添加故障节点
/opt/etcd/bin/etcdctl member add k8s-master03 --peer-urls=https://192.168.21.203:2380 --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"
5.9.5在故障节点上修改配置文件参数 initial-cluster-state: "existing" 并启动etcd
vi /opt/etcd/cfg/etcd.yaml
initial-cluster-state: "existing"
:wq! #保存退出
systemctl start etcd #启动
5.9.6查询集群状态
/opt/etcd/bin/etcdctl endpoint status --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"
5.9.7修改集群所有节点的配置参数
新建集群时initial-cluster-state参数的值为new;当集群正常运行后,建议把所有节点的initial-cluster-state参数改为"existing",这样节点重启时会直接加入现有集群,避免重启后重新初始化集群导致数据不一致。
vi /opt/etcd/cfg/etcd.yaml
initial-cluster-state: "existing"
:wq! #保存退出
systemctl daemon-reload
systemctl restart etcd
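生产环境建议定期备份etcd数据,下面是一个用etcdctl snapshot save做快照备份的示例(备份目录/opt/etcd/backup为假设值,可自行调整):
mkdir -p /opt/etcd/backup
SNAP=/opt/etcd/backup/etcd-snapshot-$(date +%Y%m%d%H%M).db
ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379" snapshot save ${SNAP}
ETCDCTL_API=3 /opt/etcd/bin/etcdctl --write-out=table snapshot status ${SNAP} #查看快照信息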
六、部署k8s-master节点
先在k8s-master01这一台上操作
6.1 下载kubernetes二进制包
官方网站:https://kubernetes.io/zh-cn/releases/download/
下载地址:https://www.downloadkubernetes.com/ https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.31.md
注:打开链接你会发现里面有很多包,下载一个server包就够了,包含了master和node等所有组件的二进制文件。
mkdir -p /opt/k8s/soft
cd /opt/k8s/soft
#下载文件并且以版本号重新命名
wget https://dl.k8s.io/v1.31.2/kubernetes-server-linux-amd64.tar.gz -O kubernetes-server-linux-amd64-v1.31.2.tar.gz
6.2 解压二进制包到安装目录
#创建k8s安装目录
mkdir -p /opt/kubernetes/{bin,cfg,ssl,logs}
#解压
cd /opt/k8s/soft
tar zxvf kubernetes-server-linux-amd64-v1.31.2.tar.gz
#拷贝相关组件到安装目录
cd /opt/k8s/soft/kubernetes/server/bin
cp kube-apiserver kube-controller-manager kube-scheduler /opt/kubernetes/bin
#拷贝kubectl命令到系统运行目录,kubectl用来管理Kubernetes集群
cp kubectl /usr/bin/
cp kubectl /usr/local/bin
6.3 部署kube-apiserver组件
6.3.1自签kube-apiserver证书颁发机构(CA)
自签证书颁发机构(CA)
mkdir -p /opt/tls/k8s #创建工作目录
cd /opt/tls/k8s
创建ca配置文件
cat > k8sca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
创建ca证书签名请求文件
cat > k8sca-csr.json << EOF
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
#生成证书命令(k8sca.pem和k8sca-key.pem)
cfssl gencert -initca k8sca-csr.json | cfssljson -bare k8sca -
6.3.2使用自签ca签发kube-apiserver HTTPS证书
创建证书申请文件:
cat > apiserver-csr.json << EOF
{
"CN": "kubernetes",
"hosts": [
"127.0.0.1",
"10.0.0.1",
"192.168.21.200",
"192.168.21.201",
"192.168.21.202",
"192.168.21.203",
"k8s-master01",
"k8s-master02",
"k8s-master03",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
注:上述文件hosts字段中ip为所有master节点/负载均衡LB/VIP的ip地址,一个都不能少,为了方便后期扩容可以多写几个规划的ip。
#生成证书命令(apiserver.pem和apiserver-key.pem)
cfssl gencert -ca=k8sca.pem -ca-key=k8sca-key.pem -config=k8sca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver
6.3.3拷贝刚才生成的证书
#把刚才生成的证书拷贝到k8s配置文件中的路径
cp /opt/tls/k8s/k8sca.pem /opt/kubernetes/ssl/
cp /opt/tls/k8s/k8sca-key.pem /opt/kubernetes/ssl/
cp /opt/tls/k8s/apiserver.pem /opt/kubernetes/ssl/
cp /opt/tls/k8s/apiserver-key.pem /opt/kubernetes/ssl/
6.3.4创建kube-apiserver配置文件
cat > /opt/kubernetes/cfg/kube-apiserver.conf << EOF
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
--v=2 \\
--etcd-servers=https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379 \\
--bind-address=192.168.21.201 \\
--secure-port=6443 \\
--advertise-address=192.168.21.201 \\
--allow-privileged=true \\
--service-cluster-ip-range=10.0.0.0/24 \\
--authorization-mode=RBAC,Node \\
--enable-bootstrap-token-auth=true \\
--token-auth-file=/opt/kubernetes/cfg/token.csv \\
--service-node-port-range=30000-50000 \\
--kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem \\
--kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem \\
--tls-cert-file=/opt/kubernetes/ssl/apiserver.pem \\
--tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem \\
--client-ca-file=/opt/kubernetes/ssl/k8sca.pem \\
--service-account-key-file=/opt/kubernetes/ssl/k8sca-key.pem \\
--service-account-signing-key-file=/opt/kubernetes/ssl/k8sca-key.pem \\
--etcd-cafile=/opt/etcd/ssl/etcdca.pem \\
--etcd-certfile=/opt/etcd/ssl/etcd.pem \\
--etcd-keyfile=/opt/etcd/ssl/etcd-key.pem \\
--requestheader-client-ca-file=/opt/kubernetes/ssl/k8sca.pem \\
--proxy-client-cert-file=/opt/kubernetes/ssl/apiserver.pem \\
--proxy-client-key-file=/opt/kubernetes/ssl/apiserver-key.pem \\
--requestheader-allowed-names=kubernetes \\
--requestheader-extra-headers-prefix=X-Remote-Extra- \\
--requestheader-group-headers=X-Remote-Group \\
--requestheader-username-headers=X-Remote-User \\
--enable-aggregator-routing=true \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--service-account-issuer=https://kubernetes.default.svc.cluster.local \\
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \\
--audit-log-path=/opt/kubernetes/logs/k8s-audit.log"
EOF
6.3.5创建上述配置文件中token文件
cat > /opt/kubernetes/cfg/token.csv << EOF
6215e7cd039e7756a3494a34af47d8e3,kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF
格式:token,用户名,UID,用户组
#token也可自行生成替换:
head -c 16 /dev/urandom | od -An -t x | tr -d ' '
[root@k8s-master1 ~]# head -c 16 /dev/urandom | od -An -t x | tr -d ' '
6215e7cd039e7756a3494a34af47d8e3
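#如果想使用自己生成的token,可以参考下面的示例,一次性生成随机token并写入token.csv(注意后面7.2.3生成bootstrap.kubeconfig时的TOKEN变量要与这里保持一致)
TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
echo ${TOKEN}
cat > /opt/kubernetes/cfg/token.csv << EOF
${TOKEN},kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF
cat /opt/kubernetes/cfg/token.csv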
6.3.6使用systemd管理apiserver
cat > /usr/lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-apiserver.conf
ExecStart=/opt/kubernetes/bin/kube-apiserver \$KUBE_APISERVER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
6.3.7启动并设置开机启动
systemctl daemon-reload
systemctl start kube-apiserver
systemctl enable kube-apiserver
systemctl stop kube-apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver
6.3.8查看kube-apiserver状态
#查看端口
netstat -lntup
#检查运行状态
systemctl status kube-apiserver |grep Active
#检查进程
ps -ef |grep kube
#查看日志
journalctl -u kube-apiserver
journalctl -u kube-apiserver -f
journalctl -u kube-apiserver > log.txt
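#也可以直接通过https接口做健康检查(默认的RBAC规则通常允许匿名访问/healthz、/readyz、/version等路径;如果返回401/403,可改用6.6节生成的admin证书访问)
curl -k https://192.168.21.201:6443/healthz
curl -k https://192.168.21.201:6443/version
curl -k "https://192.168.21.201:6443/readyz?verbose"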
6.4 部署kube-controller-manager组件
6.4.1生成kube-controller-manager证书
#切换工作目录
cd /opt/tls/k8s
#创建证书请求文件
cat > kube-controller-manager-csr.json << EOF
{
"CN": "system:kube-controller-manager",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF
# 生成证书
cfssl gencert -ca=k8sca.pem -ca-key=k8sca-key.pem -config=k8sca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
6.4.2生成kubeconfig文件(以下是shell命令,直接在终端执行):
cd /opt/tls/k8s
KUBE_CONFIG="/opt/kubernetes/cfg/kube-controller-manager.kubeconfig"
KUBE_APISERVER="https://192.168.21.201:6443"
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/k8sca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-controller-manager \
--client-certificate=./kube-controller-manager.pem \
--client-key=./kube-controller-manager-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-controller-manager \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
6.4.3创建kube-controller-manager配置文件
cat > /opt/kubernetes/cfg/kube-controller-manager.conf << EOF
KUBE_CONTROLLER_MANAGER_OPTS=" \\
--v=2 \\
--leader-elect=true \\
--kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig \\
--bind-address=127.0.0.1 \\
--allocate-node-cidrs=true \\
--cluster-cidr=10.244.0.0/16 \\
--service-cluster-ip-range=10.0.0.0/24 \\
--cluster-signing-cert-file=/opt/kubernetes/ssl/k8sca.pem \\
--cluster-signing-key-file=/opt/kubernetes/ssl/k8sca-key.pem \\
--root-ca-file=/opt/kubernetes/ssl/k8sca.pem \\
--service-account-private-key-file=/opt/kubernetes/ssl/k8sca-key.pem \\
--cluster-signing-duration=87600h0m0s"
EOF
6.4.4配置systemd管理controller-manager
cat > /usr/lib/systemd/system/kube-controller-manager.service << EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-controller-manager.conf
ExecStart=/opt/kubernetes/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
6.4.5启动并设置开机启动
systemctl daemon-reload
systemctl start kube-controller-manager
systemctl enable kube-controller-manager
systemctl stop kube-controller-manager
systemctl restart kube-controller-manager
systemctl status kube-controller-manager
6.4.6检查启动结果
ps -ef | grep kube
systemctl status kube-controller-manager |grep Active
[root@k8s-master1 k8s]# systemctl status kube-controller-manager |grep Active
Active: active (running) since Wed 2021-12-15 13:38:09 CST; 3min 50s ago
确保状态为 active (running),否则查看日志,确认原因
如果出现异常,通过如下命令查看
journalctl -u kube-controller-manager
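#kube-controller-manager默认在10257端口提供https健康检查接口,可以用下面的命令确认(检查示例,返回ok说明服务正常)
curl -k https://127.0.0.1:10257/healthz
ss -lntp | grep kube-controller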
6.5 部署kube-scheduler组件
6.5.1生成kube-scheduler证书
#切换工作目录
cd /opt/tls/k8s
#创建证书请求文件
cat > kube-scheduler-csr.json << EOF
{
"CN": "system:kube-scheduler",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF
#生成证书
cfssl gencert -ca=k8sca.pem -ca-key=k8sca-key.pem -config=k8sca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
6.5.2生成kubeconfig文件
cd /opt/tls/k8s
#以下是shell命令,直接在终端执行
KUBE_CONFIG="/opt/kubernetes/cfg/kube-scheduler.kubeconfig"
KUBE_APISERVER="https://192.168.21.201:6443"
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/k8sca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-scheduler \
--client-certificate=./kube-scheduler.pem \
--client-key=./kube-scheduler-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-scheduler \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
6.5.3创建kube-scheduler配置文件
cat > /opt/kubernetes/cfg/kube-scheduler.conf << EOF
KUBE_SCHEDULER_OPTS=" \\
--v=2 \\
--leader-elect \\
--kubeconfig=/opt/kubernetes/cfg/kube-scheduler.kubeconfig \\
--bind-address=127.0.0.1"
EOF
6.5.4配置systemd管理kube-scheduler
cat > /usr/lib/systemd/system/kube-scheduler.service << EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-scheduler.conf
ExecStart=/opt/kubernetes/bin/kube-scheduler \$KUBE_SCHEDULER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
6.5.5启动并设置开机启动
systemctl daemon-reload
systemctl start kube-scheduler
systemctl enable kube-scheduler
systemctl stop kube-scheduler
systemctl restart kube-scheduler
systemctl status kube-scheduler
6.5.6检查运行状态
systemctl status kube-scheduler |grep Active
确保状态为 active (running),否则查看日志,确认原因
如果出现异常,通过如下命令查看
journalctl -u kube-scheduler
6.6 配置kubectl集群管理工具
kubectl是用来集群管理的工具,kubectl使用https协议与kube-apiserver进行安全通信,kube-apiserver对kubectl请求包含的证书进行认证和授权,需要最高权限的admin证书。
6.6.1创建kubectl证书
cd /opt/tls/k8s
#创建证书请求文件
cat > admin-csr.json <<EOF
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF
#生成证书
#该证书只会被kubectl当做client证书使用,所以hosts字段为空
cfssl gencert -ca=k8sca.pem -ca-key=k8sca-key.pem -config=k8sca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
6.6.2生成kubeconfig文件
#创建默认目录
mkdir /root/.kube
KUBE_CONFIG="/root/.kube/config"
KUBE_APISERVER="https://192.168.21.201:6443"
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/k8sca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials cluster-admin \
--client-certificate=./admin.pem \
--client-key=./admin-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=cluster-admin \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
6.6.3通过kubectl工具查看当前集群组件状态
kubectl get componentstatuses
kubectl get cs
6.6.4 授权kubelet-bootstrap用户允许请求证书
kubectl create clusterrolebinding kubelet-bootstrap \
--clusterrole=system:node-bootstrapper \
--user=kubelet-bootstrap
#如果执行上面的命令报错,通常是因为已经存在同名的clusterrolebinding,create命令不会覆盖已有对象;可以先用下面的命令删除旧的clusterrolebinding,再重新创建。
kubectl delete clusterrolebinding kubelet-bootstrap #删除命令
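#kubeconfig配置好之后,还可以用下面的命令做一个整体检查(示例)
kubectl cluster-info #查看集群入口地址
kubectl get --raw='/readyz?verbose' #查看apiserver各项就绪检查
kubectl api-resources | head #查看可用的资源类型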
七、部署k8s-node节点
先在k8s-master01这一台上操作
7.1 拷贝二进制文件到安装目录
cp /opt/k8s/soft/kubernetes/server/bin/kubelet /opt/kubernetes/bin
cp /opt/k8s/soft/kubernetes/server/bin/kube-proxy /opt/kubernetes/bin
7.2 部署kubelet组件
7.2.1创建配置文件
cat > /opt/kubernetes/cfg/kubelet.conf << EOF
KUBELET_OPTS=" \\
--v=2 \\
--hostname-override=k8s-master01 \\
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \\
--config=/opt/kubernetes/cfg/kubelet-config.yml \\
--cert-dir=/opt/kubernetes/ssl \\
--runtime-request-timeout=15m \\
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \\
--cgroup-driver=systemd \\
--node-labels=node.kubernetes.io/node=''"
EOF
7.2.2配置参数文件
cat > /opt/kubernetes/cfg/kubelet-config.yml << EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
cgroupDriver: systemd
clusterDNS:
- 10.0.0.2
clusterDomain: cluster.local
failSwapOn: false
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /opt/kubernetes/ssl/k8sca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF
7.2.3生成kubelet初次加入集群引导kubeconfig文件
KUBE_CONFIG="/opt/kubernetes/cfg/bootstrap.kubeconfig"
KUBE_APISERVER="https://192.168.21.201:6443"
TOKEN="6215e7cd039e7756a3494a34af47d8e3"
# 生成 kubelet bootstrap kubeconfig 配置文件
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/k8sca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials "kubelet-bootstrap" \
--token=${TOKEN} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user="kubelet-bootstrap" \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
7.2.4配置systemd管理kubelet
cat > /usr/lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
After=containerd.service
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kubelet.conf
ExecStart=/opt/kubernetes/bin/kubelet \$KUBELET_OPTS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
7.2.5启动并设置开机启动
systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
systemctl stop kubelet
systemctl restart kubelet
systemctl status kubelet
7.2.6检查启动状态
ps -ef |grep kubelet
systemctl status kubelet |grep Active
如果出现异常,通过如下命令查看
journalctl -u kubelet
7.2.7批准kubelet证书申请并加入集群
#在k8s-master01上查看kubelet证书请求
kubectl get csr
[root@k8s-master01 bin]# kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
node-csr-QZj2pe7OtLh8vDBP8he6tTYoGzGXIoZCnizFxWNcU2U 7m34s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap <none> Pending
#在k8s-master01上批准申请
kubectl certificate approve node-csr-QZj2pe7OtLh8vDBP8he6tTYoGzGXIoZCnizFxWNcU2U
#查看节点
kubectl get node
[root@k8s-master01 bin]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master01 NotReady <none> 18h v1.31.2
注:由于网络插件还没有部署,节点状态显示为NotReady(未就绪),部署网络组件后会变为Ready
7.3 部署kube-proxy组件
7.3.1创建配置文件
cat > /opt/kubernetes/cfg/kube-proxy.conf << EOF
KUBE_PROXY_OPTS=" \\
--v=2 \\
--config=/opt/kubernetes/cfg/kube-proxy-config.yml"
EOF
7.3.2配置参数文件
cat > /opt/kubernetes/cfg/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
  kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: k8s-master01
clusterCIDR: 10.244.0.0/16
mode: ipvs
ipvs:
  scheduler: "rr"
iptables:
  masqueradeAll: true
EOF
7.3.3生成kube-proxy.kubeconfig文件
cd /opt/tls/k8s
#创建证书请求文件
cat > kube-proxy-csr.json << EOF
{
"CN": "system:kube-proxy",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
#生成证书
cfssl gencert -ca=k8sca.pem -ca-key=k8sca-key.pem -config=k8sca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
7.3.4生成kube-proxy.kubeconfig文件:
cd /opt/tls/k8s
KUBE_CONFIG="/opt/kubernetes/cfg/kube-proxy.kubeconfig"
KUBE_APISERVER="https://192.168.21.201:6443"
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/k8sca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-proxy \
--client-certificate=./kube-proxy.pem \
--client-key=./kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
7.3.5配置systemd管理kube-proxy
cat > /usr/lib/systemd/system/kube-proxy.service << EOF
[Unit]
Description=Kubernetes Proxy
After=network.target
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-proxy.conf
ExecStart=/opt/kubernetes/bin/kube-proxy \$KUBE_PROXY_OPTS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
7.3.6启动并设置开机启动
systemctl daemon-reload
systemctl start kube-proxy
systemctl enable kube-proxy
systemctl stop kube-proxy
systemctl restart kube-proxy
systemctl status kube-proxy
7.3.7检查启动状态
systemctl status kube-proxy |grep Active
ps -ef |grep kube-proxy
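#kube-proxy以ipvs模式运行后,可以用ipvsadm和metrics接口进一步确认(示例,ipvsadm已在2.10节安装;如果proxyMode接口不可用,也可以直接查看日志确认代理模式)
ipvsadm -Ln #查看ipvs转发规则,部署服务后这里会出现对应的虚拟服务
curl 127.0.0.1:10249/proxyMode #查看kube-proxy当前的代理模式,应返回ipvs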
7.4 部署网络组件
目前Kubernetes的网络组件有flannel和Calico等多种方案,选择其中一个即可。
说明:flannel和Calico选择其中一个进行部署即可,我们部署Calico。
Calico的版本要和k8s的版本匹配才可以,calico-v3.28.2可以适配kubernetes-v1.31
7.4.1下载Calico
官方网站:
在发布说明(Release Notes)中可能没有直接给出安装用的yaml文件链接,这些文件位于源码仓库的manifests目录中
https://github.com/projectcalico/calico
再打开:
https://github.com/projectcalico/calico/tree/release-3.28/manifests
https://github.com/projectcalico/calico/blob/release-v3.28/manifests/calico.yaml
https://raw.githubusercontent.com/projectcalico/calico/refs/heads/release-v3.28/manifests/calico.yaml
也可以通过以下链接直接访问:(修改3.28.2版本号可以下载相应的版本)
wget https://raw.githubusercontent.com/projectcalico/calico/v3.28.2/manifests/calico.yaml --no-check-certificate
cat calico.yaml |grep image #查看安装calico-v3.28.2所需要的镜像
[root@k8s-master01 ~]# cat calico.yaml |grep image
image: docker.io/calico/cni:v3.28.2
imagePullPolicy: IfNotPresent
image: docker.io/calico/cni:v3.28.2
imagePullPolicy: IfNotPresent
image: docker.io/calico/node:v3.28.2
imagePullPolicy: IfNotPresent
image: docker.io/calico/node:v3.28.2
imagePullPolicy: IfNotPresent
image: docker.io/calico/kube-controllers:v3.28.2
imagePullPolicy: IfNotPresent
7.4.2手动拉取calico-v3.28.2需要的cni、kube-controllers、node镜像版本
(5台机器都需要执行)
ctr -n k8s.io images pull --platform=amd64 docker.io/calico/cni:v3.28.2
ctr -n k8s.io images pull --platform=amd64 docker.io/calico/kube-controllers:v3.28.2
ctr -n k8s.io images pull --platform=amd64 docker.io/calico/node:v3.28.2
#如果无法拉取可以用下面的方法,先拉取镜像再修改标签
ctr -n k8s.io images pull --platform=amd64 docker.rainbond.cc/calico/cni:v3.28.2
ctr -n k8s.io images pull --platform=amd64 docker.rainbond.cc/calico/kube-controllers:v3.28.2
ctr -n k8s.io images pull --platform=amd64 docker.rainbond.cc/calico/node:v3.28.2
ctr -n k8s.io images tag docker.rainbond.cc/calico/cni:v3.28.2 docker.io/calico/cni:v3.28.2
ctr -n k8s.io images tag docker.rainbond.cc/calico/kube-controllers:v3.28.2 docker.io/calico/kube-controllers:v3.28.2
ctr -n k8s.io images tag docker.rainbond.cc/calico/node:v3.28.2 docker.io/calico/node:v3.28.2
#查看默认命名空间的镜像
ctr -n k8s.io images ls
7.4.3安装网络插件calico(只在k8s-master01上安装)
vi calico.yaml #取消注释
- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"
- name: IP_AUTODETECTION_METHOD #添加此行
  value: "interface=ens160" #添加此行,指定网卡
:wq! #保存退出,注意- name项的缩进要与上下文对齐;这里的ip地址段必须与kube-controller-manager的--cluster-cidr=10.244.0.0/16(即kube-proxy的clusterCIDR)保持一致
#如果服务器有多个网卡需要指定正确的网卡,我这里是ens160(ip addr命令查看),否则会自动使用第一个网卡,可能会导致网络冲突部署失败
#安装calico网络组件只需要在Kubernetes主节点(k8s-master01)上执行该命令,Calico会自动在集群中的其他节点上进行配置和分发
#所有节点必须要能够拉取cni、kube-controllers、node这些镜像或者是提前离线导入镜像
#后面有任何新的节点加入集群,这些新节点也必须要有这些镜像才可以部署成功
kubectl apply -f calico.yaml #部署calico,只在k8s-master01上安装
kubectl get nodes #部署完成后在k8s-master01上查看节点及POD资源状态已经显示正常了
kubectl get pod -A
kubectl get pod -n kube-system
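#calico的Pod需要一些时间才能全部就绪,可以用下面的命令等待并检查(示例)
kubectl -n kube-system rollout status ds/calico-node #等待calico-node在所有节点就绪
kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
kubectl get nodes -o wide #确认所有节点变为Ready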
7.5 授权apiserver访问kubelet
在master节点授权apiserver访问kubelet,如果不进行授权, 将无法管理容器。
cd /opt/k8s
#创建yaml文件
cat > apiserver-to-kubelet-rbac.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
- apiGroups:
  - ""
  resources:
  - nodes/proxy
  - nodes/stats
  - nodes/log
  - nodes/spec
  - nodes/metrics
  - pods/log
  verbs:
  - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: kubernetes
EOF
#执行
kubectl apply -f apiserver-to-kubelet-rbac.yaml
八、部署coredns
CoreDNS用于集群内部Service名称解析,所有的master和node节点都要部署。
CoreDNS version in Kubernetes版本兼容性说明
https://github.com/coredns/deployment/blob/master/kubernetes/CoreDNS-k8s_version.md
Kubernetes-v1.31对应CoreDNS-v1.11.3
8.1 下载coredns.yaml文件
在线下载地址:
https://github.com/kubernetes/kubernetes/tree/release-1.31/cluster/addons/dns/coredns
https://github.com/kubernetes/kubernetes/blob/release-1.31/cluster/addons/dns/coredns/coredns.yaml.base
在K8s安装包里面已经有coredns.yaml
cd /opt/k8s/soft/kubernetes
tar zxvf kubernetes-src.tar.gz
cd /opt/k8s/soft/kubernetes/cluster/addons/dns/coredns
cp coredns.yaml.base /opt/k8s/coredns.yaml
8.2 准备coredns.yaml需要用到的镜像
cd /opt/k8s/
cat coredns.yaml |grep image|uniq
image: registry.k8s.io/coredns/coredns:v1.11.3
imagePullPolicy: IfNotPresent
#准备以上镜像(5台机器都需要执行)
ctr -n k8s.io images pull registry.aliyuncs.com/google_containers/coredns:v1.11.3
#修改标签
ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/coredns:v1.11.3 registry.k8s.io/coredns/coredns:v1.11.3
#查看默认命名空间的镜像
ctr -n k8s.io images ls
8.3 修改coredns.yaml文件参数
cd /opt/k8s/
vi coredns.yaml
8.3.1修改DNS服务的ip地址
clusterIP: __DNS__SERVER__ #查找此行
clusterIP: 10.0.0.2 #修改为此行
#注意:此处的DNS服务ip地址必须在kube-apiserver配置文件/opt/kubernetes/cfg/kube-apiserver.conf中--service-cluster-ip-range=10.0.0.0/24设置的网段内,并与kubelet-config.yml中的clusterDNS(10.0.0.2)保持一致
8.3.2指定内存大小
memory: __DNS__MEMORY__LIMIT__ #查找此行
memory: 1024Mi #修改为此行
8.3.3修改为本地域
kubernetes __DNS__DOMAIN__ in-addr.arpa ip6.arpa #查找此行
kubernetes cluster.local in-addr.arpa ip6.arpa #修改为此行
8.4 部署coredns
cd /opt/k8s/
kubectl apply -f coredns.yaml #只在k8s-master01上执行
#查看服务
kubectl get pods -n kube-system
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get svc -n kube-system -l k8s-app=kube-dns
查看k8s默认的svc(kubernetes及kube-dns)
kubectl get svc -n kube-system
kubectl get svc
8.5 dns解析测试
下载测试镜像:
注意:测试请使用busybox 1.28.4及以下版本的镜像,1.28.4之后的busybox镜像nslookup存在问题,测试结果不可靠。
ctr -n k8s.io images pull docker.rainbond.cc/busybox:1.28.4
#修改镜像标签
ctr -n k8s.io images tag docker.rainbond.cc/busybox:1.28.4 docker.io/library/busybox:1.28.4
#查看默认命名空间的镜像
ctr -n k8s.io images ls |grep busybox:1.28.4
#运行镜像
kubectl run -it --rm dns-test --image=busybox:1.28.4 sh
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes #输入命令
Server: 10.0.0.2
Address 1: 10.0.0.2 kube-dns.kube-system.svc.cluster.local
Name: kubernetes
Address 1: 10.0.0.1 kubernetes.default.svc.cluster.local
/ #
#看到如上所示,说明解析没问题。
九、部署dashboard
9.1 安装Helm
Helm 是 K8S的包管理器,我们需要使用Helm来部署Dashboard(5台机器都需要执行)
官方网站:
https://helm.sh/zh/
https://github.com/helm/helm
下载地址:https://get.helm.sh/helm-v3.16.2-linux-amd64.tar.gz
#安装heml
cd /opt/k8s/soft/helm
tar zxvf helm-v3.16.2-linux-amd64.tar.gz #解压
mv linux-amd64/helm /usr/local/bin/helm
helm version #查看版本
9.2 下载对应的Dashboard
Dashboard是k8s官方提供的一个UI,可用于基本管理K8s资源,Kubernetes和Dashboard的版本要对应才可以
Kubernetes版本 | Dashboard版本 | 兼容性
1.18 | v2.0.0 | 完全支持
1.19 | v2.0.4 | 完全支持
1.20 | v2.4.0 | 完全支持
1.21 | v2.4.0 | 完全支持
1.23 | v2.5.0 | 完全支持
1.24 | v2.6.0 | 完全支持
1.25 | v2.7.0 | 完全支持
1.27 | v3.0.0-alpha0 | 完全支持
1.29 | kubernetes-dashboard-7.5.0 | 完全支持
1.31 | kubernetes-dashboard-7.8.0 | 完全支持
我们使用的k8s版本为1.31.2,所以要下载kubernetes-dashboard-7.8.0版本部署
https://github.com/kubernetes/dashboard/releases/tag/kubernetes-dashboard-7.8.0
从官方网站可以看到安装部署kubernetes-dashboard-7.8.0需要用到下面的镜像
docker.io/kubernetesui/dashboard-api:1.9.0
docker.io/kubernetesui/dashboard-auth:1.2.0
docker.io/kubernetesui/dashboard-metrics-scraper:1.2.0
docker.io/kubernetesui/dashboard-web:1.5.0
#下载文件
wget https://github.com/kubernetes/dashboard/releases/download/kubernetes-dashboard-7.8.0/kubernetes-dashboard-7.8.0.tgz
从上面的文件中可以查询到还需要一个镜像
docker.io/library/kong:3.6
#下载镜像文件
#可以从这里下载
ctr -n k8s.io images pull docker.rainbond.cc/kubernetesui/dashboard-api:1.9.0
ctr -n k8s.io images pull docker.rainbond.cc/kubernetesui/dashboard-auth:1.2.0
ctr -n k8s.io images pull docker.rainbond.cc/kubernetesui/dashboard-metrics-scraper:1.2.0
ctr -n k8s.io images pull docker.rainbond.cc/kubernetesui/dashboard-web:1.5.0
ctr -n k8s.io images pull docker.rainbond.cc/library/kong:3.6
#修改镜像标签
ctr -n k8s.io i tag docker.rainbond.cc/kubernetesui/dashboard-api:1.9.0 docker.io/kubernetesui/dashboard-api:1.9.0
ctr -n k8s.io i tag docker.rainbond.cc/kubernetesui/dashboard-auth:1.2.0 docker.io/kubernetesui/dashboard-auth:1.2.0
ctr -n k8s.io i tag docker.rainbond.cc/kubernetesui/dashboard-metrics-scraper:1.2.0 docker.io/kubernetesui/dashboard-metrics-scraper:1.2.0
ctr -n k8s.io i tag docker.rainbond.cc/kubernetesui/dashboard-web:1.5.0 docker.io/kubernetesui/dashboard-web:1.5.0
ctr -n k8s.io i tag docker.rainbond.cc/library/kong:3.6 docker.io/library/kong:3.6
#查看默认命名空间的镜像
ctr -n k8s.io images ls
9.3 安装kubernetes-dashboard
#只在k8s-master01这台执行
#安装部署
helm upgrade --install kubernetes-dashboard ./kubernetes-dashboard-7.8.0.tgz --create-namespace --namespace kubernetes-dashboard
#查看信息
kubectl -n kubernetes-dashboard get svc
kubectl get pod --all-namespaces
#编辑配置文件
kubectl edit svc kubernetes-dashboard-kong-proxy -n kubernetes-dashboard
#默认Dashboard只能集群内部访问,修改Service为NodePort类型,暴露到外部
ports:
- name: kong-proxy-tls
  nodePort: 30001 #添加此行,指定对外访问的端口
  port: 443
  protocol: TCP
  targetPort: 8443
selector:
  app.kubernetes.io/component: app
  app.kubernetes.io/instance: kubernetes-dashboard
  app.kubernetes.io/name: kong
sessionAffinity: None
type: NodePort #修改此行参数为NodePort
:wq! #保存退出
#查看容器运行状态
kubectl get serviceAccount,svc,deploy,pod -n kubernetes-dashboard
#访问Dashboard
输入集群的任何一个ip都可以访问
打开页面https://192.168.21.201:30001
提示需要 Bearer tokens才能登录
9.4 创建Bearer tokens
在Kubernetes创建Dashboard超级管理员账户(只在k8s-master01上安装)
#创建用户
kubectl create serviceaccount admin-user -n kube-system
#创建token
cat > dashboard-user-token.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system
EOF
#执行,如果报错,特别要检查上面的yaml文件格式,前后缩进
kubectl apply -f dashboard-user-token.yaml
#获取Bearer Token
kubectl -n kube-system create token admin-user
#登录Dashboard页面
#在任意一台机器上面都能访问
#这里我们在k8s-master01节点访问https://192.168.21.201:30001
在页面上粘贴Token,登录成功
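#kubectl create token默认生成的token有效期较短(通常为1小时),如果需要更长有效期,可以加--duration参数(示例)
kubectl -n kube-system create token admin-user --duration=24h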
十、测试使用k8s集群部署服务
10.1 部署nginx服务
创建一个Nginx服务,副本数为2,并且使用镜像加速站docker.rainbond.cc来拉取镜像
#拉取镜像,所有节点都执行
ctr -n k8s.io images pull docker.rainbond.cc/nginx:latest
#只在k8s-master01上执行
kubectl create deployment app-nginx --image=docker.rainbond.cc/nginx:latest --replicas=2 -n default
#暴露端口从外部访问
kubectl expose deployment app-nginx --type=NodePort --name=nginx-service --port=80 --target-port=80
#查看端口映射
kubectl get services
nginx-service NodePort 10.0.0.164 <none> 80:42153/TCP 10s
#在任意一台机器上面都能访问
#打开下面的页面就能访问nginx服务了
http://192.168.21.201:42153/
注意:
1.在NodePort类型的服务中,外部访问通常是通过某个特定的NodePort端口,而不是直接通过80端口
2.Kubernetes会在--service-node-port-range设置的30000到50000范围内为服务随机分配一个端口(kubectl expose不支持直接指定NodePort,可以通过yaml指定,见下面的示例)
3.外部用户需要通过Node的IP地址和这个NodePort端口进行访问
4.直接通过80端口访问nginx-service是不可行的
5.如果您希望通过80端口访问,建议使用LoadBalancer或者Ingress
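如果确实需要固定的NodePort端口,可以不用kubectl expose,改用yaml创建Service并显式指定nodePort(示例,端口30080为假设值,需在30000-50000范围内;如果之前已经用kubectl expose创建过nginx-service,需要先删除再apply)
cat > nginx-service.yaml << EOF
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: default
spec:
  type: NodePort
  selector:
    app: app-nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080
EOF
kubectl apply -f nginx-service.yaml
kubectl get services nginx-service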
#查看节点信息的命令
#默认情况下master节点不参与工作调度
kubectl describe nodes k8s-master01
curl -k https://192.168.21.201:6443/version
#删除app-nginx 部署
kubectl delete deployment app-nginx -n default
#删除nginx-service服务
kubectl delete service nginx-service -n default
十一、增加k8s-node节点
11.1 拷贝已部署好的Node节点相关文件到新节点
在k8s-master01节点将node涉及文件拷贝到两台node节点上
scp -r /opt/kubernetes root@192.168.21.204:/opt/
scp -r /usr/lib/systemd/system/{kubelet,kube-proxy}.service root@192.168.21.204:/usr/lib/systemd/system
scp -r /opt/kubernetes root@192.168.21.205:/opt/
scp -r /usr/lib/systemd/system/{kubelet,kube-proxy}.service root@192.168.21.205:/usr/lib/systemd/system
11.2 删除节点上面kubelet证书和kubeconfig文件
rm -f /opt/kubernetes/cfg/kubelet.kubeconfig
rm -f /opt/kubernetes/ssl/kubelet*
注:这几个文件是证书申请审批后自动生成的,每个节点不同,必须删除。
11.3 修改node节点配置文件里面的主机名
在192.168.21.204修改
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=k8s-node01
:wq! #保存退出
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: k8s-node01
:wq! #保存退出
在192.168.21.205修改
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=k8s-node02
:wq! #保存退出
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: k8s-node02
:wq! #保存退出
11.4 启动kubelet并设置开机启动
systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
systemctl stop kubelet
systemctl restart kubelet
systemctl status kubelet
11.5 启动kube-proxy并设置开机启动
systemctl daemon-reload
systemctl start kube-proxy
systemctl enable kube-proxy
systemctl stop kube-proxy
systemctl restart kube-proxy
systemctl status kube-proxy
#检查进程
ps -ef | grep kubelet
ps -ef | grep kube-proxy
11.6 在Master上批准新Node kubelet证书申请
kubectl get csr #在k8s-master01上执行
# 授权请求
kubectl certificate approve node-csr-4348wiZ5sJ2HZgpiZ8wGYIMlcbH6epjY0F3Hzdj4osU
kubectl certificate approve node-csr-KZkcNZD7j2OZLl1sTF6EG6ziy1u-PH5kwzV_m4Xy8RU
#查看Node状态
kubectl get node
十二、扩容k8s-master节点
12.1 拷贝已部署好的k8s-master01节点相关文件到新节点
这里我们先拷贝到k8s-master02上
scp -r /opt/kubernetes root@192.168.21.202:/opt
scp -r /opt/etcd/ssl root@192.168.21.202:/opt/etcd
scp /usr/lib/systemd/system/kube* root@192.168.21.202:/usr/lib/systemd/system
scp /usr/bin/kubectl root@192.168.21.202:/usr/bin
scp /usr/local/bin/kubectl root@192.168.21.202:/usr/local/bin
scp -r /root/.kube root@192.168.21.202:/root
12.2 删除kubelet证书和kubeconfig文件
rm -f /opt/kubernetes/cfg/kubelet.kubeconfig
rm -f /opt/kubernetes/ssl/kubelet*
注:这几个文件是证书申请审批后自动生成的,每个节点不同,必须删除。
12.3 修改配置文件IP和主机名
修改kube-apiserver、kube-controller-manager、kube-scheduler、kubelet、kube-proxy配置文件为本地IP和主机名
vi /opt/kubernetes/cfg/kube-apiserver.conf
...
--bind-address=192.168.21.202 \
--advertise-address=192.168.21.202 \
...
:wq! #保存退出
vi /opt/kubernetes/cfg/kube-controller-manager.kubeconfig
server: https://192.168.21.202:6443
:wq! #保存退出
vi /opt/kubernetes/cfg/kube-scheduler.kubeconfig
server: https://192.168.21.202:6443
:wq! #保存退出
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=k8s-master02
:wq! #保存退出
vi ~/.kube/config
...
server: https://192.168.21.202:6443
:wq! #保存退出
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: k8s-master02
:wq! #保存退出
12.4 启动服务并设置开机启动
systemctl daemon-reload
systemctl start kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
systemctl enable kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
kubectl get cs #查看集群状态
12.5 批准kubelet证书申请
# 在k8s-master01上查看证书请求
kubectl get csr
node-csr-z9wRarhRp_MoTcUSRzDPSKr7Z-AItfHdLHNNwLcf7A0 6m36s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap <none> Pending
#授权请求
kubectl certificate approve node-csr-z9wRarhRp_MoTcUSRzDPSKr7Z-AItfHdLHNNwLcf7A0
#查看node
kubectl get node
#重复操作12.1-12.5的步骤,把k8s-master03也加入到master集群中
十三、部署HAProxy+Keepalived实现k8s高可用负载均衡
13.1 安装HAProxy
在k8s-master01、k8s-master02、k8s-master03这3台执行
13.1.1下载安装包haproxy-3.0.6.tar.gz到/data/soft目录
wget https://www.haproxy.org/download/3.0/src/haproxy-3.0.6.tar.gz
#安装依赖包
yum -y install make gcc gcc-c++ pcre-devel bzip2-devel openssl-devel systemd-devel zlib-devel lua psmisc
mkdir -p /data/server/haproxy #创建安装目录
groupadd haproxy #添加haproxy组
useradd -g haproxy haproxy -s /bin/false #创建运行账户haproxy并加入到haproxy组,不允许haproxy用户直接登录系统
cd /data/soft
tar zxvf haproxy-3.0.6.tar.gz
cd haproxy-3.0.6
make -j 2 TARGET=linux-glibc PREFIX=/data/server/haproxy
make install PREFIX=/data/server/haproxy
cp haproxy /usr/sbin/
vi /etc/sysctl.conf #配置内核参数
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
:wq! #保存退出,如果之前已经设置过此参数无需重复设置
/sbin/sysctl -p
mkdir -p /var/lib/haproxy #创建目录
chown haproxy:haproxy /var/lib/haproxy -R
mkdir /data/server/haproxy/conf #创建配置文件目录
mkdir /data/server/haproxy/logs #创建日志文件目录
cp examples/option-http_proxy.cfg /data/server/haproxy/conf/haproxy.cfg #拷贝配置文件
#添加开机启动
vi /usr/lib/systemd/system/haproxy.service
[Unit]
Description=HAProxy Load Balancer
After=syslog.target network-online.target
Requires=network-online.target
[Service]
ExecStartPre=/data/server/haproxy/sbin/haproxy -f /data/server/haproxy/conf/haproxy.cfg -c -q
ExecStart=/data/server/haproxy/sbin/haproxy -Ws -f /data/server/haproxy/conf/haproxy.cfg -p /var/run/haproxy.pid
ExecReload=/bin/kill -USR2 $MAINPID
[Install]
WantedBy=multi-user.target
:wq! #保存退出
systemctl daemon-reload
systemctl enable haproxy.service
13.1.2配置haproxy(在k8s-master01、k8s-master02、k8s-master03这3台执行)
mv /data/server/haproxy/conf/haproxy.cfg /data/server/haproxy/conf/haproxy.cfg.bak
# k8s-master01、k8s-master02、k8s-master03这3台配置都一样
vi /data/server/haproxy/conf/haproxy.cfg
global
    log /dev/log local0 warning
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats
defaults
    log global
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
frontend kube-apiserver
    bind *:16443
    mode tcp
    option tcplog
    default_backend kube-apiserver
backend kube-apiserver
    mode tcp
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server kube-apiserver-1 192.168.21.201:6443 check # Replace the IP address with your own.
    server kube-apiserver-2 192.168.21.202:6443 check # Replace the IP address with your own.
    server kube-apiserver-3 192.168.21.203:6443 check # Replace the IP address with your own.
listen stats
    mode http
    bind *:8888
    stats auth admin:123456
    stats refresh 5s
    stats uri /stats
    log 127.0.0.1 local3 err
:wq! #保存退出
#检查配置文件
/data/server/haproxy/sbin/haproxy -f /data/server/haproxy/conf/haproxy.cfg -c
/data/server/haproxy/sbin/haproxy -f /data/server/haproxy/conf/haproxy.cfg -db
systemctl daemon-reload
systemctl restart haproxy.service
浏览器打开http://192.168.21.201:8888/stats
输入账号密码可以查询状态信息
现在haproxy负载均衡已经设置好了
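#此时可以在任意一台master上通过haproxy的16443端口访问kube-apiserver,验证转发是否正常(示例)
curl -k https://192.168.21.201:16443/version
curl -k https://192.168.21.202:16443/version
curl -k https://192.168.21.203:16443/version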
13.2 安装Keepalived
在k8s-master01、k8s-master02、k8s-master03这3台执行
13.2.1下载安装包keepalived-2.3.2.tar.gz到/data/soft目录
mkdir -p /data/soft
wget https://www.keepalived.org/software/keepalived-2.3.2.tar.gz
#安装依赖包
yum -y install openssl-devel popt-devel libnl3-devel libnfnetlink-devel kernel-devel gcc psmisc
mkdir -p /data/server/keepalived #创建安装目录
#编译安装keepalived
cd /data/soft
tar zxvf keepalived-2.3.2.tar.gz
cd keepalived-2.3.2
./configure --prefix=/data/server/keepalived #配置,必须看到以下提示,说明配置正确,才能继续安装
Use IPVS Framework : Yes
Use VRRP Framework : Yes
make #编译
make install #安装
/data/server/keepalived/sbin/keepalived -v #查看版本
#拷贝配置文件
mkdir -p /etc/keepalived
cp /data/server/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived
cp /data/server/keepalived/etc/keepalived/keepalived.conf.sample /etc/keepalived/keepalived.conf
cp /data/server/keepalived/sbin/keepalived /usr/sbin/
mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
#修改启动文件
mv /usr/lib/systemd/system/keepalived.service /usr/lib/systemd/system/keepalived.service.bak
vi /usr/lib/systemd/system/keepalived.service
[Unit]
Description=LVS and VRRP High Availability Monitor
After=syslog.target network.target haproxy.service
Requires=network-online.target haproxy.service
[Service]
Type=forking
PIDFile=/run/keepalived.pid
KillMode=process
EnvironmentFile=-/etc/sysconfig/keepalived
ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
:wq! #保存退出
#参数说明
After=haproxy.service:确保 keepalived 在 haproxy 启动后再启动。
Requires=haproxy.service:确保 keepalived 服务依赖于 haproxy,如果 haproxy 没有启动或失败,则 keepalived 也不会启动。
13.2.2配置Keepalived(分别在k8s-master01、k8s-master02、k8s-master03这3台操作)
k8s-master01配置文件
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
#vrrp_strict #注释掉此行,严格模式不允许使用unicast_peer单播和密码认证
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_script chk_haproxy {
script "killall -0 haproxy" #使用这个命令来检测进程
interval 2
weight -5
}
vrrp_instance VI_1 {
state MASTER #主节点
interface ens160 #绑定vip地址的网卡名称
virtual_router_id 51 #虚拟路由的ID,3个节点要一致
priority 101 #优先级,数字越大优先级越高,取值范围:0-254
advert_int 1
authentication {
auth_type PASS #VRRP验证类型:PASS、AH两种
auth_pass 1111 #VRRP验证密码,在同一个vrrp_instance下,主、从必须使用相同的密码才能正常通信
}
track_script {
chk_haproxy
}
virtual_ipaddress {
192.168.21.200/32 #vip地址
}
unicast_src_ip 192.168.21.201 #本地网卡ens160的IP地址
unicast_peer {
192.168.21.202 #k8s-master02的ip地址
192.168.21.203 #k8s-master03的ip地址
}
}
:wq! #保存退出
k8s-master02配置文件
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
#vrrp_strict #注释掉此行,严格模式不允许使用unicast_peer单播和密码认证
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_script chk_haproxy {
script "killall -0 haproxy" #使用这个命令来检测进程
interval 2
weight -5
}
vrrp_instance VI_1 {
state BACKUP #备节点
interface ens160 #绑定vip地址的网卡名称
virtual_router_id 51 #虚拟路由的ID,3个节点要一致
priority 100 #优先级,数字越大优先级越高,取值范围:0-254
advert_int 1
authentication {
auth_type PASS #VRRP验证类型:PASS、AH两种
auth_pass 1111 #VRRP验证密码,在同一个vrrp_instance下,主、从必须使用相同的密码才能正常通信
}
track_script {
chk_haproxy
}
virtual_ipaddress {
192.168.21.200/32 #vip地址
}
unicast_src_ip 192.168.21.202 #本地网卡ens160的IP地址
unicast_peer {
192.168.21.201 #k8s-master01的ip地址
192.168.21.203 #k8s-master03的ip地址
}
}
:wq! #保存退出
k8s-master03配置文件
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
#vrrp_strict #注释掉此行,严格模式不允许使用unicast_peer单播和密码认证
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_script chk_haproxy {
script "killall -0 haproxy" #使用这个命令来检测进程
interval 2
weight -5
}
vrrp_instance VI_1 {
state BACKUP #备节点
interface ens160 #绑定vip地址的网卡名称
virtual_router_id 51 #虚拟路由的ID,3个节点要一致
priority 99 #优先级,数字越大优先级越高,取值范围:0-254
advert_int 1
authentication {
auth_type PASS #VRRP验证类型:PASS、AH两种
auth_pass 1111 #VRRP验证密码,在同一个vrrp_instance下,主、从必须使用相同的密码才能正常通信
}
track_script {
chk_haproxy
}
virtual_ipaddress {
192.168.21.200/32 #vip地址
}
unicast_src_ip 192.168.21.203 #本地网卡ens160的IP地址
unicast_peer {
192.168.21.201 #k8s-master01的ip地址
192.168.21.202 #k8s-master02的ip地址
}
}
:wq! #保存退出
systemctl daemon-reload #重新加载
systemctl enable keepalived.service #设置开机自动启动
systemctl start keepalived.service #启动
systemctl stop keepalived.service #停止
systemctl restart keepalived.service #重启
systemctl status keepalived.service #查看状态
journalctl -u keepalived -f #查看日志
ip addr #查看ip地址,可以看到vip地址192.168.21.200已经绑定在k8s-master01上面了
这里把haproxy进程的状态当作API Server(kube-apiserver)服务是否正常的判断依据:
只要haproxy进程存在,就认为API Server正常;一旦检测不到haproxy进程,keepalived就会降低本节点优先级,触发高可用切换,vip漂移到其他节点。
要确保keepalived在haproxy启动之后再启动,因为keepalived配置文件里有检测haproxy的脚本,启动时检测不到haproxy会被认为节点故障,直接触发vip漂移。
如果所有节点的haproxy都没有启动,vip在任何节点上都无法稳定绑定,相当于所有master节点都不可用,k8s集群也就无法通过vip访问(更严谨的检测方式可参考下面的脚本示例)。
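如果希望检测逻辑更严谨,可以参考下面的示例脚本,同时检查haproxy进程和本机kube-apiserver的健康状态(脚本路径/etc/keepalived/check_apiserver.sh为假设值;使用时需把keepalived配置中vrrp_script的script改为指向该脚本,并赋予可执行权限):
vi /etc/keepalived/check_apiserver.sh
#!/bin/bash
#haproxy进程不存在则返回1,触发keepalived降低优先级
if ! killall -0 haproxy 2>/dev/null; then
    exit 1
fi
#本机kube-apiserver健康检查失败也返回1
if ! curl -sk --max-time 3 https://127.0.0.1:6443/healthz | grep -q ok; then
    exit 1
fi
exit 0
:wq! #保存退出
chmod +x /etc/keepalived/check_apiserver.sh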
13.3 验证Keepalived+haproxy高可用
关闭k8s-master01上面的haproxy服务systemctl stop haproxy.service
这个时候查看k8s-master01上的ip地址ip addr,发现vip地址192.168.21.200已经没有了
这个时候查看k8s-master02上的ip地址ip addr,发现vip地址192.168.21.200在这台上面
启动k8s-master01上面的haproxy服务systemctl start haproxy.service
这个时候查看k8s-master01上的ip地址(ip addr),因为3台服务器中k8s-master01的keepalived优先级最高,vip地址192.168.21.200又重新漂移回这台上面
这个时候k8s集群的高可用已经配置成功
在K8s集群中任意一个节点,使用curl查看K8s版本测试,使用vip访问:
curl -k https://192.168.21.200:16443/version
kubectl describe nodes k8s-master01 #查看节点信息
13.4 修改所有节点连接vip地址
在k8s-master01、k8s-master02、k8s-master03、k8s-node01、k8s-node02这5台上都执行
所有节点连接的是k8s-master01的ip地址,需要修改为vip地址,实现k8s高可用和负载均衡
cd /opt/kubernetes/cfg/
bootstrap.kubeconfig
kubelet.kubeconfig
kube-proxy.kubeconfig
修改所有节点的3个配置文件(如上所示),由原来192.168.21.201:6443修改为VIP地址192.168.21.200:16443
#执行替换
sed -i 's#192.168.21.201:6443#192.168.21.200:16443#' /opt/kubernetes/cfg/bootstrap.kubeconfig
sed -i 's#192.168.21.201:6443#192.168.21.200:16443#' /opt/kubernetes/cfg/kubelet.kubeconfig
sed -i 's#192.168.21.201:6443#192.168.21.200:16443#' /opt/kubernetes/cfg/kube-proxy.kubeconfig
systemctl restart kubelet kube-proxy #重启服务
检查节点状态:
kubectl get node
kubectl describe nodes
kubectl get pods --all-namespaces -o wide
kubectl describe nodes k8s-master01
十四、设置k8s-master节点可调度与不可调度
在Kubernetes集群中,默认情况下,master节点(通常用于维护集群的控制平面)不会参与工作负载调度,这是为了确保集群的控制平面保持高可用性和稳定性,不会被常规的服务负载影响。
14.1 查看k8s-master节点状态
kubectl describe nodes k8s-master01
kubectl describe nodes k8s-master02
kubectl describe nodes k8s-master03
找到类似:
Labels: node.kubernetes.io/node=
#继续查看节点信息
kubectl describe nodes k8s-master01 |grep Taints:
kubectl describe nodes k8s-master02 |grep Taints:
kubectl describe nodes k8s-master03 |grep Taints:
显示为:
Taints: node.kubernetes.io/node:NoSchedule
表示不参与调度
显示为:Taints: <none>
表示可以调度
14.2 设置k8s-master节点可以调度
#去掉污点,可以调度
kubectl taint nodes k8s-master01 node.kubernetes.io/node-
kubectl taint nodes k8s-master02 node.kubernetes.io/node-
kubectl taint nodes k8s-master03 node.kubernetes.io/node-
#再次查看信息
kubectl describe nodes k8s-master01 |grep Taints:
kubectl describe nodes k8s-master02 |grep Taints:
kubectl describe nodes k8s-master03 |grep Taints:
14.3 设置k8s-master节点不参与调度
#添加污点,不参与调度
kubectl taint node k8s-master01 node.kubernetes.io/node="":NoSchedule
kubectl taint node k8s-master02 node.kubernetes.io/node="":NoSchedule
kubectl taint node k8s-master03 node.kubernetes.io/node="":NoSchedule
至此,使用二进制方式部署k8s高可用集群(V1.31版本)完成。