Building a k8s Cluster: Deploying the etcd Cluster
March 28, 2022 · Kubernetes

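etcd is a distributed key-value store, and Kubernetes keeps its cluster data in it, so an etcd database has to be in place first. To avoid a single point of failure, deploy etcd as a cluster. This guide uses 3 machines, which tolerates the failure of 1. etcd elects a leader and needs a majority (quorum) of members to do so. An odd member count is recommended because adding one more member to an odd-sized cluster raises the quorum without raising the number of failures the cluster can survive.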

Notes:

A 5-member cluster tolerates 2 failed machines.

A 7-member cluster tolerates 3 failed machines.

A 9-member cluster tolerates 4 failed machines.
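
The numbers above follow from the majority rule: a cluster of N members needs N/2+1 of them (integer division) to stay available. A minimal sketch of that arithmetic; the function names quorum and fault_tolerance are made up for this illustration:

```shell
# quorum N: smallest majority in an N-member cluster
quorum() { echo $(( $1 / 2 + 1 )); }

# fault_tolerance N: members that may fail while a majority survives
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

# 4 is included to show that an even size tolerates no more failures than 3
for n in 3 4 5 7 9; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```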

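The etcd cluster can also share machines with the k8s nodes, as long as the apiserver can reach it.

This guide deploys the etcd cluster on three servers of its own.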

Prerequisite, covered in "Building a k8s Cluster: Installing the cfssl Certificate Generation Tool":

https://www.osyunwei.com/archives/12072.html

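Work on one of the etcd servers first.

1. Generate the etcd certificates

1.1 Create a self-signed etcd certificate authority (CA)

Create the working directory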

mkdir -p /opt/tls/etcd

cd /opt/tls/etcd

Create the CA configuration file

cat > etcdca-config.json << EOF

{

"signing": {

"default": {

"expiry": "87600h"

},

"profiles": {

"etcd": {

"expiry": "87600h",

"usages": [

"signing",

"key encipherment",

"server auth",

"client auth"

]

}

}

}

}

EOF

Create the CA certificate signing request file

cat > etcdca-csr.json << EOF

{

"CN": "etcdca",

"key": {

"algo": "rsa",

"size": 2048

},

"names": [

{

"C": "CN",

"L": "Beijing",

"ST": "Beijing"

}

]

}

EOF

Generate the CA certificate (etcdca.pem and etcdca-key.pem):

cfssl gencert -initca etcdca-csr.json | cfssljson -bare etcdca -

1.2 Issue the etcd HTTPS certificate with the self-signed CA

Create the certificate request file:

cd /opt/tls/etcd

cat > etcd-csr.json << EOF

{

"CN": "etcd",

"hosts": [

"127.0.0.1",

"192.168.21.201",

"192.168.21.202",

"192.168.21.203",

"k8s-master01",

"k8s-master02",

"k8s-master03"

],

"key": {

"algo": "rsa",

"size": 2048

},

"names": [

{

"C": "CN",

"L": "BeiJing",

"ST": "BeiJing"

}

]

}

EOF

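Note: the hosts field in the file above must list the IP address of every server in the etcd cluster, with none missing. To make later expansion easier, you can also add a few planned IPs in advance.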
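
Because a missing IP only shows up later as a TLS error, it can help to check the request file before signing. A minimal sketch, assuming the file layout shown above; missing_hosts is a hypothetical helper, not part of cfssl:

```shell
# missing_hosts CSR_FILE IP...: print every IP that does not appear as a
# quoted string in CSR_FILE; the exit status is 1 if any IP is missing
missing_hosts() (
  csr_file=$1
  shift
  status=0
  for ip in "$@"; do
    if ! grep -qF "\"$ip\"" "$csr_file"; then
      echo "$ip"
      status=1
    fi
  done
  exit "$status"
)

# Check the three cluster IPs used in this guide (skipped if the file is absent)
if [ -f /opt/tls/etcd/etcd-csr.json ]; then
  missing_hosts /opt/tls/etcd/etcd-csr.json 192.168.21.201 192.168.21.202 192.168.21.203 \
    || echo "add the IPs listed above to the hosts field before signing" >&2
fi
```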

Generate the certificate (etcd.pem and etcd-key.pem):

cfssl gencert -ca=etcdca.pem -ca-key=etcdca-key.pem -config=etcdca-config.json -profile=etcd etcd-csr.json | cfssljson -bare etcd

# The certificate files are now in /opt/tls/etcd and will be used later.

2. Install etcd

Install from the binary release; work on k8s-master01 first.

# Download the binary package etcd-v3.5.16-linux-amd64.tar.gz

https://github.com/etcd-io/etcd/releases/download/v3.5.16/etcd-v3.5.16-linux-amd64.tar.gz

# Create the working directories and unpack the binary package

mkdir /opt/etcd/{bin,cfg,ssl,data,check} -p

tar zxvf etcd-v3.5.16-linux-amd64.tar.gz

mv etcd-v3.5.16-linux-amd64/etcd* /opt/etcd/bin/

# Make the binaries executable

chmod +x /opt/etcd/bin/*

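vi /etc/profile # add the etcd binaries to the system PATH by appending the following line at the end

export PATH=$PATH:/opt/etcd/bin/

:wq! # save and exit

source /etc/profile # apply the change immediately

# Check the version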

etcd --version

3. Create the etcd configuration file

cat > /opt/etcd/cfg/etcd.yaml << EOF

# [Member]

name: "k8s-master01"

data-dir: "/opt/etcd/data/"

wal-dir: "/opt/etcd/data/"

listen-peer-urls: "https://192.168.21.201:2380"

listen-client-urls: "https://192.168.21.201:2379,https://127.0.0.1:2379"

logger: "zap"

# [Clustering]

initial-advertise-peer-urls: "https://192.168.21.201:2380"

advertise-client-urls: "https://192.168.21.201:2379"

initial-cluster: "k8s-master01=https://192.168.21.201:2380,k8s-master02=https://192.168.21.202:2380,k8s-master03=https://192.168.21.203:2380"

initial-cluster-token: "etcd-cluster"

initial-cluster-state: "new"

# [Security]

client-transport-security:

  cert-file: "/opt/etcd/ssl/etcd.pem"

  key-file: "/opt/etcd/ssl/etcd-key.pem"

  client-cert-auth: true

  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"

  auto-tls: true

peer-transport-security:

  key-file: "/opt/etcd/ssl/etcd-key.pem"

  cert-file: "/opt/etcd/ssl/etcd.pem"

  client-cert-auth: true

  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"

  auto-tls: true

EOF

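# Pay close attention to indentation in the YAML file: the keys under client-transport-security and peer-transport-security must be indented.

4. Manage etcd with systemd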

cat > /usr/lib/systemd/system/etcd.service << EOF

[Unit]

Description=Etcd Server

After=network.target

After=network-online.target

Wants=network-online.target

[Service]

User=root

Type=notify

WorkingDirectory=/opt/etcd/data/

Restart=always

#Restart=on-failure

RestartSec=10s

LimitNOFILE=65536

ExecStart=/opt/etcd/bin/etcd --config-file=/opt/etcd/cfg/etcd.yaml

[Install]

WantedBy=multi-user.target

EOF

5. Copy the etcd certificate files

cp /opt/tls/etcd/etcdca.pem /opt/etcd/ssl/

cp /opt/tls/etcd/etcdca-key.pem /opt/etcd/ssl/

cp /opt/tls/etcd/etcd.pem /opt/etcd/ssl/

cp /opt/tls/etcd/etcd-key.pem /opt/etcd/ssl/

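6. Distribute the etcd installation and configuration files

Once the steps on k8s-master01 are complete, distribute the etcd installation and configuration files to every other node in the etcd cluster.

Alternatively, repeat the steps above on each server in the cluster.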

scp -r /opt/etcd/ root@192.168.21.202:/opt/

scp -r /opt/etcd/ root@192.168.21.203:/opt/

scp /usr/lib/systemd/system/etcd.service root@192.168.21.202:/usr/lib/systemd/system/

scp /usr/lib/systemd/system/etcd.service root@192.168.21.203:/usr/lib/systemd/system/
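
With more nodes, the same copies can be driven by a loop. A minimal sketch, assuming the node list and paths from the scp commands above; NODES, DRY_RUN, run and distribute are names invented for this example. By default it only prints the commands; set DRY_RUN=0 to execute them:

```shell
# Target nodes, taken from the scp commands above
NODES="192.168.21.202 192.168.21.203"

# run CMD...: print CMD when DRY_RUN is 1 (the default in this sketch), else execute it
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# distribute: copy the etcd directory and the unit file to every node in NODES
distribute() {
  for ip in $NODES; do
    run scp -r /opt/etcd/ "root@${ip}:/opt/"
    run scp /usr/lib/systemd/system/etcd.service "root@${ip}:/usr/lib/systemd/system/"
  done
}

distribute
```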

# Then, on each of the two servers, change the member name and the IP addresses in etcd.yaml to that server's own values

vi /opt/etcd/cfg/etcd.yaml

# [Member]

name: "k8s-master01" # change to this node's own hostname

data-dir: "/opt/etcd/data/"

wal-dir: "/opt/etcd/data/"

listen-peer-urls: "https://192.168.21.201:2380" # change to this node's own IP address

# change to this node's own IP address

listen-client-urls: "https://192.168.21.201:2379,https://127.0.0.1:2379"

logger: "zap"

# [Clustering]

initial-advertise-peer-urls: "https://192.168.21.201:2380" # change to this node's own IP address

advertise-client-urls: "https://192.168.21.201:2379" # change to this node's own IP address

# The following parameters are identical on all three nodes

initial-cluster: "k8s-master01=https://192.168.21.201:2380,k8s-master02=https://192.168.21.202:2380,k8s-master03=https://192.168.21.203:2380"

initial-cluster-token: "etcd-cluster"

initial-cluster-state: "new"

# [Security]

client-transport-security:

  cert-file: "/opt/etcd/ssl/etcd.pem"

  key-file: "/opt/etcd/ssl/etcd-key.pem"

  client-cert-auth: true

  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"

  auto-tls: true

peer-transport-security:

  key-file: "/opt/etcd/ssl/etcd-key.pem"

  cert-file: "/opt/etcd/ssl/etcd.pem"

  client-cert-auth: true

  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"

  auto-tls: true

:wq! # save and exit
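
Editing the file by hand on every node is easy to get wrong, so one option is to generate it from the member name and IP. A minimal sketch that reuses the values from this guide; render_etcd_yaml is a hypothetical helper, and its output would be redirected to /opt/etcd/cfg/etcd.yaml on the matching node:

```shell
# render_etcd_yaml NAME IP: print an etcd.yaml for one member on stdout
render_etcd_yaml() {
  cat << EOF
# [Member]
name: "$1"
data-dir: "/opt/etcd/data/"
wal-dir: "/opt/etcd/data/"
listen-peer-urls: "https://$2:2380"
listen-client-urls: "https://$2:2379,https://127.0.0.1:2379"
logger: "zap"
# [Clustering]
initial-advertise-peer-urls: "https://$2:2380"
advertise-client-urls: "https://$2:2379"
initial-cluster: "k8s-master01=https://192.168.21.201:2380,k8s-master02=https://192.168.21.202:2380,k8s-master03=https://192.168.21.203:2380"
initial-cluster-token: "etcd-cluster"
initial-cluster-state: "new"
# [Security]
client-transport-security:
  cert-file: "/opt/etcd/ssl/etcd.pem"
  key-file: "/opt/etcd/ssl/etcd-key.pem"
  client-cert-auth: true
  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"
  auto-tls: true
peer-transport-security:
  key-file: "/opt/etcd/ssl/etcd-key.pem"
  cert-file: "/opt/etcd/ssl/etcd.pem"
  client-cert-auth: true
  trusted-ca-file: "/opt/etcd/ssl/etcdca.pem"
  auto-tls: true
EOF
}

# Example: print the configuration for the second node
render_etcd_yaml k8s-master02 192.168.21.202
```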

# On both servers, add the etcd binaries to the system PATH

vi /etc/profile # append the following line at the end

export PATH=$PATH:/opt/etcd/bin/

:wq! # save and exit

source /etc/profile # apply the change immediately

7. Start etcd and enable it at boot

Start etcd on all three servers at the same time

systemctl daemon-reload

systemctl enable etcd

systemctl start etcd

If something goes wrong, check the logs and the service status first:

journalctl -u etcd

systemctl status etcd

Then troubleshoot based on what the logs report.
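
A new member is usually not ready until enough peers are up to form a quorum, so a health check run immediately after starting may fail at first. A minimal sketch of a retry wrapper around the health check; retry is a hypothetical helper, and the 30 attempts and 2-second delay are arbitrary assumptions:

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, trying at most ATTEMPTS
# times and sleeping DELAY seconds between attempts; status 1 if every attempt fails
retry() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && exit 0
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$(( i + 1 ))
  done
  exit 1
)

# Wait for the local member to report healthy (skipped where etcdctl is not installed)
ETCDCTL_BIN=/opt/etcd/bin/etcdctl
if [ -x "$ETCDCTL_BIN" ]; then
  export ETCDCTL_API=3
  retry 30 2 "$ETCDCTL_BIN" --cacert=/opt/etcd/ssl/etcdca.pem \
    --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem \
    --endpoints=https://127.0.0.1:2379 endpoint health
fi
```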

8. Check the cluster status

ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379" --write-out=table endpoint health

+-----------------------------+--------+-------------+-------+
|          ENDPOINT           | HEALTH |    TOOK     | ERROR |
+-----------------------------+--------+-------------+-------+
| https://192.168.21.203:2379 |   true | 20.816075ms |       |
| https://192.168.21.202:2379 |   true | 22.193996ms |       |
| https://192.168.21.201:2379 |   true | 21.069051ms |       |
+-----------------------------+--------+-------------+-------+
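
For monitoring, the table output can be reduced to just the endpoints that are not healthy. A minimal sketch that parses the table format shown above with awk; unhealthy_endpoints is a hypothetical helper, and the "false" row in the sample input is invented for illustration:

```shell
# unhealthy_endpoints: read `endpoint health --write-out=table` output on stdin
# and print each endpoint whose HEALTH column is not "true"
unhealthy_endpoints() {
  awk -F'|' '$2 ~ /https?:\/\// {
    endpoint = $2; health = $3
    gsub(/ /, "", endpoint); gsub(/ /, "", health)
    if (health != "true") print endpoint
  }'
}

# Sample input; the failing row is made up for this example.
# Prints: https://192.168.21.203:2379
unhealthy_endpoints << 'EOF'
+-----------------------------+--------+-------------+---------------------------+
|          ENDPOINT           | HEALTH |    TOOK     |           ERROR           |
+-----------------------------+--------+-------------+---------------------------+
| https://192.168.21.203:2379 |  false |          5s | context deadline exceeded |
| https://192.168.21.202:2379 |   true | 22.193996ms |                           |
| https://192.168.21.201:2379 |   true | 21.069051ms |                           |
+-----------------------------+--------+-------------+---------------------------+
EOF
```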

# Create a helper script for routine cluster checks

vi /opt/etcd/check/check_etcd.sh

#!/bin/bash

# Base command and connection parameters

ETCDCTL="/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --write-out=table --endpoints=https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"

# Use the etcd v3 API

export ETCDCTL_API=3

# Run the operation selected by the first argument

case "$1" in

"health")

echo "Checking cluster endpoint health..."

$ETCDCTL endpoint health

;;

"status")

echo "Listing all endpoint statuses..."

$ETCDCTL endpoint status

;;

"list")

echo "Listing all cluster members..."

$ETCDCTL member list

;;

*)

echo "Usage: $0 {health|status|list}"

echo "Please specify a valid command."

exit 1

;;

esac

:wq! # save and exit

chmod +x /opt/etcd/check/check_etcd.sh # make the script executable

sh /opt/etcd/check/check_etcd.sh health

sh /opt/etcd/check/check_etcd.sh status

sh /opt/etcd/check/check_etcd.sh list

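9. Simulate a cluster node failure

Simulate the failure of one node, 192.168.21.203. After the fault is repaired, the node must rejoin the cluster as a new member.

systemctl stop etcd # stop the etcd service (run on 192.168.21.203)

9.1 Check the cluster health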

/opt/etcd/bin/etcdctl endpoint health -w table --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"

/opt/etcd/bin/etcdctl member list -w table --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"

# Note the ID of the failed node
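
To avoid copying the ID by hand, it can be extracted from the default comma-separated output of member list (that is, without -w table). A minimal sketch; member_id_by_name is a hypothetical helper, and the first ID in the sample input is a placeholder:

```shell
# member_id_by_name NAME: read `etcdctl member list` (default format) on stdin
# and print the ID of the member called NAME
member_id_by_name() {
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# Sample input; only the second ID comes from this guide.
# Prints: 83045a3c3a751464
member_id_by_name k8s-master03 << 'EOF'
aaaaaaaaaaaaaaaa, started, k8s-master01, https://192.168.21.201:2380, https://192.168.21.201:2379, false
83045a3c3a751464, started, k8s-master03, https://192.168.21.203:2380, https://192.168.21.203:2379, false
EOF
```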

9.2 Remove the failed node

/opt/etcd/bin/etcdctl member remove 83045a3c3a751464 --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"

9.3 Clear the data directory of the failed node (run on the failed node)

rm -rf /opt/etcd/data/*

9.4 Add the node back to the cluster

Use the member add command to re-add the failed node

/opt/etcd/bin/etcdctl member add k8s-master03 --peer-urls=https://192.168.21.203:2380 --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"

9.5 On the failed node, set initial-cluster-state: "existing" in the configuration file and start etcd

vi /opt/etcd/cfg/etcd.yaml

initial-cluster-state: "existing"

:wq! # save and exit

systemctl start etcd # start the service

9.6 Check the cluster status

/opt/etcd/bin/etcdctl endpoint status --cacert=/opt/etcd/ssl/etcdca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://192.168.21.201:2379,https://192.168.21.202:2379,https://192.168.21.203:2379"

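9.7 Update the configuration on all cluster nodes

When the cluster is first created, initial-cluster-state is set to "new". Once the cluster is running normally, set initial-cluster-state to "existing" on every node, so that a restarted node joins the existing cluster instead of trying to bootstrap a new one, which would leave the cluster inconsistent. etcd only consults the initial-cluster settings when a member starts without existing data, so this mainly protects a node whose data directory has been wiped.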

vi /opt/etcd/cfg/etcd.yaml

initial-cluster-state: "existing"

:wq! # save and exit

systemctl daemon-reload

systemctl restart etcd
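
The same edit can be scripted instead of made in vi on each node. A minimal sketch, assuming the key sits at the start of a line as in the configuration above; set_cluster_state is a hypothetical helper:

```shell
# set_cluster_state FILE STATE: rewrite the initial-cluster-state line in FILE,
# keeping the file's owner and permissions by writing back into the same file
set_cluster_state() (
  file=$1
  state=$2
  tmp=$(mktemp) || exit 1
  sed "s/^initial-cluster-state:.*/initial-cluster-state: \"$state\"/" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Usage on each node, followed by a restart of etcd:
#   set_cluster_state /opt/etcd/cfg/etcd.yaml existing
```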

This completes the etcd cluster deployment for the k8s cluster.

     
» When reposting, please credit the source: 系统运维 » Building a k8s Cluster: Deploying the etcd Cluster
