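Overview
For a production environment, a single master node is too risky, so a highly available cluster is well worth building. High availability here refers mainly to the control plane, that is, the kube-apiserver, etcd, kube-controller-manager, and kube-scheduler components. kube-controller-manager and kube-scheduler achieve high availability on their own inside the Kubernetes cluster, while apiserver and etcd need a highly available setup built by hand. There are many possible architectures, such as the typical haproxy + keepalived design, or nginx acting as a proxy. To show how to upgrade a single-master cluster to a highly available one, we use the comparatively simpler nginx approach. It has some drawbacks, but it is enough to illustrate how high availability is achieved.
As the architecture diagram shows, the full design installs nginx and keepalived on all control nodes to proxy the apiserver. Here I prepared two nodes as control-plane nodes (production normally uses at least three): master and master1. I assume that Docker is already installed and configured on every node and that node initialization is done. (Because machine resources are limited, only one nginx host is used here and keepalived is not installed.)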
Hostname | IP
master | 192.168.116.20
master1 | 192.168.116.21
nginx | 192.168.116.25
node1 | 192.168.116.30
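Procedure
Update the certificate
Since we are turning the cluster into a highly available one, a load balancer will sit in front of the APIServer, and access to the APIServer through that load balancer has to work. The APIServer certificate from the default installation therefore needs to be updated: it does not contain the addresses we need, so some extra names must be added to its SAN list.
First we need a kubeadm configuration file. If you installed the cluster with a configuration file in the first place, you can update that file directly. If you did not, and installed the cluster with a plain kubeadm init, you can build one from the cluster itself, because kubeadm writes its configuration to a ConfigMap named kubeadm-config in the kube-system namespace. The following command exports that configuration into a file named kubeadm.yaml: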
[root@master ~]# kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > kubeadm.yaml
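The generated kubeadm.yaml does not list any extra SANs, so we need to add a certSANs list under the apiServer section. If you started the cluster with a kubeadm configuration file, it may already contain a certSANs list; if not, add it. Here we add a new domain name, api.k8s.local, the two host names master and master1, and the IP addresses 192.168.116.20, 192.168.116.21, and 192.168.116.25. Multiple IPs can be listed; 192.168.116.25 is the address of the nginx load balancer (it would be the virtual IP in a keepalived setup). The data to add under apiServer looks like this: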
[root@master ~]# cat kubeadm.yaml
apiServer:
  certSANs:
  - api.k8s.local
  - master
  - master1
  - 192.168.116.20
  - 192.168.116.21
  - 192.168.116.25
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.16.9
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
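Only the certSANs entries newly added under apiServer are listed above. They are added on top of the standard SAN list, so there is no need to add kubernetes, kubernetes.default and so on here; those are already part of the standard SAN list.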
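Before regenerating the certificate, the edited file can be scanned for the names you expect to see. The sketch below is a minimal example under stated assumptions: missing_sans is a hypothetical helper (not part of kubeadm), and it only looks for matching "- value" list items anywhere in the file rather than parsing the YAML.

```shell
#!/bin/sh
# missing_sans FILE SAN... : print every SAN that does not appear as a
# "- <value>" list item in FILE. Hypothetical helper, not part of kubeadm.
# Simplification: any list item in the file counts, not only certSANs items.
missing_sans() {
  file=$1
  shift
  for san in "$@"; do
    # Strip indentation, then require an exact whole-line match (-F -x),
    # so "master" does not match "master1".
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$file" \
      | grep -Fxq -- "- $san" || printf '%s\n' "$san"
  done
}

# Usage:
#   missing_sans kubeadm.yaml api.k8s.local master master1 192.168.116.25
```

An empty result means every name passed on the command line was found in the file.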
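With the kubeadm configuration file updated, we can renew the certificate. First move the existing APIServer certificate and key out of the way, because if kubeadm finds them at the expected location it will not create new ones.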
# Back up the existing certificate and key
[root@master ~]# mv /etc/kubernetes/pki/apiserver.{crt,key} .
# Generate a new certificate
[root@master ~]# kubeadm init phase certs apiserver --config kubeadm.yaml
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local api.k8s.local master master1] and IPs [10.96.0.1 192.168.116.20 192.168.116.20 192.168.116.21 192.168.116.25]
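The output of the command above shows the DNS names and IP addresses the APIServer certificate is signed for. Compare it carefully with the names you intended; if anything is missing, add it to certSANs above and generate the certificate again.
The command uses the kubeadm configuration file given above to generate a new certificate and key for the APIServer. Because that file contains a certSANs list, kubeadm adds those SANs automatically when it creates the new certificate.
The last step is to restart the APIServer so that it picks up the new certificate. The simplest way is to restart the APIServer container directly: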
[root@master ~]# docker restart `docker ps | grep kube-apiserver | grep -v pause | awk '{print $1}'`
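Verify the certificate
To verify that the certificate was updated, edit the APIServer address in the kubeconfig file, replace it with one of the newly added IP addresses or host names, then use kubectl against the cluster and check that it works normally.
We can also use the openssl command to check whether the generated certificate contains the newly added SAN entries: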
[root@master ~]# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text
...
DNS:master, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:api.k8s.local, DNS:master, DNS:master1, IP Address:10.96.0.1, IP Address:192.168.116.20, IP Address:192.168.116.20, IP Address:192.168.116.21, IP Address:192.168.116.25
...
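Reading the long SAN line by eye is error-prone, so the check can be scripted. The following is a minimal sketch, not an official tool: list_sans and has_san are hypothetical helper names, and they assume the textual format that openssl x509 -text prints for SAN entries (DNS:... and IP Address:...).

```shell
#!/bin/sh
# list_sans: read `openssl x509 -noout -text` output on stdin and print
# each SAN entry (DNS:... or IP Address:...) on its own line.
list_sans() {
  grep -oE '(DNS|IP Address):[^,[:space:]]+'
}

# has_san ENTRY: succeed if stdin contains exactly that SAN entry,
# for example has_san 'IP Address:192.168.116.25'.
has_san() {
  list_sans | grep -Fxq -- "$1"
}

# Usage on the control-plane node:
#   openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text \
#     | has_san 'DNS:api.k8s.local' && echo "SAN present"
```

Matching whole entries with grep -Fx avoids false positives such as DNS:master matching a query for DNS:master1.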
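If everything above went well, the last step is to save the cluster configuration back into the kubeadm-config ConfigMap in the cluster. This is very important: when kubeadm is used on the cluster later, the data is not lost. For example, an upgrade will still sign the certificate with the entries in certSANs.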
[root@master ~]# kubeadm config upload from-file --config kubeadm.yaml
Command "from-file" is deprecated, please see kubeadm init phase upload-config
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
# If the command above fails, edit the ConfigMap directly and add the required content
[root@master ~]# kubectl -n kube-system edit configmap kubeadm-config
After saving the configuration with the command above, verify that it was stored with the following command:
[root@master ~]# kubectl -n kube-system get configmap kubeadm-config -o yaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      certSANs:
      - api.k8s.local
      - master
      - master1
      - 192.168.116.20
      - 192.168.116.21
      - 192.168.116.25
...
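Updating the names in the APIServer certificate is needed in many situations, for example when putting a load balancer in front of the control plane, or when adding new DNS names or IP addresses for the control-plane endpoint, so it is well worth knowing how to update the cluster certificate.
Deploy nginx
As a container cluster system, Kubernetes gives Pods self-healing through health checks plus restart policies, spreads Pods across nodes through its scheduling algorithm, maintains the desired replica count, and starts Pods on other Nodes automatically when a Node fails. Together these provide high availability at the application layer.
For the Kubernetes cluster itself, high availability also has to cover two more layers: the etcd database and the Kubernetes master components. A cluster built with kubeadm starts only one etcd instance, which is a single point of failure, so etcd has to be extended as well; in this walkthrough the new control-plane node joins as an additional stacked etcd member.
The master node acts as the control center. It keeps the whole cluster healthy by communicating continuously with the kubelet and kube-proxy on the worker nodes. If the master node fails, no cluster management is possible through kubectl or the API.
The master node runs three main services: kube-apiserver, kube-controller-manager, and kube-scheduler. kube-controller-manager and kube-scheduler are already highly available through their built-in leader election, so master high availability is mainly about kube-apiserver. That component serves an HTTP API, so making it highly available is much like doing so for a web server: put a load balancer in front of it, and it can also be scaled horizontally.
kube-apiserver high-availability architecture diagram:
- Nginx is a mainstream web server and reverse proxy; here it load-balances the apiserver at layer 4.
- Keepalived is a mainstream high-availability tool that provides active/standby failover for a pair of servers by binding a VIP. In the topology above, Keepalived decides whether to fail over (move the VIP) based on the running state of Nginx. For example, when the primary Nginx node goes down, the VIP is bound to the standby Nginx node automatically, so the VIP stays reachable and Nginx is highly available.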
# To save machines, only one nginx host is used here
[root@nginx ~]# yum -y install gcc make pcre pcre-devel zlib-devel openssl-devel wget
[root@nginx ~]# wget http://nginx.org/download/nginx-1.20.1.tar.gz
[root@nginx ~]# tar -xvf nginx-1.20.1.tar.gz
[root@nginx ~]# groupadd nginx
[root@nginx ~]# useradd -g nginx -s /sbin/nologin nginx
[root@nginx ~]# cd nginx-1.20.1
[root@nginx nginx-1.20.1]# ./configure --prefix=/usr/local/nginx --user=nginx --group=nginx --with-http_stub_status_module --with-http_ssl_module --with-http_realip_module --with-http_sub_module --with-http_flv_module --with-http_mp4_module --with-http_random_index_module --with-stream
[root@nginx nginx-1.20.1]# make && make install
# Configuration
[root@nginx ~]# egrep -v '^#|^$' /usr/local/nginx/conf/nginx.conf
user nginx;
worker_processes auto;
error_log /usr/local/nginx/logs/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events {
    worker_connections 1024;
}
stream {
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log /usr/local/nginx/logs/k8s-access.log main;
    upstream k8s-apiserver {
        server 192.168.116.20:6443; # master APISERVER IP:PORT
        server 192.168.116.21:6443; # master1 APISERVER IP:PORT
    }
    server {
        listen 6443;
        proxy_pass k8s-apiserver;
    }
}
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /usr/local/nginx/logs/access.log main;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 4096;
    include /usr/local/nginx/conf/mime.types;
    default_type application/octet-stream;
    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;
    server {
        listen 80;
        listen [::]:80;
        server_name _;
        root /usr/share/nginx/html;
        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;
        error_page 404 /404.html;
        location = /404.html {
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }
    }
}
# Start nginx
[root@nginx nginx-1.20.1]# /usr/local/nginx/sbin/nginx
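When more control-plane nodes are added later, the upstream block is the only part of the stream section that changes. As an illustration, the sketch below prints that block from a list of addresses; render_upstream is a hypothetical helper, and only the upstream name k8s-apiserver and port 6443 come from the configuration above.

```shell
#!/bin/sh
# render_upstream PORT IP... : print an nginx `upstream` block with one
# `server` line per apiserver address. Hypothetical helper; the upstream
# name k8s-apiserver matches the stream configuration shown above.
render_upstream() {
  port=$1
  shift
  printf 'upstream k8s-apiserver {\n'
  for ip in "$@"; do
    printf '    server %s:%s;\n' "$ip" "$port"
  done
  printf '}\n'
}

# Usage:
#   render_upstream 6443 192.168.116.20 192.168.116.21
```

Parameters such as max_fails and fail_timeout can be appended to each server line to tune how quickly nginx stops sending traffic to an apiserver that is down.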
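Update the configuration
Once nginx is running, the load-balanced apiserver address becomes https://192.168.116.25:6443. Next, replace the apiserver address in the kubeconfig files with the load balancer address.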
kubelet.conf
[root@master ~]# cat /etc/kubernetes/kubelet.conf
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
[root@master ~]# systemctl restart kubelet
controller-manager.conf
[root@master ~]# cat /etc/kubernetes/controller-manager.conf
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
[root@master ~]# docker restart `docker ps | grep kube-controller-manager | grep -v pause | awk '{print $1}'`
scheduler.conf
[root@master ~]# cat /etc/kubernetes/scheduler.conf
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
[root@master ~]# docker restart `docker ps | grep kube-scheduler | grep -v pause | awk '{print $1}'`
Update kube-proxy
[root@master ~]# kubectl edit cm kube-proxy -n kube-system
...
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://192.168.116.25:6443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
...
The ~/.kube/config file that kubectl uses to access the cluster also needs the same change.
[root@master ~]# cat .kube/config
...
    server: https://192.168.116.25:6443
  name: kubernetes
...
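The same one-line change is made in kubelet.conf, controller-manager.conf, scheduler.conf, and ~/.kube/config, so it can be scripted instead of edited by hand. The sketch below is one possible way to do it; set_server is a hypothetical helper, and it assumes each file contains plain server: lines like the ones shown above.

```shell
#!/bin/sh
# set_server URL FILE... : rewrite every `server:` line in the given
# kubeconfig files to point at URL, keeping the original indentation.
# A .bak copy of each file is left next to it. Hypothetical helper.
set_server() {
  url=$1
  shift
  for f in "$@"; do
    cp -p "$f" "$f.bak" || return 1
    # '#' is the sed delimiter because the URL itself contains '/'.
    sed -E "s#^([[:space:]]*server:[[:space:]]*).*#\\1$url#" "$f.bak" > "$f" || return 1
  done
}

# Usage on the master node:
#   set_server https://192.168.116.25:6443 \
#     /etc/kubernetes/kubelet.conf \
#     /etc/kubernetes/controller-manager.conf \
#     /etc/kubernetes/scheduler.conf \
#     ~/.kube/config
```

The kubelet and the controller-manager and scheduler containers still have to be restarted afterwards, as in the steps above.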
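Update the control plane
Now that a load balancer sits in front of the control plane, the kubeadm-config ConfigMap has to be updated with the correct information. (A control-plane node will be added to the cluster shortly, so it is important that this ConfigMap is correct.)
First, open the ConfigMap with the following command and add the controlPlaneEndpoint field: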
[root@master ~]# kubectl -n kube-system edit configmap kubeadm-config
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      certSANs:
      - api.k8s.local
      - master
      - master1
      - 192.168.116.20
      - 192.168.116.21
      - 192.168.116.25
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: 192.168.116.25:6443 # configuration to add
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: registry.aliyuncs.com/google_containers
    kind: ClusterConfiguration
    kubernetesVersion: v1.16.9
    networking:
      dnsDomain: cluster.local
      podSubnet: 10.244.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
  ClusterStatus: |
    apiEndpoints:
      master:
        advertiseAddress: 192.168.116.20
        bindPort: 6443
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
kind: ConfigMap
metadata:
  creationTimestamp: "2020-09-29T15:35:35Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "148258"
  selfLink: /api/v1/namespaces/kube-system/configmaps/kubeadm-config
  uid: 7c04d48f-af5b-4dab-ad74-eb2c06c2510b
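Next, update the cluster-info ConfigMap in the kube-public namespace. It contains a kubeconfig whose server: line points at a single control-plane node. Use kubectl -n kube-public edit cm cluster-info to change that server: line so that it points at the control-plane load balancer.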
[root@master ~]# kubectl edit cm cluster-info -n kube-public
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
kubeconfig: |
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01Ea3lPVEUxTXpVeE5Gb1hEVE13TURreU56RTFNelV4TkZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS0ZyCjVJWmV1ZVVyZlJxUmhXQzU0Q1o1ZTczUDl2MElXQ1YvWHd2c0FhWXo4SmtVSFNjbHZoNWZCMGRFbkhFNzR6ZXQKcWYyamNmeWNWdHlyZlJCQ1lsczlWMUkydzMvcHRvbExlUUVwYm9xbFZ3akQvYUVhZk9vdStrZ3loZjc0QVZ6Mwpic2ttQ09rc1h3TDFYaGVpUzh0VU1jSEZ0OURqU1VsOVN6eXVKbWhRajBHU29GbGg3NVAvSEo1VTVXdlJBenBYCk5sQ24rYzhxMkduK3BJWG9SK1hOQWF0TFRyZzBSYXBJeDE4US81cUJtQk4zaXJ6ZnN3WUd0cFBWTTZsMmtGSGgKeFpsS0NJRjJxaHpwenRLN1BsL3htb2dZaWR4bTVTS0dSWjJia3RFRUsreitYekd5cGJjc3QyZUl5S3hDMTBBWQpRdmpGaENjd2pMdXFtdzVqWTVzQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFIbSt5d2hrWXc1RlhlRFlhVWoySDhqY1kvNlMKeWhmejZuRUw4bENjV2w3TjM2ck1vc3ZWTTBocCs3Wnp6K2RkS1R4a1hjclZ1OURsVE9uME1mQ2ZiZi9ZOVNBbwpsNlhOQUxTcFBmSE5XdkVRQTFEbUR4MVVWdVgrZXA0OTRiNEszVkVpQ0pJN3pBZDVpdHRVc0dWaWZ4ZzBxT2dPCjQvYTVpTHViTlR3a1BsN01adU4yYW1QYXBUNEtkb25wVWNYTTk2eVRUNEF2bkRUM1ZJRGI3eHA3Yld2UmRIcUkKUmdwYlBldGY3TmdobWUvK3hKbzkzU2tmUE5seVNFNmhxQkJFMGNwVGlJMzdlT1ZzQzBDZmRQZ00rMjhidW56WAptZkV5U2JHVHhNMFRGNitCSmxOMCt6YjZhYjgvVTIyWXlFWjlLcXhjcWFKR1g0V3NnY1hFaVR0dEY5cz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
server: https://192.168.116.25:6443
name: ""
contexts: []
current-context: ""
kind: Config
preferences: {}
users: []
kind: ConfigMap
metadata:
creationTimestamp: "2020-09-29T15:35:36Z"
name: cluster-info
namespace: kube-public
resourceVersion: "124903"
selfLink: /api/v1/namespaces/kube-public/configmaps/cluster-info
uid: f5b74218-84d7-4de1-ba0e-1ef3cabbe65d
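After the update, cluster-info shows the load balancer address.
Add a control-plane node
Next we add the extra control-plane node. First upload the cluster certificates to the cluster with the following command so that other control-plane nodes can use them: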
[root@master ~]# kubeadm init phase upload-certs --upload-certs
W0824 16:44:59.056367 32171 version.go:101] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
W0824 16:44:59.103275 32171 version.go:102] falling back to the local client version: v1.16.9
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
450964c0a914353ca1c9e5d0bc2ac27395db2f39abf2381d1bd1d80fb4b362f4
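The command above generates a new certificate key, which is valid for only 2 hours. Because the existing cluster has been running for a while, the original bootstrap token has also expired (a token lives for 24 hours by default), so a new token is needed as well to add the new control-plane node: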
[root@master ~]# kubeadm token create --print-join-command --config kubeadm.yaml
kubeadm join 192.168.116.25:6443 --token r69sza.bszdok47uwbidske --discovery-token-ca-cert-hash sha256:8fee3d9ed90a4496bafeadfbcea7b33f4b1af0c019a0d053cb64f10b7976e3f3
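The join command printed by kubeadm token create is the worker-node form. To join a control-plane node, the --control-plane flag and the --certificate-key value from the upload-certs step are appended to it. The sketch below assembles that command; control_plane_join is a hypothetical helper, and the 64-character hex check is an assumption based on the key shown in the output above.

```shell
#!/bin/sh
# control_plane_join JOIN_CMD CERT_KEY : print the control-plane variant of
# a worker join command by appending --control-plane and --certificate-key.
# Hypothetical helper, not part of kubeadm.
control_plane_join() {
  join_cmd=$1
  cert_key=$2
  # Assumption: the key is 64 lowercase hex characters, as in the output above.
  case $cert_key in
    *[!0-9a-f]*) echo "certificate key is not lowercase hex" >&2; return 1 ;;
  esac
  [ "${#cert_key}" -eq 64 ] || { echo "certificate key must be 64 characters" >&2; return 1; }
  printf '%s --control-plane --certificate-key %s\n' "$join_cmd" "$cert_key"
}

# Usage on the existing master:
#   join_cmd=$(kubeadm token create --print-join-command --config kubeadm.yaml)
#   control_plane_join "$join_cmd" <certificate key printed by upload-certs>
```

Because the certificate key expires after 2 hours, the upload-certs step has to be repeated if the join is attempted later than that.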
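Run the join command on master1, the node to be added, appending --control-plane and --certificate-key with the certificate key generated above: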
[root@master1 ~]# kubeadm join 192.168.116.25:6443 --token r69sza.bszdok47uwbidske --discovery-token-ca-cert-hash sha256:8fee3d9ed90a4496bafeadfbcea7b33f4b1af0c019a0d053cb64f10b7976e3f3 --control-plane --certificate-key 450964c0a914353ca1c9e5d0bc2ac27395db2f39abf2381d1bd1d80fb4b362f4
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.13. Latest validated version: 18.09
[WARNING Hostname]: hostname "master1" could not be reached
[WARNING Hostname]: hostname "master1": lookup master1 on 114.114.114.114:53: no such host
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local api.k8s.local master master1] and IPs [10.96.0.1 192.168.116.21 192.168.116.25 192.168.116.20 192.168.116.21 192.168.116.25]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master1 localhost] and IPs [192.168.116.21 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master1 localhost] and IPs [192.168.116.21 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2021-08-24T16:52:50.553+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.116.21:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node master1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
Check etcd
[root@master1 ~]# cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
...
    - --initial-cluster=master1=https://192.168.116.21:2380,master=https://192.168.116.20:2380
    - --initial-cluster-state=existing
...
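The --initial-cluster flag packs all members into one comma-separated value, which gets hard to read as the cluster grows. As a small illustration, the sketch below splits it into one member per line; etcd_members is a hypothetical helper that only parses the manifest text and does not talk to etcd.

```shell
#!/bin/sh
# etcd_members: read an etcd static Pod manifest on stdin and print one
# "name url" pair per member listed in the --initial-cluster flag.
# The --initial-cluster-state line is ignored because the pattern
# requires "=" directly after "--initial-cluster".
etcd_members() {
  sed -n 's/^.*--initial-cluster=//p' | tr ',' '\n' | sed 's/=/ /'
}

# Usage on a control-plane node:
#   etcd_members < /etc/kubernetes/manifests/etcd.yaml
```

For the manifest shown above this lists master1 and master with their peer URLs on port 2380.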
Check that the cluster is healthy
[root@master ~]# kubectl get nodes
NAME      STATUS   ROLES    AGE     VERSION
master    Ready    master   328d    v1.16.9
master1   Ready    master   8m10s   v1.16.9
node1     Ready    <none>   328d    v1.16.9
[root@master ~]# kubectl get cs
NAME                 AGE
scheduler            <unknown>
controller-manager   <unknown>
etcd-0               <unknown>
# Cause: this is a version-related kubectl bug; Kubernetes intends to deprecate the get cs command
# Workaround: it does not affect the running cluster; add -o yaml to see the status
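The node check above can also be turned into a pass/fail test, for example in a post-change script. The following is a minimal sketch: not_ready_nodes is a hypothetical helper that reads the default kubectl get nodes table and treats any STATUS other than exactly Ready (including Ready,SchedulingDisabled) as a problem.

```shell
#!/bin/sh
# not_ready_nodes: read `kubectl get nodes` output on stdin and print the
# name of every node whose STATUS column is not exactly "Ready".
# The header row (first line) is skipped.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage:
#   bad=$(kubectl get nodes | not_ready_nodes)
#   [ -z "$bad" ] && echo "all nodes Ready" || echo "not ready: $bad"
```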