[Troubleshooting] Setting up a highly available multi-master K8S cluster on CentOS with kubeadm

Installation steps

Errors while deploying the Kubernetes master

When deploying the Kubernetes master, I created a kubeadm-config.yaml file to hold the relevant configuration. The file looks like this:

apiServer:
  certSANs:
    - master1
    - master2
    - master.k8s.io
    - 192.168.44.158   # virtual IP
    - 192.168.44.155   # master1's IP
    - 192.168.44.156   # master2's IP
    - 127.0.0.1
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "master.k8s.io:16443"
controllerManager: {}
dns: 
  type: CoreDNS
etcd:
  local:    
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.16.3
networking: 
  dnsDomain: cluster.local  
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.1.0.0/16
scheduler: {}
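
For comparison, the defaults that the installed kubeadm would generate (including the apiVersion and kubernetesVersion it expects) can be printed with a standard kubeadm subcommand; this was not part of the original run, but it is a quick sanity check against the file above:

kubeadm config print init-defaults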

Error message

At first I had only changed some of the IP addresses, then ran the following directly on the master node holding the VIP:

kubeadm init --config kubeadm-config.yaml

W0118 22:48:08.547424 13575 common.go:77] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta1". Please use 'kubeadm config migrate --old-config old.yaml --new-config new.yaml', which will write the new, similar spec using a newer API version.
this version of kubeadm only supports deploying clusters with the control plane version >= 1.17.0. Current version: v1.16.3
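
The second line means the kubeadm binary installed on the node is newer than the v1.16.3 requested in the config file. The installed version can be confirmed with kubeadm's standard version flag:

kubeadm version -o short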


Solution

Change v1beta1 in the yaml file to v1beta2.
Change v1.16.3 in the yaml file to the Kubernetes version actually installed, which in my case was v1.18.0.
(I first changed it to 1.18.0, then remembered that the single-master setup from a couple of days earlier had used 1.20.1, so I changed it again to 1.20.1.)
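
After the edits, the relevant lines of kubeadm-config.yaml look roughly like this (assuming the final choice of v1.20.1):

apiVersion: kubeadm.k8s.io/v1beta2
kubernetesVersion: v1.20.1

Alternatively, the deprecation warning above already names kubeadm's own migration helper, which rewrites the file against the newer API version (the output file name here is just a placeholder):

kubeadm config migrate --old-config kubeadm-config.yaml --new-config kubeadm-config-new.yaml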

Run again:

kubeadm reset && kubeadm init --config kubeadm-config.yaml

Error message:

[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
[ERROR Swap]: running with swap on is not supported. Please disable swap

For the first error: give the virtual machine two CPU cores.
For the second error: add the --ignore-preflight-errors=Swap flag (or disable swap outright, as shown below).
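
Instead of ignoring the Swap check, swap can also be turned off for good (standard CentOS commands; not what I did at this point):

swapoff -a                               # disable swap immediately
sed -ri 's/.*swap.*/#&/' /etc/fstab      # comment out the swap entry so it stays off after a reboot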

Run once more:

kubeadm reset && kubeadm init --config kubeadm-config.yaml --ignore-preflight-errors=Swap

Error message:

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.

[root@master1 manifests]# kubeadm reset && kubeadm init --config kubeadm-config.yaml --ignore-preflight-errors=Swap
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
W0118 23:28:22.444442   17486 reset.go:99] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get https://master.k8s.io:16443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s: dial tcp: lookup master.k8s.io on 192.168.11.2:53: no such host
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0118 23:28:23.804847   17486 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
W0118 23:28:23.942917   17501 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.20.1
[preflight] Running pre-flight checks
	[WARNING KubernetesVersion]: Kubernetes version is greater than kubeadm version. Please consider to upgrade kubeadm. Kubernetes version: 1.20.1. Kubeadm version: 1.18.x
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING Swap]: running with swap on is not supported. Please disable swap
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0118 23:28:24.336280   17501 images.go:81] could not find officially supported version of etcd for Kubernetes v1.20.1, falling back to the nearest etcd version (3.4.3-0)
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master.k8s.io master1 master2 master.k8s.io] and IPs [10.1.0.1 192.168.11.168 192.168.11.167 192.168.11.168 192.168.11.169 127.0.0.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master1 localhost] and IPs [192.168.11.168 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master1 localhost] and IPs [192.168.11.168 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0118 23:29:13.392179   17501 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0118 23:29:13.400616   17501 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0118 23:29:13.402251   17501 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
W0118 23:29:13.403529   17501 images.go:81] could not find officially supported version of etcd for Kubernetes v1.20.1, falling back to the nearest etcd version (3.4.3-0)
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.

	Unfortunately, an error has occurred:
		timed out waiting for the condition

	This error is likely caused by:
		- The kubelet is not running
		- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

	If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
		- 'systemctl status kubelet'
		- 'journalctl -xeu kubelet'

	Additionally, a control plane component may have crashed or exited when started by the container runtime.
	To troubleshoot, list all containers using your preferred container runtimes CLI.

	Here is one example how you may list all Kubernetes containers running in docker:
		- 'docker ps -a | grep kube | grep -v pause'
		Once you have found the failing container, you can inspect its logs with:
		- 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
[root@master1 manifests]# 

Attempt 1

[root@master1 kubelet.service.d]# cd /etc/systemd/system/kubelet.service.d/
bash: cd: /etc/systemd/system/kubelet.service.d/: No such file or directory
[root@master1 kubelet.service.d]# 
[root@master1 kubelet.service.d]# 
[root@master1 kubelet.service.d]# mkdir -p /etc/systemd/system/kubelet.service.d/
[root@master1 kubelet.service.d]# cd /etc/systemd/system/kubelet.service.d/
[root@master1 kubelet.service.d]# ll
total 0
[root@master1 kubelet.service.d]# 
[root@master1 kubelet.service.d]# cd /usr/lib/systemd/system/kubelet.service.d
[root@master1 kubelet.service.d]# cp 10-kubeadm.conf /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[root@master1 kubelet.service.d]# systemctl daemon-reload
[root@master1 kubelet.service.d]# systemctl restart kubelet

No luck; the same error appears again.
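
To double-check whether the copied drop-in was actually picked up, systemd can print the kubelet unit together with every drop-in it loads (a standard systemctl check; I did not run this at the time):

systemctl cat kubelet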

Attempt 2

I read that the kubelet's status might be the problem.
Checking the status showed:
Jan 18 23:47:55 master1 systemd[1]: Unit kubelet.service entered failed state.
Jan 18 23:47:55 master1 systemd[1]: kubelet.service failed.
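
For reference, this status comes from the checks kubeadm itself suggests in its output:

systemctl status kubelet
journalctl -xeu kubelet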

Run:

swapoff -a

The kubelet status then looked fine, but kubeadm init still failed, and after the run the kubelet status fell back into the failed state from before.
However, after rerunning:

[root@master1 kubelet.service.d]# systemctl daemon-reload

the kubelet status was normal again.

Attempt 3: modify /etc/hosts and kubeadm-config.yaml

The hosts file before the change:
[screenshot: the original /etc/hosts]
The hosts file after the change:
[screenshot: the modified /etc/hosts]
The yaml file after the change:
[screenshot: the modified kubeadm-config.yaml]
I suspected this was the cause because the kubelet status output said that master1 could not be found, which pointed to the hosts file.
I had also been a little unsure about the certSANs values in kubeadm-config.yaml; master.k8s.io in particular looked odd, and when I went back to the hosts file it was indeed not configured there (see the sketch below).
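
Since the screenshots are not reproduced here: the essential change is making master.k8s.io (and the master hostnames) resolvable on the node. A sketch of the /etc/hosts entries, using the sample addresses from the kubeadm-config.yaml above, with master.k8s.io pointing at the virtual IP; substitute the real addresses of your environment:

192.168.44.158   master.k8s.io   # virtual IP
192.168.44.155   master1
192.168.44.156   master2

Name resolution can then be verified before rerunning kubeadm, for example with getent hosts master.k8s.io.
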
After making these changes, run again:

[root@master1 ningan]# cd /usr/local/kubernetes/manifests/
[root@master1 manifests]# kubeadm reset && kubeadm init --config kubeadm-config.yaml --ignore-preflight-errors=Swap

[screenshot: output of kubeadm init after the changes]
