Troubleshooting an abnormal Kubernetes pod.

1. First, check the pod status

kubectl get pods -o wide 

The command returns:

[root@cluster-master manifests]# kubectl get pods -o wide 
NAME                     READY   STATUS              RESTARTS   AGE     IP                NODE             NOMINATED NODE   READINESS GATES
my-nginx                 1/1     Running             0          2d18h   192.169.166.130   node1            <none>           <none>
my-pod                   0/2     Terminating         0          62m     <none>            node1            <none>           <none>
my-pod-cluster-master    0/2     ContainerCreating   0          63m     <none>            cluster-master   <none>           <none>
my-pod2-cluster-master   0/2     ContainerCreating   0          14m     <none>            cluster-master   <none>           <none>

my-pod is stuck in the Terminating state, and the two pods on cluster-master are stuck in ContainerCreating.
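
On a cluster with many pods it can help to filter the listing down to the ones that are not healthy. The following is a minimal sketch, not part of the original procedure: list_unhealthy_pods is a hypothetical helper name, and it assumes STATUS is the third column, which holds for a single-namespace listing like the one above but not for output produced with -A.

```shell
#!/usr/bin/env bash
# list_unhealthy_pods (hypothetical helper): read `kubectl get pods` output on
# stdin and print "NAME STATUS" for every pod whose STATUS is neither Running
# nor Completed. Assumes STATUS is the third column (no NAMESPACE column).
list_unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Typical use against a live cluster:
#   kubectl get pods -o wide | list_unhealthy_pods
```

Run against the listing above, this would leave only my-pod and the two ContainerCreating pods.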

2. Find the specific cause of the failure

kubectl describe pod my-pod

The command returns:

[root@cluster-master manifests]# kubectl describe pod my-pod
Name:                      my-pod
Namespace:                 default
...
...
Events:
  Type     Reason          Age                   From     Message
  ----     ------          ----                  ----     -------
  Normal   SandboxChanged  24m (x186 over 64m)   kubelet  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedKillPod   4m26s (x86 over 23m)  kubelet  error killing pod: failed to "KillPodSandbox" for "a937a971-070f-4ff4-b547-503d0843efb7" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"my-pod_default\" network: plugin type=\"calico\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized"

The FailedKillPod event gives the specific cause: the calico CNI plugin could not tear down the network for my-pod_default because its request failed with "error getting ClusterInformation: connection is unauthorized: Unauthorized".
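
The describe output is long, and the Warning rows are the ones that matter here. The sketch below pulls them out of the Events table. warning_events is a hypothetical helper name, and the parsing assumes the layout shown above: an "Events:" line followed by rows whose first field is the event type.

```shell
#!/usr/bin/env bash
# warning_events (hypothetical helper): read `kubectl describe pod` output on
# stdin and print only the Warning rows that follow the "Events:" line.
warning_events() {
  awk '/^Events:/ { in_events = 1; next } in_events && $1 == "Warning"'
}

# Typical use against a live cluster:
#   kubectl describe pod my-pod | warning_events
# A roughly equivalent query that lets the API server do the filtering:
#   kubectl get events --field-selector involvedObject.name=my-pod,type=Warning
```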

3. Edit the containerd configuration file

vim /etc/containerd/config.toml

The file has a sandbox_image setting that should point to a registry mirror inside China. Mine is:
sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6"


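If yours already points to a mirror inside China, no change is needed. If you do change it, restart containerd so that the new value takes effect.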
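
To see the current value without opening an editor, something like the following can be used. It is a sketch under assumptions: get_sandbox_image is a hypothetical helper name, and it expects the setting to be written on one line as sandbox_image = "...", as in the example above.

```shell
#!/usr/bin/env bash
# get_sandbox_image (hypothetical helper): print the value of sandbox_image
# from a containerd config file. Defaults to /etc/containerd/config.toml;
# pass another path as the first argument to override.
get_sandbox_image() {
  local config="${1:-/etc/containerd/config.toml}"
  sed -n 's/^[[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$config"
}

# Typical use on a node:
#   get_sandbox_image
# After editing the file (assuming containerd runs under systemd):
#   systemctl restart containerd
```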

4. Restart calico

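My cluster uses calico for communication between nodes and pods, so the next step is to restart the DaemonSet named calico-node in the kube-system namespace.
This is the usual way to apply configuration changes to calico-node or to clear problems like this one.
Recreating the calico-node pods also rewrites the credentials that the calico CNI plugin uses on each node, which is probably what clears the Unauthorized error.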

kubectl rollout restart ds -n kube-system calico-node
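
The restart command returns immediately, before the new calico-node pods are up. A small wrapper can wait for the rollout to finish before moving on to the check in the next step. This is a sketch: restart_calico_node is a hypothetical helper name, and the 300-second timeout is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# restart_calico_node (hypothetical helper): restart the calico-node DaemonSet
# and block until the rollout completes or the timeout expires.
# The 300s timeout is an assumption; adjust it for the size of the cluster.
restart_calico_node() {
  kubectl rollout restart ds -n kube-system calico-node &&
    kubectl rollout status ds -n kube-system calico-node --timeout=300s
}

# Typical use:
#   restart_calico_node
```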

5. Verify

kubectl get pods

The command returns:

[root@cluster-master manifests]# kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
my-nginx                 1/1     Running   0          2d18h
my-pod2-cluster-master   2/2     Running   0          34m
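
The same check can be scripted so that it yields an exit code instead of a table to read. The sketch below compares the two numbers in the READY column. all_pods_ready is a hypothetical helper name, READY is assumed to be the second column, and a finished Job pod (for example 0/1 Completed) would be reported as not ready.

```shell
#!/usr/bin/env bash
# all_pods_ready (hypothetical helper): read `kubectl get pods` output on stdin
# and exit 0 only if every pod's READY column has matching counts (e.g. 2/2).
# Assumes READY is the second column (no NAMESPACE column).
all_pods_ready() {
  awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) bad = 1 } END { exit bad + 0 }'
}

# Typical use against a live cluster:
#   kubectl get pods | all_pods_ready && echo "all pods ready"
```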
