OCP Operators (4) User Tasks: Creating an etcd Cluster with an Operator

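Contents

  • Environment
  • Installing an Operator in a namespace
    • Prerequisites
    • Installing from OperatorHub using the web console
    • Uninstalling
  • Installing from OperatorHub using the CLI
  • Creating applications from installed Operators
    • Creating an etcd cluster using an Operator
    • Error
      • Debugging from the web console
      • Debugging from the command line
      • Analysis
  • References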

Environment

  • RHEL 9.3
  • Red Hat OpenShift Local 2.32

Installing an Operator in a namespace

Prerequisites

Open the web console:

$ crc console
Opening the OpenShift Web Console in the default browser...

The browser opens automatically and loads the web console:

[Screenshot 1]

Note: if you do not want the browser to open automatically, add the --url option:

$ crc console --url
https://console-openshift-console.apps-crc.testing

Then copy the URL manually and open it in a browser.

View the usernames and passwords:

$ crc console --credentials
To login as a regular user, run 'oc login -u developer -p developer https://api.crc.testing:6443'.
To login as an admin, run 'oc login -u kubeadmin -p cWwas-FvXBW-rTjsi-eECwX https://api.crc.testing:6443'

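Log in to the web console as developer:

[Screenshot 2]

Switch to the “Administrator” perspective. Under “Operators” there is only “Installed Operators” (no Operator is installed yet).

This is because the developer user does not have permission to install Operators; the permission would have to be granted first.

To keep things simple, I will skip granting it. Log out, then log in with the kubeadmin account:

[Screenshot 3]

Now there is an additional “OperatorHub” submenu under “Operators”, and “Installed Operators” shows “Package Server”.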

Installing from OperatorHub using the web console

Click “OperatorHub” and search for “etcd”:

[Screenshot 4]

Click “etcd” in the search results. A dialog pops up:

[Screenshot 5]

Click the “Install” button in the upper-left corner. The following options appear:

  • Update channel: defaults to singlenamespace-alpha
  • Version: defaults to 0.9.4
  • Installation mode: defaults to A specific namespace on the cluster
  • Installed namespace: select my-etcd, a project (namespace) I created myself; create it first if it does not exist yet
  • Update approval: defaults to Automatic

[Screenshot 6]

Finally, click the “Install” button to install the Operator.

After a minute or two the installation completes:

[Screenshot 7]

Back in “Installed Operators”, the etcd Operator now appears under the my-etcd project:

[Screenshot 8]

Click it to view the details:

[Screenshot 9]

Note: on the command line, you can view the corresponding CSV:

$ oc get csv -n my-etcd
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded
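
If you would rather wait for the installation from a script than watch the console, polling the CSV phase is one option. The following is a minimal sketch, not a tested procedure: the helper names csv_phase and wait_for_csv are hypothetical, and it assumes a logged-in oc session and that the phase is exposed at .status.phase, as the PHASE column above suggests.

```shell
# csv_phase NAME NAMESPACE: print the phase of a ClusterServiceVersion.
# (Hypothetical helper; assumes the phase lives at .status.phase.)
csv_phase() {
  oc get csv "$1" -n "$2" -o jsonpath='{.status.phase}'
}

# wait_for_csv NAME NAMESPACE [TRIES]: poll every 5 seconds until the
# phase is Succeeded, giving up after TRIES attempts (default 30).
wait_for_csv() {
  tries=${3:-30}
  while [ "$tries" -gt 0 ]; do
    if [ "$(csv_phase "$1" "$2")" = "Succeeded" ]; then
      echo "CSV $1 is ready"
      return 0
    fi
    tries=$(( tries - 1 ))
    sleep 5
  done
  echo "CSV $1 did not reach Succeeded" >&2
  return 1
}

# Usage (requires a logged-in oc session):
#   wait_for_csv etcdoperator.v0.9.4 my-etcd
```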

Uninstalling

Click the three dots on the right and select “Uninstall Operator”:

[Screenshot 10]

Installing from OperatorHub using the CLI

Check the current identity:

$ oc whoami
kubeadmin

List the Operators available to the cluster from OperatorHub:

$ oc get packagemanifests -n openshift-marketplace
NAME                                               CATALOG               AGE
forklift-operator                                  Community Operators   34d
debezium-operator                                  Community Operators   34d
pcc-operator                                       Certified Operators   34d
......
etcd                                               Community Operators   34d
......

Inspect the desired Operator to verify its supported install modes and available channels:

$ oc describe packagemanifests etcd -n openshift-marketplace
Name:         etcd
Namespace:    openshift-marketplace
Labels:       catalog=community-operators
              catalog-namespace=openshift-marketplace
              operatorframework.io/arch.amd64=supported
              operatorframework.io/os.linux=supported
              provider=CNCF
              provider-url=
Annotations:  
API Version:  packages.operators.coreos.com/v1
Kind:         PackageManifest
Metadata:
  Creation Timestamp:  2024-01-10T10:34:13Z
Spec:
Status:
  Catalog Source:               community-operators
  Catalog Source Display Name:  Community Operators
  Catalog Source Namespace:     openshift-marketplace
  Catalog Source Publisher:     Red Hat
  Channels:
    Current CSV:  etcdoperator.v0.6.1
    Current CSV Desc:
      Annotations:
        Capabilities:           Full Lifecycle
        Description:            etcd is a distributed key value store providing a reliable way to store data across a cluster of machines.
        Tectonic - Visibility:  ocs
      Apiservicedefinitions:
      Customresourcedefinitions:
        Owned:
          Description:   Represents a cluster of etcd nodes.
          Display Name:  etcd Cluster
          Kind:          EtcdCluster
          Name:          etcdclusters.etcd.database.coreos.com
          Version:       v1beta2
      Description:       etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines. It’s open-source and available on GitHub. etcd gracefully handles leader elections during network partitions and will tolerate machine failure, including the leader. Your applications can read and write data into etcd.
A simple use-case is to store database connection details or feature flags within etcd as key value pairs. These values can be watched, allowing your app to reconfigure itself when they change. Advanced uses take advantage of the consistency guarantees to implement database leader elections or do distributed locking across a cluster of workers.

_The etcd Open Cloud Service is Public Alpha. The goal before Beta is to fully implement backup features._

### Reading and writing to etcd

Communicate with etcd though its command line utility `etcdctl` or with the API using the automatically generated Kubernetes Service.

[Read the complete guide to using the etcd Open Cloud Service](https://coreos.com/tectonic/docs/latest/alm/etcd-ocs.html)

### Supported Features
**High availability**
Multiple instances of etcd are networked together and secured. Individual failures or networking issues are transparently handled to keep your cluster up and running.
**Automated updates**
Rolling out a new etcd version works like all Kubernetes rolling updates. Simply declare the desired version, and the etcd service starts a safe rolling update to the new version automatically.
**Backups included**
Coming soon, the ability to schedule backups to happen on or off cluster.

      Display Name:  etcd
      Install Modes:
        Supported:  true
        Type:       OwnNamespace
        Supported:  true
        Type:       SingleNamespace
        Supported:  false
        Type:       MultiNamespace
        Supported:  true
        Type:       AllNamespaces
      Keywords:
        etcd
        key value
        database
        coreos
        open source
      Links:
        Name:  Blog
        URL:   https://coreos.com/etcd
        Name:  Documentation
        URL:   https://coreos.com/operators/etcd/docs/latest/
        Name:  etcd Operator Source Code
        URL:   https://github.com/coreos/etcd-operator
      Maintainers:
        Email:   support@coreos.com
        Name:    CoreOS, Inc
      Maturity:  alpha
      Provider:
        Name:  CoreOS, Inc
      Related Images:
        quay.io/coreos/etcd-operator@sha256:bd944a211eaf8f31da5e6d69e8541e7cada8f16a9f7a5a570b22478997819943
      Version:  0.6.1
    Entries:
      Name:       etcdoperator.v0.6.1
      Version:    0.6.1
    Name:         alpha
    Current CSV:  etcdoperator.v0.9.4-clusterwide
    Current CSV Desc:
      Annotations:
        Alm - Examples:  [
  {
    "apiVersion": "etcd.database.coreos.com/v1beta2",
    "kind": "EtcdCluster",
    "metadata": {
      "name": "example",
      "annotations": {
        "etcd.database.coreos.com/scope": "clusterwide"
      }
    },
    "spec": {
      "size": 3,
      "version": "3.2.13"
    }
  },
  {
    "apiVersion": "etcd.database.coreos.com/v1beta2",
    "kind": "EtcdRestore",
    "metadata": {
      "name": "example-etcd-cluster-restore"
    },
    "spec": {
      "etcdCluster": {
        "name": "example-etcd-cluster"
      },
      "backupStorageType": "S3",
      "s3": {
        "path": "",
        "awsSecret": ""
      }
    }
  },
  {
    "apiVersion": "etcd.database.coreos.com/v1beta2",
    "kind": "EtcdBackup",
    "metadata": {
      "name": "example-etcd-cluster-backup"
    },
    "spec": {
      "etcdEndpoints": [""],
      "storageType":"S3",
      "s3": {
        "path": "",
        "awsSecret": ""
      }
    }
  }
]

        Capabilities:           Full Lifecycle
        Categories:             Database
        Container Image:        quay.io/coreos/etcd-operator@sha256:66a37fd61a06a43969854ee6d3e21087a98b93838e284a6086b13917f96b0d9b
        Created At:             2019-02-28 01:03:00
        Description:            Create and maintain highly-available etcd clusters on Kubernetes
        Repository:             https://github.com/coreos/etcd-operator
        Tectonic - Visibility:  ocs
      Apiservicedefinitions:
      Customresourcedefinitions:
        Owned:
          Description:   Represents a cluster of etcd nodes.
          Display Name:  etcd Cluster
          Kind:          EtcdCluster
          Name:          etcdclusters.etcd.database.coreos.com
          Version:       v1beta2
          Description:   Represents the intent to backup an etcd cluster.
          Display Name:  etcd Backup
          Kind:          EtcdBackup
          Name:          etcdbackups.etcd.database.coreos.com
          Version:       v1beta2
          Description:   Represents the intent to restore an etcd cluster from a backup.
          Display Name:  etcd Restore
          Kind:          EtcdRestore
          Name:          etcdrestores.etcd.database.coreos.com
          Version:       v1beta2
      Description:       The etcd Operater creates and maintains highly-available etcd clusters on Kubernetes, allowing engineers to easily deploy and manage etcd clusters for their applications.

etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines. It’s open-source and available on GitHub. etcd gracefully handles leader elections during network partitions and will tolerate machine failure, including the leader.


### Reading and writing to etcd

Communicate with etcd though its command line utility `etcdctl` via port forwarding:

    $ kubectl --namespace default port-forward service/example-client 2379:2379
    $ etcdctl --endpoints http://127.0.0.1:2379 get /

Or directly to the API using the automatically generated Kubernetes Service:

    $ etcdctl --endpoints http://example-client.default.svc:2379 get /

Be sure to secure your etcd cluster (see Common Configurations) before exposing it outside of the namespace or cluster.


### Supported Features

* **High availability** - Multiple instances of etcd are networked together and secured. Individual failures or networking issues are transparently handled to keep your cluster up and running.

* **Automated updates** - Rolling out a new etcd version works like all Kubernetes rolling updates. Simply declare the desired version, and the etcd service starts a safe rolling update to the new version automatically.

* **Backups included** - Create etcd backups and restore them through the etcd Operator.

### Common Configurations

* **Configure TLS** - Specify [static TLS certs](https://github.com/coreos/etcd-operator/blob/master/doc/user/cluster_tls.md) as Kubernetes secrets.

* **Set Node Selector and Affinity** - [Spread your etcd Pods](https://github.com/coreos/etcd-operator/blob/master/doc/user/spec_examples.md#three-member-cluster-with-node-selector-and-anti-affinity-across-nodes) across Nodes and availability zones.

* **Set Resource Limits** - [Set the Kubernetes limit and request](https://github.com/coreos/etcd-operator/blob/master/doc/user/spec_examples.md#three-member-cluster-with-resource-requirement) values for your etcd Pods.

* **Customize Storage** - [Set a custom StorageClass](https://github.com/coreos/etcd-operator/blob/master/doc/user/spec_examples.md#custom-persistentvolumeclaim-definition) that you would like to use.

      Display Name:  etcd
      Install Modes:
        Supported:  true
        Type:       OwnNamespace
        Supported:  false
        Type:       SingleNamespace
        Supported:  false
        Type:       MultiNamespace
        Supported:  true
        Type:       AllNamespaces
      Keywords:
        etcd
        key value
        database
        coreos
        open source
      Links:
        Name:  Blog
        URL:   https://coreos.com/etcd
        Name:  Documentation
        URL:   https://coreos.com/operators/etcd/docs/latest/
        Name:  etcd Operator Source Code
        URL:   https://github.com/coreos/etcd-operator
      Maintainers:
        Email:   etcd-dev@googlegroups.com
        Name:    etcd Community
      Maturity:  alpha
      Provider:
        Name:  CNCF
      Related Images:
        quay.io/coreos/etcd-operator@sha256:66a37fd61a06a43969854ee6d3e21087a98b93838e284a6086b13917f96b0d9b
      Version:  0.9.4-clusterwide
    Entries:
      Name:       etcdoperator.v0.9.4-clusterwide
      Version:    0.9.4-clusterwide
      Name:       etcdoperator.v0.9.2-clusterwide
      Version:    0.9.2-clusterwide
      Name:       etcdoperator.v0.9.0
      Version:    0.9.0
    Name:         clusterwide-alpha
    Current CSV:  etcdoperator.v0.9.4
    Current CSV Desc:
      Annotations:
        Alm - Examples:  [
  {
    "apiVersion": "etcd.database.coreos.com/v1beta2",
    "kind": "EtcdCluster",
    "metadata": {
      "name": "example"
    },
    "spec": {
      "size": 3,
      "version": "3.2.13"
    }
  },
  {
    "apiVersion": "etcd.database.coreos.com/v1beta2",
    "kind": "EtcdRestore",
    "metadata": {
      "name": "example-etcd-cluster-restore"
    },
    "spec": {
      "etcdCluster": {
        "name": "example-etcd-cluster"
      },
      "backupStorageType": "S3",
      "s3": {
        "path": "",
        "awsSecret": ""
      }
    }
  },
  {
    "apiVersion": "etcd.database.coreos.com/v1beta2",
    "kind": "EtcdBackup",
    "metadata": {
      "name": "example-etcd-cluster-backup"
    },
    "spec": {
      "etcdEndpoints": [""],
      "storageType":"S3",
      "s3": {
        "path": "",
        "awsSecret": ""
      }
    }
  }
]

        Capabilities:           Full Lifecycle
        Categories:             Database
        Container Image:        quay.io/coreos/etcd-operator@sha256:66a37fd61a06a43969854ee6d3e21087a98b93838e284a6086b13917f96b0d9b
        Created At:             2019-02-28 01:03:00
        Description:            Create and maintain highly-available etcd clusters on Kubernetes
        Repository:             https://github.com/coreos/etcd-operator
        Tectonic - Visibility:  ocs
      Apiservicedefinitions:
      Customresourcedefinitions:
        Owned:
          Description:   Represents a cluster of etcd nodes.
          Display Name:  etcd Cluster
          Kind:          EtcdCluster
          Name:          etcdclusters.etcd.database.coreos.com
          Version:       v1beta2
          Description:   Represents the intent to backup an etcd cluster.
          Display Name:  etcd Backup
          Kind:          EtcdBackup
          Name:          etcdbackups.etcd.database.coreos.com
          Version:       v1beta2
          Description:   Represents the intent to restore an etcd cluster from a backup.
          Display Name:  etcd Restore
          Kind:          EtcdRestore
          Name:          etcdrestores.etcd.database.coreos.com
          Version:       v1beta2
      Description:       The etcd Operater creates and maintains highly-available etcd clusters on Kubernetes, allowing engineers to easily deploy and manage etcd clusters for their applications.

etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines. It’s open-source and available on GitHub. etcd gracefully handles leader elections during network partitions and will tolerate machine failure, including the leader.


### Reading and writing to etcd

Communicate with etcd though its command line utility `etcdctl` via port forwarding:

    $ kubectl --namespace default port-forward service/example-client 2379:2379
    $ etcdctl --endpoints http://127.0.0.1:2379 get /

Or directly to the API using the automatically generated Kubernetes Service:

    $ etcdctl --endpoints http://example-client.default.svc:2379 get /

Be sure to secure your etcd cluster (see Common Configurations) before exposing it outside of the namespace or cluster.


### Supported Features

* **High availability** - Multiple instances of etcd are networked together and secured. Individual failures or networking issues are transparently handled to keep your cluster up and running.

* **Automated updates** - Rolling out a new etcd version works like all Kubernetes rolling updates. Simply declare the desired version, and the etcd service starts a safe rolling update to the new version automatically.

* **Backups included** - Create etcd backups and restore them through the etcd Operator.

### Common Configurations

* **Configure TLS** - Specify [static TLS certs](https://github.com/coreos/etcd-operator/blob/master/doc/user/cluster_tls.md) as Kubernetes secrets.

* **Set Node Selector and Affinity** - [Spread your etcd Pods](https://github.com/coreos/etcd-operator/blob/master/doc/user/spec_examples.md#three-member-cluster-with-node-selector-and-anti-affinity-across-nodes) across Nodes and availability zones.

* **Set Resource Limits** - [Set the Kubernetes limit and request](https://github.com/coreos/etcd-operator/blob/master/doc/user/spec_examples.md#three-member-cluster-with-resource-requirement) values for your etcd Pods.

* **Customize Storage** - [Set a custom StorageClass](https://github.com/coreos/etcd-operator/blob/master/doc/user/spec_examples.md#custom-persistentvolumeclaim-definition) that you would like to use.

      Display Name:  etcd
      Install Modes:
        Supported:  true
        Type:       OwnNamespace
        Supported:  true
        Type:       SingleNamespace
        Supported:  false
        Type:       MultiNamespace
        Supported:  false
        Type:       AllNamespaces
      Keywords:
        etcd
        key value
        database
        coreos
        open source
      Links:
        Name:  Blog
        URL:   https://coreos.com/etcd
        Name:  Documentation
        URL:   https://coreos.com/operators/etcd/docs/latest/
        Name:  etcd Operator Source Code
        URL:   https://github.com/coreos/etcd-operator
      Maintainers:
        Email:   etcd-dev@googlegroups.com
        Name:    etcd Community
      Maturity:  alpha
      Provider:
        Name:  CNCF
      Related Images:
        quay.io/coreos/etcd-operator@sha256:66a37fd61a06a43969854ee6d3e21087a98b93838e284a6086b13917f96b0d9b
      Version:  0.9.4
    Entries:
      Name:         etcdoperator.v0.9.4
      Version:      0.9.4
      Name:         etcdoperator.v0.9.2
      Version:      0.9.2
      Name:         etcdoperator.v0.9.0
      Version:      0.9.0
    Name:           singlenamespace-alpha
  Default Channel:  singlenamespace-alpha
  Package Name:     etcd
  Provider:
    Name:  CNCF
Events:    
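
The describe output is long. As a rough sketch, a jsonpath query can pull out just the channel names with their current CSVs, plus the default channel. The helper names list_channels and default_channel are hypothetical, and the field paths (.status.channels, .name, .currentCSV, .status.defaultChannel) are my reading of the PackageManifest output above, so treat them as assumptions.

```shell
# list_channels PACKAGE: print "channel<TAB>current CSV", one per line.
# (Hypothetical helper; field paths inferred from the describe output.)
list_channels() {
  oc get packagemanifest "$1" -n openshift-marketplace \
    -o jsonpath='{range .status.channels[*]}{.name}{"\t"}{.currentCSV}{"\n"}{end}'
}

# default_channel PACKAGE: print the package's default channel.
default_channel() {
  oc get packagemanifest "$1" -n openshift-marketplace \
    -o jsonpath='{.status.defaultChannel}'
}

# Usage (requires a logged-in oc session):
#   list_channels etcd
#   default_channel etcd
```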

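An Operator group, defined by an OperatorGroup object, selects target namespaces in which to generate the required RBAC access for all Operators in the same namespace as the Operator group.

The namespace to which you subscribe the Operator must have an Operator group that matches the install mode of the Operator (AllNamespaces or SingleNamespace mode). If the Operator is to be installed in AllNamespaces mode, the openshift-operators namespace already has an appropriate Operator group.

However, if the Operator uses SingleNamespace mode and there is no appropriate Operator group yet, you must create one.

Note: when SingleNamespace mode was selected in the web console earlier, the OperatorGroup and Subscription objects were created automatically in the background.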

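Create an OperatorGroup object YAML file, for example operatorgroup.yaml:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: <operatorgroup_name>
  namespace: <namespace>
spec:
  targetNamespaces:
  - <namespace>

The placeholders take the following values in this example:

  • <operatorgroup_name>: my-operatorgroup
  • <namespace>: my-etcd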

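Note: OLM creates the following cluster roles for each Operator group:

  • <operatorgroup_name>-admin
  • <operatorgroup_name>-edit
  • <operatorgroup_name>-view

When you create an Operator group manually, you must specify a unique name that does not conflict with existing cluster roles or other Operator groups.

Create the OperatorGroup object:

$ oc apply -f operatorgroup.yaml

Create a Subscription object YAML file to subscribe a namespace to the Operator, for example sub.yaml: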

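apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: <subscription_name>
  # namespace: openshift-operators # 1
  namespace: my-etcd # 1
spec:
  channel: <channel_name> # 2
  name: <operator_name> # 3
  source: redhat-operators # 4
  sourceNamespace: openshift-marketplace # 5
  config:
    env: # 6
    - name: ARGS
      value: "-v=10"
    envFrom: # 7
    - secretRef:
        name: license-secret
    volumes: # 8
    - name: <volume_name>
      configMap:
        name: <configmap_name>
    volumeMounts: # 9
    - mountPath: <directory_name>
      name: <volume_name>
    tolerations: # 10
    - operator: "Exists"
    resources: # 11
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    nodeSelector: # 12
      foo: bar

  1. For the default AllNamespaces install mode, specify the openshift-operators namespace. You can also specify a custom global namespace, if you have created one. Otherwise, for SingleNamespace install mode, specify the relevant single namespace.

In this example, the my-etcd namespace is specified.

The placeholders take the following values in this example:

  • <subscription_name>: etcd
  • <channel_name>: singlenamespace-alpha
  • <operator_name>: etcd
  • <volume_name>: my-volume
  • <configmap_name>: my-configmap
  • <directory_name>: my-directory

Note that the oc describe packagemanifests output above reports community-operators as the catalog source for etcd, so source would presumably need to be community-operators rather than redhat-operators.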

If the cluster is in STS mode, include the following fields in the Subscription object:

kind: Subscription
# ...
spec:
  installPlanApproval: Manual # 1
  config:
    env:
    - name: ROLEARN
      value: "" # 2

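Create the Subscription object:

$ oc apply -f sub.yaml

At this point, OLM is aware of the selected Operator. A CSV for the Operator should appear in the target namespace, and the APIs provided by the Operator should be available for creation.

Note: the documentation is not very clear here. This example should not need the ConfigMap, PVC, or PV; this is probably just a template. I did not actually test it.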
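
Following the note above, a pared-down Subscription for this example might look like the sketch below. It is untested, like the template itself. The values come from earlier in this post: the package name etcd, the singlenamespace-alpha channel, the my-etcd namespace, and the community-operators catalog source reported by oc describe packagemanifests. The file name sub-minimal.yaml is made up for illustration.

```shell
# Write a minimal Subscription manifest without the optional config section.
# All values are taken from earlier in this post; the result is untested.
cat > sub-minimal.yaml <<'EOF'
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: my-etcd
spec:
  channel: singlenamespace-alpha
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
EOF

# Then, with a logged-in oc session:
#   oc apply -f sub-minimal.yaml
#   oc get csv -n my-etcd
```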

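Creating applications from installed Operators

Creating an etcd cluster using an Operator

With the etcd Operator installed as described above, click etcd in the web console to view its details.

Under Provided APIs, the Operator provides three new resource types: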

  • etcd Cluster
  • etcd Backup
  • etcd Restore

These objects work similarly to the built-in native Kubernetes objects (such as Deployment or ReplicaSet), but contain logic specific to managing etcd.

[Screenshot 11]

Click “Create instance” under “etcd Cluster”:

[Screenshot 12]

Click the “Create” button in the lower-left corner:

[Screenshot 13]
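
The console form submits an EtcdCluster object. Presumably the same instance can be created from the command line; the sketch below writes a manifest based on the EtcdCluster entry in the Operator's alm-examples annotation shown earlier. I am assuming the my-etcd namespace used in this post, and the file name etcdcluster.yaml is arbitrary.

```shell
# Write an EtcdCluster manifest mirroring the Operator's alm-examples entry.
# The namespace is an assumption based on where the Operator was installed.
cat > etcdcluster.yaml <<'EOF'
apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: example
  namespace: my-etcd
spec:
  size: 3
  version: "3.2.13"
EOF

# Then, with a logged-in oc session:
#   oc apply -f etcdcluster.yaml
#   oc get etcdclusters -n my-etcd
```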

Error

Click “EC example” (EC stands for EtcdCluster), then open the Resources tab:

[Screenshot 14]

The pod is in the Pending state.

(Note: the screenshot above shows a clusterwide Operator because I had reinstalled the Operator, but the problem is the same.)

Debugging from the web console

Click the “example-bcqztbd6l6” pod, then open the “Logs” tab:

[Screenshot 15]

The “etcd” container has no log at all (because it is in the “waiting” state). Switch to the “check-dns” container to see its log:

[Screenshot 16]

The same log lines are produced over and over:

......
nslookup: can't resolve 'example-bcqztbd6l6.example.my-etcd.svc'
Server: 10.217.4.10
Address 1: 10.217.4.10 dns-default.openshift-dns.svc.cluster.local
......

Debugging from the command line

Inspect the pod on the command line:

$ oc describe pod example-bcqztbd6l6 -n my-etcd
Name:             example-bcqztbd6l6
Namespace:        my-etcd
......
Init Containers:
  check-dns:
    Container ID:  cri-o://8e4f03cfea06f682d877e6122ebd84f4b6f8ae75f87ba0fd3ebae1fabd36ebbe
    Image:         busybox:1.28.0-glibc
    Image ID:      docker.io/library/busybox@sha256:0b55a30394294ab23b9afd58fab94e61a923f5834fba7ddbae7f8e0c11ba85e6
    Port:          
    Host Port:     
    Command:
      /bin/sh
      -c
      
                TIMEOUT_READY=0
                while ( ! nslookup example-bcqztbd6l6.example.my-etcd.svc )
                do
                  # If TIMEOUT_READY is 0 we should never time out and exit 
                  TIMEOUT_READY=$(( TIMEOUT_READY-1 ))
                              if [ $TIMEOUT_READY -eq 0 ];
                                  then
                                      echo "Timed out waiting for DNS entry"
                                      exit 1
                                  fi
                              sleep 1
                            done
    State:          Running
      Started:      Wed, 14 Feb 2024 17:49:40 +0800
    Ready:          False
    Restart Count:  0
    Environment:    
    Mounts:         
Containers:
  etcd:
    Container ID:  
    Image:         quay.io/coreos/etcd:v3.2.13
    Image ID:      
    Ports:         2380/TCP, 2379/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /usr/local/bin/etcd
      --data-dir=/var/etcd/data
      --name=example-bcqztbd6l6
      --initial-advertise-peer-urls=http://example-bcqztbd6l6.example.my-etcd.svc:2380
      --listen-peer-urls=http://0.0.0.0:2380
      --listen-client-urls=http://0.0.0.0:2379
      --advertise-client-urls=http://example-bcqztbd6l6.example.my-etcd.svc:2379
      --initial-cluster=example-bcqztbd6l6=http://example-bcqztbd6l6.example.my-etcd.svc:2380
      --initial-cluster-state=new
      --initial-cluster-token=a8f3fa08-f114-4c12-95b7-60e14eea400c
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       exec [/bin/sh -ec ETCDCTL_API=3 etcdctl endpoint status] delay=10s timeout=10s period=60s #success=1 #failure=3
    Readiness:      exec [/bin/sh -ec ETCDCTL_API=3 etcdctl endpoint status] delay=1s timeout=5s period=5s #success=1 #failure=3
    Environment:    
    Mounts:
      /var/etcd from etcd-data (rw)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
......

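As shown under Init Containers:, the Command is a script containing an endless loop: as long as nslookup example-bcqztbd6l6.example.my-etcd.svc fails (returns a nonzero exit status), the loop keeps running. Because TIMEOUT_READY starts at 0 and is decremented before it is compared with 0, the timeout branch is never reached.

View the log on the command line: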
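
To see why the timeout branch is never taken, here is a small simulation of the same countdown logic. The lookup function is a hypothetical stand-in for nslookup that always fails, and the max_iter cap is added only so the demo terminates; neither is part of the Operator's script.

```shell
# Hypothetical stand-in for nslookup: always fails, like the unresolved name.
lookup() { return 1; }

# wait_for_dns START MAX_ITER: same countdown as the init container,
# plus an iteration cap (demo-only) so the loop always ends.
wait_for_dns() {
  TIMEOUT_READY=$1
  max_iter=$2
  i=0
  while ! lookup; do
    TIMEOUT_READY=$(( TIMEOUT_READY - 1 ))
    if [ "$TIMEOUT_READY" -eq 0 ]; then
      echo "timed out"
      return 1
    fi
    i=$(( i + 1 ))
    if [ "$i" -ge "$max_iter" ]; then
      echo "still waiting"
      return 0
    fi
  done
}

wait_for_dns 0 5           # counter goes -1, -2, ...: prints "still waiting"
wait_for_dns 3 5 || true   # counter goes 2, 1, 0: prints "timed out"
```

With a start value of 0 the counter only moves further away from 0, which matches the comment in the original script that a value of 0 means never time out.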

$ oc logs example-bcqztbd6l6 -n my-etcd
Defaulted container "etcd" out of: etcd, check-dns (init)
Error from server (BadRequest): container "etcd" in pod "example-bcqztbd6l6" is waiting to start: PodInitializing

This shows that the etcd container in the pod is waiting for the pod to initialize.

View the log of the etcd container:

$ oc logs example-bcqztbd6l6 -c etcd -n my-etcd
Error from server (BadRequest): container "etcd" in pod "example-bcqztbd6l6" is waiting to start: PodInitializing

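It only says that it is waiting for the pod to initialize. These logs do not reveal where the pod is stuck; only describe pod shows the Init Containers section, which reveals the check-dns container. View its log: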

$ oc logs example-bcqztbd6l6 -c check-dns -n my-etcd
......
nslookup: can't resolve 'example-bcqztbd6l6.example.my-etcd.svc'
Server:    10.217.4.10
Address 1: 10.217.4.10 dns-default.openshift-dns.svc.cluster.local
......

This matches the log seen in the web console.
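
Since a pod stuck in PodInitializing hides its problem in an init container, it can help to dump the logs of all init containers in one go. This is only a sketch: the helper name init_container_logs is hypothetical, and it assumes a logged-in oc session.

```shell
# init_container_logs POD NAMESPACE: print the last log lines of every
# init container in the pod. (Hypothetical helper, not part of oc.)
init_container_logs() {
  pod=$1
  ns=$2
  # Init container names, space-separated.
  names=$(oc get pod "$pod" -n "$ns" -o jsonpath='{.spec.initContainers[*].name}')
  for c in $names; do
    echo "== $c =="
    oc logs "$pod" -c "$c" -n "$ns" --tail=20
  done
}

# Usage (requires a logged-in oc session):
#   init_container_logs example-bcqztbd6l6 my-etcd
```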

Analysis

Why does this error occur? I searched online, and it appears to be a bug in busybox.

See https://stackoverflow.com/questions/52109039/nslookup-cant-resolve-kubernetes-default , which says:

You have encountered a bug in the latest versions of the busybox docker image. Use the tag busybox:1.28 instead of latest.

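Original issue: https://github.com/docker-library/busybox/issues/48

That said, the check-dns container shown above already runs busybox:1.28.0-glibc, the tag that answer recommends, so this bug may not be the cause here. It could also be because I am using Red Hat OpenShift Local. I did not dig deeper.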

References

  • https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/operators/index#user-tasks
