Node is not properly drained #74

Halama · 2021-09-18T16:38:24Z

Image I'm using:
328549459982.dkr.ecr.eu-west-1.amazonaws.com/bottlerocket-update-operator:v0.1.4

Deployment manifest:

apiVersion: v1
kind: Namespace
metadata:
  name: bottlerocket
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bottlerocket-update-operator-controller
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
  # Allow the controller to remove Pods running on the Nodes that are updating.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bottlerocket-update-operator-controller
subjects:
  - kind: ServiceAccount
    name: update-operator-controller
    namespace: bottlerocket
roleRef:
  kind: ClusterRole
  name: bottlerocket-update-operator-controller
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bottlerocket-update-operator-agent
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bottlerocket-update-operator-agent
subjects:
  - kind: ServiceAccount
    name: update-operator-agent
    namespace: bottlerocket
roleRef:
  kind: ClusterRole
  name: bottlerocket-update-operator-agent
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-operator-controller
  namespace: bottlerocket
  annotations:
    kubernetes.io/service-account.name: update-operator-controller
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-operator-agent
  namespace: bottlerocket
  annotations:
    kubernetes.io/service-account.name: update-operator-agent
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: update-operator-controller
  namespace: bottlerocket
  labels:
    update-operator: controller
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxUnavailable: 100%
  selector:
    matchLabels:
      update-operator: controller
  template:
    metadata:
      namespace: bottlerocket
      labels:
        update-operator: controller
        app: bottlerocket-update-operator
      annotations:
        log: "true"
    spec:
      serviceAccountName: update-operator-controller
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: bottlerocket.aws/updater-interface-version
                    operator: Exists
                  - key: "kubernetes.io/os"
                    operator: In
                    values:
                      - linux
                  - key: "kubernetes.io/arch"
                    operator: In
                    values:
                      - amd64
                      - arm64
        # Avoid update-operator's Agent Pods if possible.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 10
              podAffinityTerm:
                topologyKey: bottlerocket.aws/updater-interface-version
                labelSelector:
                  matchExpressions:
                    - key: update-operator
                      operator: In
                      values: ["agent"]
      containers:
        - name: controller
          image: "328549459982.dkr.ecr.${AWS_REGION}.amazonaws.com/bottlerocket-update-operator:v0.1.4"
          imagePullPolicy: Always
          args:
            - -controller
            - -debug
            - -nodeName
            - $(NODE_NAME)
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
---
# This DaemonSet is for Bottlerocket hosts that support updates through the Bottlerocket API (Bottlerocket OS versions >= v0.4.1)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: update-operator-agent-update-api
  namespace: bottlerocket
  labels:
    update-operator: agent
spec:
  selector:
    matchLabels:
      update-operator: agent
  template:
    metadata:
      labels:
        update-operator: agent
        app: bottlerocket-update-operator
      annotations:
        log: "true"
    spec:
      serviceAccountName: update-operator-agent
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: "bottlerocket.aws/updater-interface-version"
                    operator: In
                    values:
                      - 2.0.0
                  - key: "kubernetes.io/os"
                    operator: In
                    values:
                      - linux
                  - key: "kubernetes.io/arch"
                    operator: In
                    values:
                      - amd64
                      - arm64
      hostPID: true
      containers:
        - name: agent
          image: "328549459982.dkr.ecr.${AWS_REGION}.amazonaws.com/bottlerocket-update-operator:v0.1.4"
          imagePullPolicy: Always
          args:
            - -agent
            - -debug
            - -nodeName
            - $(NODE_NAME)
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          resources:
            limits:
              memory: 50Mi
            requests:
              cpu: 10m
              memory: 50Mi
          volumeMounts:
            - name: bottlerocket-api-socket
              mountPath: /run/api.sock
          securityContext:
            seLinuxOptions:
              user: system_u
              role: system_r
              type: super_t
              level: s0
      volumes:
        - name: bottlerocket-api-socket
          hostPath:
            path: /run/api.sock
            type: Socket

Node info:

Issue or Feature Request:

Hello, we have noticed that a node is not properly drained during an update. The update operator doesn't wait until all pods on the node are evicted; it reboots the node immediately, which leads to service interruption. The eviction of pods is probably not started at all.

The operator logs "could not drain" with the error: User "system:serviceaccount:bottlerocket:update-operator-controller" cannot get resource "daemonsets" in API group "apps" in the namespace "kube-system". This is followed by "proceeding anyway". See the details below.

Update operator log during reboot:

2021-09-18T15:53:16.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T15:53:15Z" level=error msg="could not drain" component=controller error="[cannot delete daemonsets.apps \"kube-proxy\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"kube-system\": kube-system/kube-proxy-n2hzd, cannot delete daemonsets.apps \"update-operator-agent-update-api\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"bottlerocket\": bottlerocket/update-operator-agent-update-api-pmv6k, cannot delete daemonsets.apps \"datadog-agent\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"datadog\": datadog/datadog-agent-cqx69, cannot delete daemonsets.apps \"calico-node\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"kube-system\": kube-system/calico-node-s8vrr, cannot delete daemonsets.apps \"fluentd-papertrail-containerd\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"kube-system\": kube-system/fluentd-papertrail-containerd-g29tm]" intent="reboot-update,perform-update,ready update:true" node=ip-10-233-157-101.eu-west-1.compute.internal worker=manager
2021-09-18T15:53:16.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T15:53:15Z" level=warning msg="proceeding anyway" component=controller intent="reboot-update,perform-update,ready update:true" node=ip-10-233-157-101.eu-west-1.compute.internal worker=manager

If I add the following permissions to the update controller:

- verbs:
    - get
    - list
  apiGroups:
    - apps
  resources:
    - daemonsets
    - replicasets

The following error is then logged: "cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore)":

2021-09-18T16:17:37.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T16:17:36Z" level=error msg="could not drain" component=controller error="cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): bottlerocket/update-operator-agent-update-api-l5jvm, datadog/datadog-agent-tch22, kube-system/calico-node-dhn88, kube-system/fluentd-papertrail-containerd-p65zh, kube-system/kube-proxy-mvxcb" intent="reboot-update,perform-update,ready update:true" node=ip-10-233-156-93.eu-west-1.compute.internal worker=manager
2021-09-18T16:17:37.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T16:17:36Z" level=warning msg="proceeding anyway" component=controller intent="reboot-update,perform-update,ready update:true" node=ip-10-233-156-93.eu-west-1.compute.internal worker=manager
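
For reference, this is a sketch of how the extra rule could sit in the controller's ClusterRole from the deployment manifest above. It assumes the rule is simply appended to the existing `bottlerocket-update-operator-controller` role; nothing else is changed. The first two rules are copied from the manifest and the third is the rule quoted above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bottlerocket-update-operator-controller
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
  # Allow the controller to remove Pods running on the Nodes that are updating.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "delete"]
  # Added rule (assumed placement): read access to the workloads that own the Pods.
  # The first log shows the drain failing on "get" for daemonsets in the "apps" group.
  - apiGroups: ["apps"]
    resources: ["daemonsets", "replicasets"]
    verbs: ["get", "list"]
```

This only removes the "forbidden" error. As the second log shows, the drain still fails on DaemonSet-managed Pods and the operator still proceeds with the reboot.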

Can I somehow configure the update operator to ignore DaemonSets when draining?
Thanks

cbgbt · 2021-09-21T21:07:33Z

Thanks for opening this report. It seems like the operator doesn't handle error responses from the Drain API as one might expect. I'll look into better handling this case.

To clarify, in this case are you anticipating that the operator should drain the DaemonSet pod, or would you rather it ignore DaemonSet pods and wait for the rest to be drained?

Halama · 2021-09-23T10:34:34Z

Hi Sean,
thanks for the reply.

I would like the operator to ignore DaemonSet pods, the same as running kubectl drain ip-10-233-156-21.eu-west-1.compute.internal --delete-local-data --ignore-daemonsets --force

cbgbt · 2022-02-15T18:39:52Z

Hi Martin. This should be fixed in the latest Update Operator release, 0.2.0.

Please let us know if you continue to have issues using the new release. Thanks!

cbgbt added priority/p0 type/bug labels Sep 27, 2021

cbgbt mentioned this issue Jan 19, 2022 in "Add drain support to the apiserver" #137 (merged)

cbgbt closed this as completed Feb 15, 2022
