The controller manager restarts frequently #1310

@sunjq1

I started the controller manager by applying dlrover/go/operator/config/manifests/bases/deployment.yaml, but found that it restarts frequently (30 restarts in 27 hours):

[root@vadmin14 ~]# kubectl -n dlrover get po 
NAME                                          READY   STATUS    RESTARTS         AGE
dlrover-brain-5b866c8c44-n9cjp                1/1     Running   0                27h
dlrover-controller-manager-5884d84c4d-lz8th   2/2     Running   30 (2m50s ago)   27h
dlrover-kube-monitor-67c4ccf78d-lwmfv         1/1     Running   0                27h
mysql-6877845b96-j8sbg                        1/1     Running   0                27h

The logs show that the manager exits after failing to renew its leader-election lease:

[root@vadmin14 ~]# kubectl -n dlrover logs dlrover-controller-manager-5884d84c4d-lz8th -f
... ...
E1025 06:38:52.463386       1 leaderelection.go:330] error retrieving resource lock dlrover/9b6611a4.iml.github.io: Get "https://10.66.0.1:443/apis/coordination.k8s.io/v1/namespaces/dlrover/leases/9b6611a4.iml.github.io": context deadline exceeded
I1025 06:38:52.463492       1 leaderelection.go:283] failed to renew lease dlrover/9b6611a4.iml.github.io: timed out waiting for the condition
1.729838332463565e+09   ERROR   setup   problem running manager {"error": "leader election lost"}
main.main
        /workspace/main.go:119
runtime.main
        /usr/local/go/src/runtime/proc.go:250
1.7298383324636865e+09  INFO    Stopping and waiting for non leader election runnables
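
To make the failure mode in these logs concrete, here is a rough sketch of the renew-deadline logic as I understand it. It is a simplified model, not the client-go implementation: the leader keeps retrying the lease update, and if no attempt succeeds before the renew deadline, it gives up leadership and the manager process exits. The function name `renews_in_time` is hypothetical, and the 10 s deadline and 2 s retry period are what I believe the controller-runtime defaults to be; I have not checked whether the DLRover operator overrides them.

```python
def renews_in_time(attempts, renew_deadline=10.0, retry_period=2.0):
    """Simplified model of one lease-renewal cycle.

    attempts: list of (duration_seconds, succeeded) pairs, one per call
              to the API server, in the order they are made.
    renew_deadline / retry_period: assumed defaults (10 s / 2 s), not
              values taken from the DLRover manifests.
    Returns True if some attempt succeeds before the deadline elapses.
    """
    elapsed = 0.0
    for duration, succeeded in attempts:
        elapsed += duration          # time spent waiting on this API call
        if elapsed > renew_deadline:
            return False             # deadline passed: leadership is lost
        if succeeded:
            return True              # lease renewed in time
        elapsed += retry_period      # wait before the next attempt
    return False                     # ran out of attempts without success


# A fast API server renews on the first try.
healthy = renews_in_time([(0.1, True)])

# Two slow, failing calls (e.g. "context deadline exceeded") use up the
# whole renew window, so the later successful call comes too late.
slow_api = renews_in_time([(5.0, False), (5.0, False), (0.1, True)])
```

In this model `healthy` is true and `slow_api` is false, which matches the pattern above: the Get on the lease times out, the renewal fails, and the manager stops with "leader election lost".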

When I set leader-elect to false, the controller manager stopped restarting.

[root@vadmin14 ~]# kubectl -n dlrover edit deployments.apps dlrover-controller-manager 
... ...
spec:
  replicas: 1
... ...
      - args:
        - --health-probe-bind-address=:8081
        - --metrics-bind-address=127.0.0.1:8080
        - --leader-elect=false
... ...

So why is leader-elect set to true by default when the controller manager runs with only one replica?
