Description
NGINX Ingress controller version: 0.45.0
Kubernetes version (use kubectl version): 1.18.9
Environment:
- Cloud provider or hardware configuration: AWS EKS
- OS (e.g. from /etc/os-release): Bottlerocket OS 1.0.5
- Kernel (e.g. uname -a): Linux 5.4.80
- Install tools: Ingress Nginx Helm Chart deployed with ArgoCD
- Others: the ingress controller runs behind an NLB that terminates TLS
What happened:
When we restart the ingress controller (with kubectl rollout restart deployment), it incorrectly removes the address value (.status.loadBalancer.ingress[].hostname) from our ingresses. The address is added back to the manifest after ~30s.
This appeared after we upgraded from 0.41.2 -> 0.45.0.
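The transition is easy to capture with a small poll loop (assuming an ingress named foo-bar, as created in the reproduction below; on kind the address is an IP rather than a hostname):

while true; do
  echo "$(date +%T) $(kubectl get ingress foo-bar -o jsonpath='{.status.loadBalancer.ingress}')"
  sleep 1
done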
What you expected to happen:
That the .status.loadBalancer.ingress[].hostname field would not be removed by the ingress controller on a restart.
How to reproduce it:
Install kind
Install the ingress controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.45.0/deploy/static/provider/baremetal/deploy.yaml
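It may help to wait until the controller pod is ready before continuing (the selector matches the labels set by the manifest above):

kubectl -n ingress-nginx wait --for=condition=ready pod --selector=app.kubernetes.io/component=controller --timeout=120s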
Install an application that will act as default backend (is just an echo app)
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/docs/examples/http-svc.yaml
Create an ingress (please add any additional annotation required)
echo "
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
name: foo-bar
spec:
rules:
- host: foo.bar
http:
paths:
- backend:
serviceName: http-svc
servicePort: 80
path: /
" | kubectl apply -f -
Restart the ingress controller and watch the address disappear and reappear. In one terminal:
watch -n 1 kubectl get ingress -A
In another terminal, restart the controller:
kubectl -n ingress-nginx rollout restart deployment ingress-nginx-controller
The following message also appears in the log of the current leader when it shuts down:
ingress-nginx/ingress-nginx-controller-8544f6fcc9-9jjgp[controller]: I0413 13:13:35.582145 7 status.go:132] "removing value from ingress status" address=[172.40.1.2]
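The same message can be pulled from all controller pods with the label selector used by the deploy manifest:

kubectl -n ingress-nginx logs -l app.kubernetes.io/component=controller --tail=-1 | grep "removing value from ingress status"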
Anything else we need to know:
We looked through the code and found a significant refactoring of the logic that determines whether any other controller pod instances exist.
Before the refactor, the controller listed pods using the hard-coded labels app.kubernetes.io/component, app.kubernetes.io/instance, and app.kubernetes.io/name to find the other controller pods. After the refactor, it lists pods with a selector built from all labels assigned to the current pod. That selector therefore also includes the pod-template-hash label, so the controller does not see the newly created pods (which carry a different hash) and incorrectly assumes that there are no other replicas; the selector comparison below illustrates the difference.
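A rough shell approximation of the two selectors (label values taken from the deploy manifest above; <old-hash> stands for the pod-template-hash of the pre-restart ReplicaSet):

Before the refactor, the equivalent of:
kubectl -n ingress-nginx get pods -l app.kubernetes.io/name=ingress-nginx,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/component=controller

After the refactor, the equivalent of:
kubectl -n ingress-nginx get pods -l app.kubernetes.io/name=ingress-nginx,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/component=controller,pod-template-hash=<old-hash>

The second selector matches only pods from the old ReplicaSet, so during a rollout the replacement pods are invisible to the shutting-down leader.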
We guess it then takes ~30s for the statuses to be fixed because a new leader has to be elected.
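The leader handover can be observed while this happens. Assuming the default --election-id and the nginx ingress class, the controller records the current holder in an election ConfigMap (the exact name may differ per setup):

kubectl -n ingress-nginx get configmap ingress-controller-leader-nginx -o yaml

The control-plane.alpha.kubernetes.io/leader annotation shows which pod currently holds the lock.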
/kind bug