If you have a very large EKS cluster with tons of pods running on it, you might have encountered DNS resolver errors while querying the AWS VPC DNS resolver. Today we will talk about this issue and how to set up your cluster to mitigate it.
Before we dive into the topic, let us look at the AWS documentation on VPC DNS resolver limits (https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html#vpc-dns-limits).
Each EC2 instance can send 1024 packets per second per network interface to Route 53 Resolver (specifically the .2 address, such as 10.0.0.2 and 169.254.169.253). This quota cannot be increased. The number of DNS queries per second supported by Route 53 Resolver varies by the type of query, the size of the response, and the protocol in use.
If you reach the quota, the Route 53 Resolver rejects traffic.
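If you suspect you are already hitting this quota, one way to confirm it on a node is to check the linklocal_allowance_exceeded counter exposed by the ENA driver. This is a sketch: the interface name ens5 is an assumption and varies by instance type (it may be eth0 on older instances).

```shell
# Count of packets dropped because traffic to link-local services
# (which includes the VPC DNS resolver) exceeded the per-ENI allowance.
# A steadily increasing value here indicates DNS throttling.
ethtool -S ens5 | grep linklocal_allowance_exceeded
```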
CoreDNS is installed by default when you spin up your EKS cluster. If you run the following command to inspect the CoreDNS deployment:
~# kubectl edit deploy coredns -n kube-system
You will see the following affinity specification for this deployment:
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: k8s-app
              operator: In
              values:
              - kube-dns
          topologyKey: kubernetes.io/hostname
        weight: 100
With podAntiAffinity (https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity) configured, CoreDNS pods should not be scheduled on nodes that already run a pod labeled k8s-app: kube-dns. Note that preferredDuringSchedulingIgnoredDuringExecution is a soft preference: the scheduler can still place two CoreDNS pods on the same node if no other node is available.
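If you want to guarantee, rather than merely prefer, that two CoreDNS pods never share a node, one option is to change the rule to requiredDuringSchedulingIgnoredDuringExecution. This is a sketch only: a hard rule leaves replicas stuck in Pending if there are fewer eligible nodes than replicas, and on EKS the managed CoreDNS add-on may overwrite manual edits to this deployment.

```yaml
spec:
  affinity:
    podAntiAffinity:
      # Hard requirement: no two pods matching this selector may share
      # a node (topologyKey: kubernetes.io/hostname).
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: k8s-app
            operator: In
            values:
            - kube-dns
        topologyKey: kubernetes.io/hostname
```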
This is important because the AWS VPC DNS resolver limit applies per network interface. If your node is using only one network interface, scheduling multiple CoreDNS pods on that node will not scale DNS at all, as all of them are still restricted by the same 1024 packets per second limit on that interface.
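A quick back-of-envelope calculation shows how tight this budget is. The two-queries-per-lookup figure is an assumption: resolvers commonly send both an A and an AAAA query for each name looked up.

```shell
# Hard quota: packets per second each ENI may send to the VPC resolver.
ENI_PPS_LIMIT=1024
# Assumed queries per application lookup (A + AAAA).
QUERIES_PER_LOOKUP=2
# Rough upper bound on uncached lookups per second for the whole node,
# no matter how many CoreDNS pods share that interface.
echo $((ENI_PPS_LIMIT / QUERIES_PER_LOOKUP))
```

Under these assumptions, a single node tops out at roughly 512 uncached lookups per second, which a busy cluster can exhaust easily.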
To scale your CoreDNS properly, you should run the same number of CoreDNS pods as there are nodes that can schedule them, so that each pod ends up on an individual node.
You can verify this by executing the following command and confirming that each CoreDNS pod is on a different node in the cluster:
~# kubectl get pods -o wide -n kube-system | grep coredns
coredns-6dc57bf577-1tn45   1/1   Running   0   20h   10.1.2.15   ip-10-1-2-15.ap-east-1.compute.internal   <none>   <none>
coredns-6dc57bf577-4wpds   1/1   Running   0   2d    10.1.9.8    ip-10-1-9-8.ap-east-1.compute.internal    <none>   <none>
coredns-6dc57bf577-38hvc   1/1   Running   0   8h    10.1.2.28   ip-10-1-2-28.ap-east-1.compute.internal   <none>   <none>
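In a large cluster, eyeballing the node column gets tedious. A shorter check (a sketch, assuming your CoreDNS pods carry the default k8s-app=kube-dns label) prints only the nodes hosting more than one CoreDNS pod, so empty output means the pods are spread correctly:

```shell
# Print the node name of each CoreDNS pod; `uniq -d` keeps only
# duplicated node names, i.e. nodes whose CoreDNS pods are sharing
# one instance's per-ENI packet limit.
kubectl get pods -n kube-system -l k8s-app=kube-dns \
  -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' \
  | sort | uniq -d
```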
Alternatively, you can mitigate this issue by increasing the DNS cache duration in your application. For example, in your NGINX configuration, you can set the resolver to:
resolver 172.20.0.10 valid=3600s;
However, please take note that if your NGINX is configured as a reverse proxy to AWS Application Load Balancers (ALBs), internet-facing ALBs have dynamic IP addresses which may change during the life of the ALB. If your cache duration is set too long, NGINX will keep using stale IP addresses until your DNS cache expires, which will cause issues in your environment.
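One common way to balance these concerns (a sketch; the hostname below is a hypothetical placeholder for your ALB's DNS name, and the 10-second cache is illustrative) is to keep the cache short and reference the upstream through a variable. When proxy_pass uses a variable, NGINX re-resolves the name whenever the cached entry expires, instead of resolving it only once at startup:

```nginx
resolver 172.20.0.10 valid=10s;

server {
    listen 80;
    location / {
        # Using a variable in proxy_pass forces runtime DNS resolution
        # once the cached answer expires, so new ALB IPs are picked up.
        set $alb_upstream "my-alb.example.com";
        proxy_pass http://$alb_upstream;
    }
}
```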