
Your newly spun-up frontend pod wakes up in the cluster with total amnesia. It has a job to do, but it has no idea where it is, who its neighbors are, or how to find the database it desperately needs to query. IP addresses in a Kubernetes cluster change as casually as socks in a gym locker room. Keeping track of them requires a level of bureaucratic masochism that no sane developer wants to deal with.
Someone has to manage this phonebook. That someone lives in a windowless sub-basement namespace called kube-system. It sits there, answering the same questions thousands of times a second, routing traffic, and receiving absolutely zero credit for any of it until something breaks.
That entity is CoreDNS. And this is an exploration of the thankless, absurd, and occasionally heroic life it leads inside your infrastructure.
The temperamental filing cabinet that came before
Before CoreDNS was given the keys to the kingdom in Kubernetes 1.13, the job belonged to kube-dns. It technically worked, in the way a rusty fax machine technically transmits documents, but nobody was happy to see it. kube-dns was not a single, elegant program. It was a chaotic trench coat containing three different containers stacked on top of each other — the kubedns resolver, a dnsmasq caching layer, and a health-checking sidecar — all whispering to each other to resolve a single address.
CoreDNS replaced it because it is written in Go as a single, compiled binary. It is lighter, faster, and built around a modular plugin architecture. You can bolt on new behaviors to it, much like adding attachments to a vacuum cleaner. It is efficient, utterly devoid of joy, and built to survive the hostile environment of a modern microservices architecture.
Inside the passport control of resolv.conf
When a pod is born, the container runtime shoves a folded piece of paper into its pocket. This piece of paper is the /etc/resolv.conf file. It is the internal passport and instruction manual for how the pod should talk to the outside world.
If you were to exec into a standard web application pod and look at that slip of paper, you would see something resembling this:
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
At first glance, it looks harmless. The nameserver line simply tells the pod exactly where CoreDNS is sitting; that 10.96.0.10 is the ClusterIP of the DNS service in kube-system. But the rest of this file is a recipe for a spectacular amount of wasted effort.
Look at the search directive. This is a list of default neighborhoods. In human terms, if you tell a courier to deliver a package to “Smith”, the courier does not just give up. The courier checks “Smith in the default namespace”, then “Smith in the general services area”, then “Smith in the local cluster”. It is a highly structured, incredibly repetitive guessing game. Every time your application tries to look up a short name like “inventory-api”, CoreDNS has to physically flip through these directories until it finds a match.
But the true villain of this document, the source of immense invisible suffering, is located at the very bottom.
The loud waiter and the tragedy of ndots
Let us talk about options ndots:5. This single line of configuration is responsible for more wasted network traffic than a teenager downloading 4K video over cellular data.
The ndots value tells your pod how many dots a domain name must have before it is considered an absolute, fully qualified domain name. If a domain has fewer than five dots, the pod assumes it is a local nickname and starts appending the search domains to it.
Imagine a waiter in a crowded, high-end restaurant. You ask this waiter to bring a message to a guest named “google.com”.
Because “google.com” only has one dot, the waiter refuses to believe this is the person’s full legal name. Instead of looking at the master reservation book, the waiter walks into the center of the dining room and screams at the top of his lungs, “IS THERE A GOOGLE.COM IN THE DEFAULT.SVC.CLUSTER.LOCAL NAMESPACE?”
The room goes dead silent. Nobody answers.
Undeterred, the waiter moves to the next search domain and screams, “IS THERE A GOOGLE.COM IN THE SVC.CLUSTER.LOCAL NAMESPACE?”
Again, nothing. The waiter does this a third time for cluster.local. Finally, sweating and out of breath, having annoyed every single patron in the establishment, the waiter says, “Fine. Let me check the outside world for just plain google.com.”
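The waiter's itinerary can be sketched in a few lines of Python. This is an illustration of how a stub resolver expands names against the search list, using the resolv.conf values shown above; it is not CoreDNS code, and the function name is invented for the example:

```python
# Search-list expansion as performed by a typical stub resolver
# (glibc/musl behavior, simplified). Values mirror the resolv.conf above.
SEARCH_DOMAINS = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
]
NDOTS = 5

def queries_attempted(name: str) -> list[str]:
    """Return the candidate names the resolver tries, in order."""
    if name.endswith("."):        # trailing dot: already absolute, no games
        return [name]
    if name.count(".") >= NDOTS:  # "enough" dots: try the bare name first
        return [name] + [f"{name}.{d}" for d in SEARCH_DOMAINS]
    # Too few dots: scream through every search domain before giving up
    return [f"{name}.{d}" for d in SEARCH_DOMAINS] + [name]

for q in queries_attempted("google.com"):
    print(q)
```

Four lookups for one external name, three of them guaranteed to fail. A trailing dot ("google.com.") skips the whole performance.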
This happens for every single external API call your application makes. Three useless, doomed-to-fail lookups hit CoreDNS before your pod even attempts the correct external address — and the count doubles if the pod asks for both IPv4 A and IPv6 AAAA records. CoreDNS processes these garbage queries with the dead-eyed stare of a DMV clerk stamping “DENIED” on improperly filled forms. If you ever wonder why your cluster DNS latency is slightly terrible, the screaming waiter is usually to blame.
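If the screaming bothers you, the standard earplug is to lower the threshold per pod via the dnsConfig field, so that one dot is enough to be treated as a real name. A sketch — the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend                       # placeholder name
spec:
  containers:
    - name: app
      image: example/frontend:latest   # placeholder image
  dnsConfig:
    options:
      - name: ndots
        value: "1"   # one dot is enough to skip the search-domain tour
```

The trade-off: short in-cluster names like "inventory-api" now need to be written out more fully, or the waiter will walk straight past your own neighborhood.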
The Corefile and other documents of self-inflicted pain
The rules governing CoreDNS are dictated by a configuration file known as the Corefile. It is essentially the Human Resources policy manual for the DNS server. It defines which plugins are active, who is allowed to ask questions, and where to forward queries it does not understand.
A simplified corporate policy might look like this:
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
}
Most of this is standard bureaucratic routing. The kubernetes plugin tells CoreDNS how to watch the Kubernetes API to find out where services and pods actually live. The cache plugin lets CoreDNS memorize answers for up to 30 seconds, so repeat questions — especially ones it had to forward upstream — can be answered from memory instead of being asked all over again.
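Because the Corefile is plugin-driven, you can also add a second server block that handles one zone by its own rules. For instance, queries for a private corporate zone could be forwarded to a dedicated internal resolver; the zone name and upstream address below are invented for illustration:

```
corp.example.internal:53 {
    forward . 10.0.0.53   # hypothetical internal resolver
    cache 30
}
```

CoreDNS matches the most specific server block first, so everything else still falls through to the default .:53 policy.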
But the most fascinating part of this document is the loop plugin.
In complex networks, it is very easy to accidentally configure DNS servers to point at each other. Server A asks Server B, and Server B asks Server A. In a normal corporate environment, two middle managers delegating the same task back and forth will do so indefinitely, drawing salaries for years until retirement.
Software does not have a retirement plan. Left unchecked, a DNS forwarding loop will exponentially consume memory and CPU until the entire node catches fire and dies.
The loop plugin exists solely to detect this exact scenario. It sends a uniquely tagged query out into the world. If that same query comes back to it, CoreDNS realizes it is trapped in a futile, infinite cycle of middle-management delegation.
And what does it do? It refuses to participate. It halts. CoreDNS will intentionally shut itself down rather than perpetuate a stupid system. There is a profound life lesson hiding in that logic. It shows a level of self-awareness and boundary-setting that most human workers never achieve.
Headless services or giving out direct phone numbers
Most of the time, when you ask CoreDNS for a service, it gives you the IP address of a load balancer. You call the front desk, and the receptionist routes your call to an available agent. You do not know who you are talking to, and you do not care.
But some applications are needy. Databases in a cluster, like a Cassandra ring or a MongoDB replica set, cannot just talk to a random receptionist. They need to replicate data. They need to know exactly who they are talking to. They need direct home phone numbers.
This is where a Kubernetes feature known as a “headless service” comes in, and where CoreDNS changes what it is willing to hand out.
When you create a service in Kubernetes, it usually looks like a standard networking request. But when you explicitly add one specific line to the YAML definition, you are effectively firing the receptionist:
apiVersion: v1
kind: Service
metadata:
  name: inventory-db
spec:
  clusterIP: None # <-- The receptionist's termination letter
  selector:
    app: database
By setting “clusterIP: None”, you are telling CoreDNS that this department has no front desk.
Now, when a pod asks for “inventory-db”, CoreDNS does not hand out a single routing IP. Instead, it dumps a raw list of the individual IP addresses of every single pod backing that service. Furthermore, it creates a custom, highly specific DNS entry for every individual pod in the StatefulSet.
It assigns them names like “inventory-db-0.inventory-db.production.svc.cluster.local” — pod name, then service name, then namespace.
Suddenly, your database node has a personal identity. It can be addressed directly. It is a minor administrative miracle, allowing complex, stateful applications to map out their own internal corporate structure without relying on the cluster’s default routing mechanics.
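Those per-pod entries only materialize when a StatefulSet claims the headless service through its serviceName field. A sketch that pairs with the inventory-db service above — the replica count and image are placeholders:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: inventory-db
spec:
  serviceName: inventory-db   # must match the headless service's name
  replicas: 3
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
        - name: db
          image: example/database:latest   # placeholder image
```

Each replica then answers at a stable name like inventory-db-0.inventory-db.&lt;namespace&gt;.svc.cluster.local, which survives pod restarts even as the underlying IP changes.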
The unsung hero of the sub-basement
CoreDNS is rarely the subject of glowing keynotes at tech conferences. It does not have the flashy appeal of a service mesh, nor does it generate the architectural excitement of an advanced deployment strategy. It is plumbing. It is bureaucracy. It is the guy in the basement checking names off a clipboard.
But the next time you type a simple, human-readable name into an application configuration file and it flawlessly connects to a database across the cluster, think of CoreDNS. Think of the thousands of fake ndots queries it cheerfully absorbed. Think of its rigid adherence to the Corefile.
And most importantly, respect a piece of software that is smart enough to know when it is stuck in a loop, and brave enough to quit on the spot.