May 2026

The social awakening of the Kubernetes scheduler

Human beings are notoriously bad at coordination, but we like to think our machines are better. They are not. For over a decade, Kubernetes, the undisputed king of cloud orchestration, has behaved like a blind restaurant host with a severe case of short-term memory loss.

If you arrived at this restaurant with a party of eight, the host would not look for a table of eight. Instead, they would grab the first person in your group, lead them to a random single stool in the corner, and tell them to wait. Then they would grab the second person and squeeze them between two strangers at the bar. If the remaining guests could not find seats, the host would simply shrug. The first seven would sit there forever, nursing their half-empty glasses of water, while the last person stood shivering in the rain outside.

In computer science, we call this tragedy a scheduling deadlock. In Kubernetes, it is just another Tuesday. But with the release of version 1.36, the system is finally learning some manners through a set of features known as workload-aware scheduling.

The tragedy of scheduling one shoe at a time

Historically, Kubernetes was designed to think in terms of individual pods. To the scheduler, a pod is a single, solitary unit of life, like a lonely left shoe. It does not know or care if there is a right shoe waiting in the queue. It just wants to put the left shoe on a foot, even if the owner of that foot has no legs.

This single-minded approach works beautifully for simple web servers. If you need ten copies of an application, they do not need to know each other. They do not talk, they do not share secrets, and they certainly do not need to hold hands.

But modern workloads, particularly those driving artificial intelligence, machine learning, and massive mathematical calculations, are different. They do not run on lonely, independent pods. They run on highly codependent troupes of containers that must work together or not at all. If you are running an eight-GPU training job, you might need all eight pods to start at the same moment. If seven show up and the eighth is stuck in the hallway because a node ran out of memory, the entire operation grinds to a halt. The active pods just sit there, holding on to expensive GPUs and doing absolutely nothing useful.

To fix this, the open-source community decided to give Kubernetes some social intelligence. They wanted to teach the system how to recognize a group of friends and seat them all together.

Enter the PodGroup, a unit of social cohesion

To bring order to this chaos, Kubernetes v1.36 introduces a clever piece of psychological separation. It splits the concept of a multi-pod job into two distinct entities, namely a static blueprint called the Workload API and an active, fast-moving runtime object called the PodGroup API.

The separation is brilliant in its boringness. Imagine trying to coordinate a huge family reunion. The Workload is the official invitation list, a static piece of paper detailing who should theoretically show up. The PodGroup is the group text message where everyone argues in real-time about who is actually arriving, who is running late, and who went to the wrong address.

If the scheduler had to update the master blueprint every single time a single pod changed its status, the central API server would suffer the digital equivalent of a massive nervous breakdown. By keeping the blueprint quiet and letting the temporary PodGroup handle the frantic, fast-moving status updates, the system avoids data congestion. It is the architectural equivalent of having a calm office manager who handles the contracts while an assistant runs around screaming with a clipboard.

A basic PodGroup declaration is surprisingly simple, containing just enough information to tell the scheduler how many members actually make a quorum.

apiVersion: scheduling.k8s.io/v1alpha2
kind: PodGroup
metadata:
  name: neural-training-crew
spec:
  minMember: 8

In this little snippet, we are telling Kubernetes that unless all eight of our digital family members can be seated at the table at the same time, nobody gets seated at all. The scheduler takes one clean snapshot of the system and commits the whole gang, or nothing. It is, in effect, collective bargaining for containers.
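The all-or-nothing rule is easy to sketch in Python. This is a toy model, not the real scheduler: the function name schedule_gang, the node names, and the idea of counting free GPUs per node are all assumptions made for illustration, and the first-fit placement is just the simplest strategy that shows the point.

```python
from typing import Dict, List, Optional


def schedule_gang(free_gpus: Dict[str, int], pod_requests: List[int],
                  min_member: int) -> Optional[Dict[int, str]]:
    """Place at least min_member pods or none (toy model, hypothetical names).

    free_gpus maps node name -> free GPUs; pod_requests lists GPUs per pod.
    Returns {pod index: node name}, or None if the quorum cannot be met.
    """
    snapshot = dict(free_gpus)          # work on a copy, never the live state
    placement: Dict[int, str] = {}
    for pod, need in enumerate(pod_requests):
        # First-fit: take the first node in the snapshot with enough room.
        node = next((n for n, free in snapshot.items() if free >= need), None)
        if node is None:
            continue                    # this pod does not fit anywhere
        snapshot[node] -= need
        placement[pod] = node
    if len(placement) < min_member:
        return None                     # quorum missed: seat nobody
    free_gpus.update(snapshot)          # quorum met: commit in one step
    return placement
```

With seven free GPUs and a party of eight, the function returns None and leaves the cluster state untouched, which is exactly the behavior the old one-pod-at-a-time host could not manage.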

The art of polite eviction

Of course, life in the cloud is rarely empty. Most of the time, your cluster is already full of small, low-priority pods doing things like sending promotional emails or logging the temperature of the server room.

When your giant, expensive AI training workload arrives at the door, it needs space immediately. In the old days, the scheduler would look at the crowded room, see that there was no space for a group of eight pods, and simply give up.

With workload-aware preemption, the scheduler gains a more assertive personality. Instead of looking at individual pods, it evaluates the entire PodGroup as a single, powerful entity. If the group cannot fit, the scheduler can look at the low-priority pods currently occupying the nodes and decide to evict them.

Crucially, this is controlled by a setting called disruptionMode, which decides what happens when the PodGroup itself is on the receiving end of an eviction. Its pods can either be evicted one by one, or they can refuse to leave unless the entire group is taken down together, holding hands in a dramatic show of solidarity. The all-or-nothing option prevents a situation where half of your training job is evicted, leaving the remaining half running uselessly and burning through your cloud budget.
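Here is a rough Python sketch of how victim selection might respect that setting. Everything in it is an assumption for illustration: the RunningPod type, the mode values "single" and "group" (the real disruptionMode values are not shown in this post), and the simplification that every pod occupies exactly one slot and that members of a group share one priority.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class RunningPod:
    name: str
    priority: int
    group: Optional[str] = None       # PodGroup name, if any
    disruption_mode: str = "single"   # hypothetical values: "single" or "group"


def pick_victims(running: List[RunningPod], slots_needed: int,
                 incoming_priority: int) -> Optional[Set[str]]:
    """Return names of pods to evict, or None if eviction cannot free enough."""
    victims: Set[str] = set()
    # Walk candidates from lowest priority upwards.
    for pod in sorted(running, key=lambda p: p.priority):
        if len(victims) >= slots_needed:
            break
        if pod.priority >= incoming_priority or pod.name in victims:
            continue                  # never evict equal or higher priority
        if pod.group and pod.disruption_mode == "group":
            # All-or-nothing: taking one member means taking the whole group.
            victims |= {p.name for p in running if p.group == pod.group}
        else:
            victims.add(pod.name)
    return victims if len(victims) >= slots_needed else None
```

Note the side effect of solidarity: asking for three slots can free four, because the sketch never leaves half of a "group" mode job running.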

Putting the family in the same neighborhood

There is one final piece to this scheduling puzzle. In the world of high-performance computing, physical distance matters. If your pods are communicating constantly, placing half of them in an Oregon data center and the other half in Virginia is a recipe for terrible latency. It is like trying to have a conversation where every sentence takes three seconds to travel across the room.

To solve this, Kubernetes v1.36 introduces topology-aware workload scheduling. This ensures that the scheduler does not just find enough seats for your pods, but actually finds them close to one another, preferably on the same network switch, the same rack, or even the same physical machine.

It is the equivalent of booking hotel rooms for your family reunion and ensuring that everyone is on the third floor, rather than scattered across five different buildings in different zip codes.
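The hotel-floor idea can be sketched as a small Python function. This is an illustration under stated assumptions, not the scheduler's actual algorithm: the function name pick_domain, the tuple layout, and the "tightest fit wins" tie-break are all invented for this example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def pick_domain(nodes: List[Tuple[str, str, int]],
                pods_needed: int) -> Optional[str]:
    """Pick one topology domain (e.g. a rack) that can host the whole group.

    nodes holds (node name, domain label, free pod slots) tuples.
    Returns the tightest-fitting domain, or None if no single domain fits.
    """
    free_by_domain: Dict[str, int] = defaultdict(int)
    for _name, domain, free_slots in nodes:
        free_by_domain[domain] += free_slots
    candidates = [(free, domain) for domain, free in free_by_domain.items()
                  if free >= pods_needed]
    if not candidates:
        return None
    # Prefer the smallest sufficient domain so larger ones stay available.
    return min(candidates)[1]
```

Choosing the tightest fit is only one possible policy; a real scheduler could just as reasonably prefer the emptiest rack or weigh other constraints.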

A short conclusion for the caffeinated reader

We have spent years treating containers like isolated, disposable little boxes. We launched them, forgot about them, and let them fend for themselves. But as our software grows more complex and our artificial intelligence models require more computational power, we are discovering that our containers need to cooperate.

The changes in Kubernetes v1.36 are not just minor performance tweaks. They represent a fundamental shift in how the system understands work. By teaching the scheduler how to recognize groups, respect their physical proximity, and evict them gracefully, Kubernetes is growing up. It is no longer just a system for running individual applications. It is becoming a highly sophisticated, socially aware coordinator for the most complex computational tasks on the planet. And that is definitely worth raising a coffee mug to.

Cheating the continuous learning meat grinder with AI

Let us consider the cranium of the modern Cloud Architect. It is a finite biological container, roughly the size of a cantaloupe, filled with a squishy mass of fat and water. Yet the tech industry operates under the hallucination that this cantaloupe can effortlessly absorb every update to the AWS service catalog before the morning coffee. Trying to ingest the sheer volume of new DevOps tooling is a lot like watching a python try to swallow a double-door refrigerator. It is structurally impossible, deeply uncomfortable to witness, and usually ends with someone needing medical attention. We are practically obligated to evolve constantly, but our neurological hard drives have strict, unyielding limits.

The biological absurdity of keeping up with the CNCF landscape

The concept of “continuous improvement” in IT often feels less like an inspirational corporate poster and more like a slightly sadistic evolutionary mandate. You finally understand the esoteric routing logic of your Kubernetes networking setup. Your heart rate settles. You feel peace. Then, a cheerful newsletter arrives to inform you that your setup is obsolete and someone has thrown a brand new service mesh at your head.

The exhaustion you feel is not a character flaw. It is a standard biological response to an ecosystem that mutates faster than a flu virus in a crowded airport. Our brains were optimized for remembering which berries are poisonous, not for tracking the deprecation schedule of Helm charts.

Stop eating the trendy vegetables you hate

Then there is the fear of missing out, or FOMO, which drives otherwise rational engineers to do deeply irrational things. Let us be brutally honest here. If you absolutely despise JavaScript or feel a physical wave of nausea when looking at a shiny new frontend framework, do not force yourself to learn them just because they are trending on Hacker News.

Trying to master disciplines outside your actual interests is like forcing a housecat to take up scuba diving. The cat will hate it, it will do a terrible job, and everyone involved will end up bleeding. Protect your cognitive load with ruthless aggression. As a DevOps professional, you have permission to focus solely on the infrastructure pipelines and Linux kernel quirks that actually bring you joy. Leave the trendy stuff to the people who actually like it.

Enter the hyperactive, infinitely patient robot intern

This brings us to the survival strategy. Artificial intelligence is often pitched as an omniscient overlord coming for our jobs. Right now, however, it is much more useful to view it as a hyperactive, infinitely patient intern. These LLMs exist to do the dirty work our cantaloupe brains reject.

They can read the soul-crushing, poorly translated documentation you desperately want to avoid. You can feed a brutal 50-page technical manual on IAM policies into an AI tool and instruct it to spit out a concise summary directly in your terminal. Or better yet, tell it to explain the concepts to you like you are a tired sysadmin who just wants to go home and play with their Mac. It saves hours of mental decay.

Curating your own survival kit

The trick is learning how to interrogate the AI properly. You do not just ask it “what is new in Terraform.” You demand that it extract the protein from the learning material and throw away the useless fat. You can ask it to summarize release notes, generate highly specific flashcards, or even act as a mock interviewer to test your knowledge of specific CI/CD pipelines before a migration. You are outsourcing the most painful parts of the learning curve to a machine that cannot feel pain or boredom.

The fine art of ignoring things

Ultimately, surviving this industry requires a liberating realization. You simply cannot know everything, and attempting to do so is a biological folly. To truly master the fine art of ignoring things, you need to implement a few practical, slightly ruthless habits.

First, practice strategic amnesia. Stop trying to memorize syntax. If an AI can generate the boilerplate YAML for a Kubernetes deployment in three seconds, your brain should actively refuse to store that information. Treat syntax like a disposable coffee cup; use it once and throw it away.

Second, stop hoarding documentation and start hoarding prompts. Your personal knowledge base should not be a graveyard of unread PDFs. It should be a collection of highly tuned, tested instructions that you can feed into an LLM to get exactly what you need, when you need it. Think of them as spells to summon your robot intern.
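A prompt hoard does not need special tooling. As a minimal sketch, assuming nothing more than the Python standard library, it can be a dictionary of templates. The template names, wording, and field names here are purely illustrative.

```python
from string import Template

# A tiny prompt library; template names and wording are illustrative only.
PROMPTS = {
    "release_notes": Template(
        "Summarize the $tool $version release notes in at most $bullets "
        "bullets. List breaking changes first and skip marketing language."
    ),
    "flashcards": Template(
        "Write $count question-and-answer flashcards about $topic for an "
        "experienced $role. One concept per card."
    ),
}


def render(name: str, **fields: str) -> str:
    """Fill a stored prompt; raises KeyError if a field is missing."""
    return PROMPTS[name].substitute(**fields)
```

Calling render("release_notes", tool="Terraform", version="X.Y", bullets="5") produces a ready-to-paste spell, and a forgotten field fails loudly instead of sending a half-finished prompt to the intern.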

Third, politely decline the buffet. When a vendor announces a revolutionary new tool that solves a problem you do not actually have, just nod, smile, and walk away. Your cognitive load is precious cargo. Do not fill the cargo bay with garbage.

The ultimate architectural achievement is not memorizing every obscure command-line flag. It is building a well-structured mind that understands the core principles and knows exactly how to extract the rest of the answers from an AI assistant. Let the machines hold the heavy encyclopedias. We need our brain space for the truly important mysteries, like figuring out why the production database just mysteriously vanished.

Chronicle of a death foretold for the EFK stack in high demand environments

Your monthly cloud infrastructure bill arrives in your inbox. You open the PDF document, and suddenly your left eyelid starts twitching uncontrollably. The finance department has started leaving passive-aggressive sticky notes on your monitor. You realize you are spending the equivalent of a small nation’s gross domestic product just to store text files that repeat “INFO: User logged in” three billion times a day. Welcome to the modern logging crisis.

For years, the Kubernetes logging ecosystem was basically on autopilot. You installed the EFK stack (Elasticsearch, Fluentd, and Kibana), and it just worked. It was the safest default in the industry. But as we navigate through 2026, something has fundamentally ruptured. EFK did not suddenly become toxic waste overnight. It simply became the victim of its own architecture in an era where log volumes have mutated into unrecognizable monsters.

The shift away from EFK is not driven by shiny object syndrome. It is driven by raw economics, hardware exhaustion, and the very human desire not to wake up sweating at 3 AM because a logging cluster ran out of disk space.

The golden retriever in the sausage factory

Let us start with Fluentd. Fluentd is incredibly stable, highly flexible, and has served the community well. However, it is written mostly in Ruby.

Under moderate loads, Fluentd is a perfectly polite guest. But when you expose it to the high-demand environments of modern microservices, Fluentd exhibits the same impulse control as an unsupervised Golden Retriever locked inside a sausage factory. It just eats all your available CPU and RAM until it physically cannot hold any more, burps an Out Of Memory error, and then politely demands that you scale it horizontally.

This operational overhead becomes exhausting. The industry needed something leaner. Enter the OpenTelemetry Collector. Written in Go, it processes telemetry data with the cold, calculated efficiency of an IRS auditor. It handles metrics, traces, and logs in a unified pipeline without treating your server’s memory like an all-you-can-eat buffet.

Here is what a modern, lightweight pipeline configuration looks like today, completely devoid of Ruby overhead:

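# OpenTelemetry Collector routing logs without eating your RAM
# (the filelog receiver and clickhouse exporter ship in the opentelemetry-collector-contrib distribution)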
receivers:
  filelog:
    include: [ /var/log/pods/*/*/*.log ]
exporters:
  clickhouse:
    endpoint: tcp://clickhouse-server:9000
    database: observability
service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [clickhouse]

Packing your socks in industrial hangars

The real villain in your cloud bill, however, is not the collector. It is the storage layer. Elasticsearch is an absolute marvel of engineering if you are trying to build a complex search engine for an e-commerce website. But using it exclusively to store application logs is an architectural tragedy.

Storing logs in Elasticsearch is like packing a single pair of socks in an individual cardboard box, wrapping that box in three layers of industrial bubble wrap, and attaching a GPS tracker to it. Yes, the inverted index structure guarantees that you will find those specific socks at the speed of light. But your luggage now occupies three entire aviation hangars, and the monthly rent is absurd. The indexing process creates massive data bloat, multiplying your storage footprint and your anxiety levels simultaneously.
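To see where the bubble wrap comes from, here is a toy inverted index in Python. It is a deliberately naive sketch: the function name and the whitespace tokenizer are assumptions for illustration, and real Elasticsearch adds analyzers, compression, and much more on top.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_inverted_index(lines: List[str]) -> Dict[str, Set[int]]:
    """Map every lowercased token to the set of line numbers containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in line.lower().split():
            index[token].add(line_no)
    return index


logs = [
    "INFO user logged in",
    "INFO user logged out",
    "ERROR payment failed",
]
index = build_inverted_index(logs)
# Every (token, line) pair is an extra entry stored next to the raw text.
postings = sum(len(line_ids) for line_ids in index.values())
```

Three short log lines already produce 11 postings across 8 distinct tokens. Lookups such as index["error"] are instant, but every one of those entries is something you pay to store.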

The bouillon cube of observability

This is where ClickHouse enters the scene and aggressively rewrites the rules. ClickHouse looks at your three hangars full of bubble-wrapped socks, throws them into an industrial shredder, and compresses the resulting mass into a super dense data bouillon cube.

ClickHouse relies on columnar storage and sparse indexes. It does not index every single word of your log lines. Instead, it compresses the data so tightly that your storage footprint shrinks to a fraction of what EFK required. And because developers already dream in SQL, querying this massive block of data feels entirely natural.
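The sparse index idea can be sketched in Python as well. This is a simplified illustration, not ClickHouse's implementation: the names are hypothetical, the granule size of 4 is a toy value (ClickHouse defaults to 8192 rows per granule), and the sketch assumes unique keys that are already sorted.

```python
from bisect import bisect_right
from typing import List, Tuple

GRANULE = 4  # rows per index mark; a toy value chosen for readability


def build_sparse_index(sorted_keys: List[int]) -> List[int]:
    """Keep only the first key of every granule instead of every key."""
    return sorted_keys[::GRANULE]


def rows_to_scan(marks: List[int], total_rows: int, key: int) -> Tuple[int, int]:
    """Return the [start, end) row range of the granule that may hold key."""
    # Find the last mark that is <= key; clamp to the first granule.
    granule = max(bisect_right(marks, key) - 1, 0)
    start = granule * GRANULE
    return start, min(start + GRANULE, total_rows)
```

For twelve sorted timestamps the index holds three marks instead of twelve entries, and a lookup narrows the search to one small block that is then scanned. The trade is a little extra reading at query time for a far smaller index.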

Instead of wrestling with a Kibana-specific query language just to find out why a payment failed, your team can simply run a query like this:

-- Finding errors without going bankrupt
SELECT
    toStartOfMinute(timestamp) AS minute,
    count() AS total_errors,
    dictGet('services', 'name', service_id) AS service_name
FROM application_logs
WHERE level = 'ERROR' AND timestamp > now() - INTERVAL 1 HOUR
GROUP BY minute, service_name
ORDER BY minute DESC;

Grafana sits on top of this SQL engine like a happy gargoyle, providing dashboarding capabilities comparable to what you used to get from Kibana, but with the added benefit of linking your logs directly to your OpenTelemetry metrics and traces.

Swapping tires on the highway

Now, a word of caution. The worst thing you can do after reading this is to march into your office and delete your Elasticsearch cluster.

Transitioning from EFK to the OpenTelemetry and ClickHouse stack overnight is the IT equivalent of trying to change your car tires while driving at 120 miles per hour down the highway. You will almost certainly lose the chassis in the process.

A migration requires a gradual cutover. You must deploy the OpenTelemetry Collector alongside your existing Fluentd setup. Route a small subset of non-critical logs to ClickHouse. Compare the ingestion rates. Let your team practice writing SQL queries to find errors. Only when you are confident that the bouillon cube is holding its shape should you start decommissioning the old, expensive hangars.
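One way to pick that small subset is a stable hash of the service name. The Python below is a sketch of the selection logic only. The function name and the percentage knob are assumptions, and in practice the decision would live in your collector or deployment configuration rather than in a script.

```python
import hashlib


def route_to_clickhouse(service: str, percent: int) -> bool:
    """Decide whether a service's logs join the ClickHouse pilot.

    Hashing the service name keeps the decision stable across restarts,
    so a given service never flip-flops between backends.
    """
    digest = hashlib.sha256(service.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100   # bucket in 0..99
    return bucket < percent
```

Raising percent from 5 to 20 only ever adds services to the pilot, because a service that lands in a low bucket stays below every higher threshold. That makes the rollout easy to reason about and easy to roll back.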

When to completely ignore my advice

To be perfectly fair, EFK is not dead for everyone. If your daily log volume fits comfortably on a standard thumb drive, or if your company enjoys setting fire to piles of corporate cash to keep the server room warm, EFK remains a wonderfully easy solution. If your team has zero experience operating SQL databases and relies heavily on managed Elasticsearch services, moving to ClickHouse might introduce more friction than it resolves.

But for the rest of the world, the verdict is clear. Do not migrate just because it is trendy. Migrate because your current system has become a financial bottleneck. If your Elasticsearch bill is the fastest-growing metric in your entire company, that is your signal. Run the numbers, evaluate the OpenTelemetry stack, and stop paying hangar prices for your socks.