Why I Run Nomad Instead of Kubernetes (And Sleep Better for It)

Every time I mention we run HashiCorp Nomad instead of Kubernetes, I get the same reaction: a slight pause, then “…why?” The assumption is that Kubernetes is the obvious choice for container orchestration, and anything else is either legacy or contrarian.

Neither is true. Here’s the actual reasoning.

The Problem with Defaulting to Kubernetes

Kubernetes is an extraordinary piece of software. It solves genuinely hard problems at genuinely large scale. But it also brings enormous operational complexity — and that complexity has a cost that doesn’t show up until you’re the one maintaining it at 11pm on a Friday.

For a healthcare tech company like ours, with a focused engineering team and a well-understood set of workloads, Kubernetes would have meant four things: a steep learning curve for every engineer who touches infrastructure; YAML sprawl across dozens of resource definitions per service; a control plane to operate and keep healthy; and a significant investment in tooling just to make Kubernetes usable (Helm, ArgoCD, cert-manager, external-secrets…).

We didn’t have a Kubernetes problem. We had a “run our .NET services reliably across a fleet of VMs” problem. Those are different problems.

What Nomad Actually Gives Us

Nomad is a single binary. The server cluster is simple to operate. Job definitions are HCL — readable, version-controllable, and learnable in an afternoon. It handles our workloads — Docker containers, raw exec tasks, Java apps — without requiring us to think in terms of pods, deployments, replicasets, and services.

Combined with Consul for service discovery and health checking, we have a service networking setup that matches our mental model of what we’re deploying. Services register themselves, health checks propagate automatically, and traffic only reaches instances that pass their checks. The operational surface area is small enough that any engineer on the team can reason about it. That matters more than almost anything else.
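To make that concrete, here is a rough sketch of a job file for one containerized service. It is not our production config: the job name, datacenter, image, port, and health-check path are all hypothetical, and it assumes the Docker driver is enabled on the Nomad clients and that Nomad is already configured to talk to Consul.

```hcl
# Hypothetical example: every name, the image, and the port are placeholders.
job "billing-api" {
  datacenters = ["dc1"]
  type        = "service"

  group "api" {
    # Run two instances of this group across the fleet.
    count = 2

    network {
      # Nomad picks a free host port and maps it to 8080 in the container.
      port "http" {
        to = 8080
      }
    }

    # Registered in Consul when the allocation starts; the check below
    # decides whether the instance is reported as healthy.
    service {
      name = "billing-api"
      port = "http"

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "server" {
      driver = "docker"

      config {
        image = "registry.example.com/billing-api:1.0.0"
        ports = ["http"]
      }

      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```

Saved as, say, billing-api.nomad.hcl, it can be previewed with `nomad job plan billing-api.nomad.hcl` and submitted with `nomad job run billing-api.nomad.hcl`. The whole thing is one file with one nesting pattern (job, group, task), which is most of what I mean by "learnable in an afternoon."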

When I Would Choose Kubernetes

I’m not anti-Kubernetes. If we were running dozens of teams deploying hundreds of services independently, or needed the ecosystem of tooling built around it, the calculus would change. But for teams under ~50 engineers with a manageable service count, I’d ask seriously: do you have a Kubernetes-scale problem? Or do you have a “reliably run services” problem that a simpler tool solves better?

Boring technology is good technology. Nomad is boring in the best possible way.
