Your cluster is not as secure as you think — common K8s security gaps

2026-04-08 Rico Twesten-Weber Principal DevOps Engineer

security, kubernetes, hardening, devops

Last month I ran kube-bench against my homelab K3s cluster. I expected a clean report. I’ve been running Kubernetes professionally for years. I write about security on this blog. I should know better.

I found four out of the five gaps I’m about to describe.

If a DevOps engineer’s personal cluster, one maintained by someone who thinks about this stuff daily, has these issues, your company’s production cluster almost certainly does too.

Gap 1: Pods running as root

This is the most common gap and the easiest to fix. By default, Kubernetes doesn’t enforce any security context on pods. If you don’t set runAsNonRoot: true, the container runs as whatever user its image specifies, and most official container images run as root unless the Dockerfile explicitly sets a different user.

Why this matters: a container running as root can modify its own filesystem, install packages, and potentially escape the container boundary through kernel exploits. Container isolation is good, but it’s not perfect. Running as root inside the container means that any escape gives the attacker root on the node.

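The fix is straightforward. Add a security context to every container in your pod specs (runAsNonRoot can also be set once at the pod level, but the other fields only exist on containers):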

securityContext:
  runAsNonRoot: true
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL

readOnlyRootFilesystem prevents the container from writing to its own filesystem, which blocks many post-exploitation techniques such as dropping tools or replacing binaries. allowPrivilegeEscalation: false prevents processes from gaining more privileges than their parent. Dropping all capabilities removes kernel-level permissions that containers rarely need.

Some applications break with these settings. They need to write to /tmp or specific directories. Use emptyDir volume mounts for those paths instead of opening up the entire filesystem.
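As a minimal sketch of that pattern, the Pod below combines the hardened security context with an emptyDir mounted at /tmp. The pod name, image, UID, and size limit are placeholders rather than values from a real deployment, and the image is assumed to run correctly as a non-root user.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                 # hypothetical name
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001           # assumed UID; any non-zero UID the image supports
  containers:
    - name: app
      image: registry.example.com/my-app:1.0.0   # placeholder image
      securityContext:
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
      volumeMounts:
        - name: tmp
          mountPath: /tmp      # writable scratch space; the root filesystem stays read-only
  volumes:
    - name: tmp
      emptyDir:
        sizeLimit: 64Mi        # assumed cap so scratch space stays bounded
```

If the application writes somewhere else, such as a cache or PID directory, add one more emptyDir mount for that path rather than dropping readOnlyRootFilesystem.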

Gap 2: No network policies

By default, every pod in a Kubernetes cluster can talk to every other pod. Any namespace, any service. There’s no segmentation and no isolation. A compromised pod in your staging namespace can reach your production database without restriction.

Most teams know this is a problem. Most teams haven’t fixed it.

The approach I recommend: start with a deny-all default policy in every namespace, then explicitly allow only the traffic that’s needed.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

This blocks all inbound and outbound traffic for every pod in the namespace. Then you add specific policies that allow your application pods to reach their databases, your ingress controller to reach your application pods, and your pods to reach the cluster’s DNS service. Miss the DNS rule and name resolution fails for everything in the namespace.
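Here is a rough sketch of what two of those allow rules could look like. The app and postgres labels and port 5432 are hypothetical, and the k8s-app: kube-dns selector is an assumption that matches the usual CoreDNS labels, so check what your DNS pods actually carry before relying on it.

```yaml
# Let every pod in the namespace resolve names through cluster DNS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns    # assumed label; verify on your cluster
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
# Let only the application pods reach the database on its port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgres                # hypothetical database label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: my-app          # hypothetical application label
      ports:
        - protocol: TCP
          port: 5432
```

Because the deny-all policy also covers egress, the application pods need a matching egress rule toward the database as well. Allowing ingress on the database side is only half of the path.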

It’s tedious work. You need to map every legitimate communication path before you enable deny-all, or you’ll break things. But that mapping exercise is worth doing on its own. If you can’t describe what your pods need to talk to, you don’t understand your system well enough to secure it.

One caveat: network policies require a CNI plugin that supports them. The default kubenet in some distributions doesn’t enforce network policies even if you create them. Calico, Cilium, and Weave all support them. Verify that your CNI actually enforces the policies you write.

Gap 3: ServiceAccount token auto-mounting

Every pod in Kubernetes automatically receives a ServiceAccount token mounted at /var/run/secrets/kubernetes.io/serviceaccount/token. This token allows the pod to authenticate to the Kubernetes API server.

Most pods never use it.

But if an attacker compromises your application and finds that token, they can query the Kubernetes API. Depending on the ServiceAccount’s RBAC permissions (often the default ServiceAccount, which in many clusters has been granted more access than you’d expect), they can list pods, read secrets, or worse.

The fix:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
automountServiceAccountToken: false

Set automountServiceAccountToken: false on every ServiceAccount that doesn’t need API access. For the few pods that do need to talk to the Kubernetes API (operators, controllers, monitoring agents), create dedicated ServiceAccounts with specific RBAC roles scoped to only what they need.
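The same field also exists on the pod spec, where it takes precedence over the ServiceAccount setting. The sketch below shows that opt-out, plus a dedicated ServiceAccount with a narrow Role for a workload that does need the API. All names (pod-watcher, pod-reader, the image) are made up for illustration.

```yaml
# Opt a single pod out, whatever its ServiceAccount says.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: production
spec:
  serviceAccountName: my-app
  automountServiceAccountToken: false
  containers:
    - name: app
      image: registry.example.com/my-app:1.0.0   # placeholder image
---
# A dedicated identity for a workload that needs to read pods.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-watcher                # hypothetical name
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader                 # hypothetical name
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-watcher-reads-pods
  namespace: production
subjects:
  - kind: ServiceAccount
    name: pod-watcher
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

A Role and RoleBinding keep the access inside one namespace. Reach for ClusterRole and ClusterRoleBinding only when the workload really has to see the whole cluster.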

This is a one-line change that removes an entire attack surface. The fact that Kubernetes doesn’t default to this is a design decision that prioritizes convenience over security.

Gap 4: No resource limits

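A pod without resource limits can consume all available CPU and memory on its node. One runaway process or memory leak, and the node runs out of memory. The kernel’s OOM killer and the kubelet’s eviction logic then start terminating pods, and not necessarily the one that caused the problem.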

This isn’t hypothetical. I’ve seen a logging sidecar with a memory leak take down a three-node production cluster on a Friday afternoon. The sidecar consumed all memory on its node, Kubernetes tried to reschedule workloads to other nodes, those nodes also had pods without limits, and the cascade killed the cluster in under ten minutes.

resources:
  requests:
    memory: "128Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"
    cpu: "500m"

Set resource requests and limits on every container. Requests determine scheduling (where the pod lands). Limits determine the ceiling: a container that hits its CPU limit gets throttled, and one that exceeds its memory limit gets OOM-killed.

Getting the numbers right takes observation. Run your workloads, watch actual resource consumption with metrics-server or Prometheus, and set limits based on real usage patterns plus a reasonable buffer. Don’t guess. Don’t copy numbers from a tutorial. Your workload’s resource profile is specific to your workload.

LimitRange objects can set default limits for a namespace, so pods that forget to specify limits still get constrained. This is worth configuring as a safety net.
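A minimal LimitRange along those lines might look like this. The object name is hypothetical and the numbers simply reuse the example values from above, so treat them as placeholders until you have measured your own workloads.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-limits   # hypothetical name
  namespace: production
spec:
  limits:
    - type: Container
      defaultRequest:              # applied when a container omits requests
        cpu: 100m
        memory: 128Mi
      default:                     # applied when a container omits limits
        cpu: 500m
        memory: 256Mi
```

The defaults are applied when a pod is admitted, so pods that already exist keep running unconstrained until they are recreated.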

Gap 5: Stale RBAC

RBAC accumulates. A developer needed cluster-admin access to debug a production issue six months ago. That ClusterRoleBinding still exists. The CI service account from a project you decommissioned in 2024 still has deployment permissions across every namespace. The monitoring tool you evaluated and rejected still has read access to secrets.

Nobody removes RBAC permissions proactively. Someone adds them when something breaks and nobody cleans them up when the need passes.

The fix isn’t technical. It’s procedural. Review RBAC bindings quarterly. For every ClusterRoleBinding and RoleBinding, ask: does this principal still need this access? Is the scope still appropriate? Can this be narrowed?

kubectl get clusterrolebindings -o json |
  jq '.items[] | select(.roleRef.name=="cluster-admin") | .subjects[]?'

Run that command against your cluster right now. Count the subjects with cluster-admin. If the number surprises you, that’s the gap.

Use tools like kubectl-who-can to audit specific permissions. “Who can delete pods in production?” “Who can read secrets in the kube-system namespace?” The answers are often uncomfortable.
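When a review concludes that a binding can be narrowed, the replacement is often just a namespaced RoleBinding to one of the built-in roles. As a sketch, this grants read-only access in a single namespace through the built-in view ClusterRole, which does not include reading secrets. The binding name and user are placeholders.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: debug-read-only            # hypothetical name
  namespace: production
subjects:
  - kind: User
    name: dev@example.com          # placeholder principal
    apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view                       # built-in read-only role
```

Binding a ClusterRole through a RoleBinding limits it to the binding’s namespace, which is usually all that a debugging session needs.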

What to do about it

Don’t try to fix all five at once. That’s a recipe for breaking things and losing team buy-in. Instead:

Start with kube-bench. Run the CIS Kubernetes Benchmark against your cluster. It will find these gaps and dozens more, with each check marked PASS, FAIL, or WARN and a remediation note for the failures. It gives you a concrete checklist instead of vague anxiety.

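Then adopt Pod Security Standards. Kubernetes has built-in Pod Security Admission, which applies these standards per namespace. Set the restricted profile on non-system namespaces. It doesn’t modify pods. It rejects any pod that fails to set runAsNonRoot, drop all capabilities, or disable privilege escalation, so the settings from Gap 1 become mandatory rather than optional.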
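Pod Security Admission is driven by namespace labels. A sketch for a namespace called production could look like this. Pinning enforce-version to latest is one possible choice here, not a recommendation from the benchmark.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Reject pods that violate the restricted profile.
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Also surface violations as client warnings and audit annotations.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

A cautious rollout is to apply only the warn and audit labels first, see which workloads would be rejected, fix them, and add enforce afterwards.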

Review network policies and RBAC quarterly. Put it on the calendar. Treat it like a recurring maintenance task, not a one-time project. Network topologies change, team members rotate, services get decommissioned. Your security posture needs to track those changes.

Enforce resource limits at the namespace level with LimitRange and ResourceQuota objects. Don’t rely on developers remembering to add limits to every pod spec.
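For the quota side, a ResourceQuota caps what the whole namespace can request. The numbers below are invented for illustration and should come from your own capacity planning.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota              # hypothetical name
  namespace: production
spec:
  hard:
    requests.cpu: "4"              # sum of CPU requests across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"                # sum of CPU limits across the namespace
    limits.memory: 16Gi
    pods: "50"
```

Once a quota covers CPU or memory, pods that don’t specify those values are rejected unless a LimitRange in the same namespace supplies defaults, so the two objects work best as a pair.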

Security isn’t something you achieve once and forget about. It’s a practice. Your cluster drifts toward insecurity unless you actively maintain it. Audit regularly and fix incrementally. You’ll always find something new, and that’s fine. That’s the work.