As Kubernetes adoption grows, so does the desire to run everything—including stateful databases—inside clusters. While Kubernetes excels at managing stateless applications, running databases in containers presents unique challenges. This guide explores the best practices for running databases in Kubernetes, offering a deep technical dive for intermediate and advanced users.

Should You Run Databases in Kubernetes?

Before diving into technical details, it’s important to ask: Should you run your databases in Kubernetes?

Pros:

  • Unified platform for stateless and stateful apps
  • Automated scaling and orchestration
  • Easier DevOps pipeline integration
  • Platform-agnostic deployments

Cons:

  • Complexity in storage and persistence
  • Risk of data loss without careful configuration
  • Stateful applications require extra effort for high availability (HA)

If your team is prepared to handle these challenges, Kubernetes can be a powerful environment for running production-grade databases.

Use StatefulSets for Databases

Kubernetes provides StatefulSets to manage stateful applications like databases. Unlike Deployments, StatefulSets:

  • Assign each Pod a stable name and network identity (DNS)
  • Bind each Pod to its own PersistentVolumeClaim, which survives Pod restarts and rescheduling
  • Create, scale, and update Pods in a defined order

For example, a PostgreSQL StatefulSet named postgres, governed by a headless Service of the same name in the default namespace, gives each replica a predictable hostname such as:

postgres-0.postgres.default.svc.cluster.local
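
A minimal sketch of a StatefulSet that would produce the hostname above. It assumes a headless Service named postgres in the default namespace; the image tag, the Secret name postgres-secret, and the storage size are illustrative assumptions, not values from a real deployment.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres            # must match the headless Service name
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16       # assumed tag; pin whichever version you run
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret   # hypothetical Secret
                  key: password
            - name: PGDATA                # keep data in a subdirectory of the mount
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:            # one PVC per Pod: data-postgres-0, data-postgres-1, ...
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Note that this only starts three independent PostgreSQL instances. Replication between them still has to be configured by a tool such as Patroni or by an operator.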

Persistent Volume Management

Databases need persistent storage that outlives Pods. This is handled using:

  • Persistent Volume Claims (PVCs): Requested by StatefulSets
  • StorageClasses: Define how volumes are provisioned (e.g., AWS EBS, GCP PD, Ceph)

Best Practices

  • Use ReadWriteOnce (RWO) volumes for most single-node databases
  • Choose fast, durable storage (SSD-backed where possible)
  • Enable volume snapshots for disaster recovery
  • Avoid using ephemeral storage for production data

Example PVC template in a StatefulSet:

volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 10Gi
    storageClassName: fast-ssd
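
The fast-ssd class referenced in the template has to exist in the cluster. A possible definition is sketched below, assuming the AWS EBS CSI driver is installed; the provisioner and parameters would differ on other platforms.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com             # assumption: AWS EBS CSI driver
parameters:
  type: gp3                              # SSD-backed EBS volume type
reclaimPolicy: Retain                    # keep the volume if the PVC is deleted
allowVolumeExpansion: true               # allow growing the PVC later
volumeBindingMode: WaitForFirstConsumer  # provision in the zone where the Pod is scheduled
```

Setting reclaimPolicy to Retain trades automatic cleanup for protection against accidental deletion of the claim, which is usually the safer default for database volumes.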

Network Configuration and DNS

Databases often require stable endpoints for replication and client connections. A StatefulSet paired with a headless Service provides:

  • Stable per-Pod DNS names (essential for clustering)
  • Direct Pod access, with no load balancing between replicas

Example headless service:

apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432

Each replica is then resolvable from within the same namespace under a name like postgres-0.postgres, postgres-1.postgres, and so on.

Database Configuration and Initialization

Automate initialization with Kubernetes-native patterns:

  • Init Containers: Prepare configuration, schemas, or seed data before the database container starts
  • ConfigMaps/Secrets: Store configuration and credentials
  • Probes: Use liveness and readiness probes to ensure Pod health

Example readiness probe:

readinessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  initialDelaySeconds: 10
  periodSeconds: 5
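
To illustrate the ConfigMaps/Secrets item, here is a minimal sketch of the two objects. The names postgres-secret and postgres-config, the keys, and the values are all placeholders chosen for this example.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret        # hypothetical name
type: Opaque
stringData:                    # stringData accepts plain text; the API server stores it base64-encoded
  password: change-me          # placeholder; never commit real credentials
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config        # hypothetical name
data:
  POSTGRES_DB: appdb           # non-sensitive settings only
  POSTGRES_USER: postgres
```

A container can load every key of the ConfigMap with envFrom and a configMapRef, and read the password through a secretKeyRef on a single environment variable.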

Backup and Restore Strategies

Backups are critical in containerized environments. Use one of the following approaches:

  • Run scheduled jobs using CronJobs and tools such as pg_dump or mysqldump
  • Use sidecar containers to perform continuous backups to S3 or GCS
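  • Take volume snapshots using CSI (Container Storage Interface) drivers

Example: CronJob for MySQL Backup

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-backup
spec:
  schedule: "0 2 * * *"   # every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: mysql:8.0   # pinned tag (assumed version) instead of the implicit "latest"
            command:
              - /bin/sh
              - -c
              # Assumes the database is reachable through a Service named "mysql".
              - >-
                mysqldump -h mysql -u root -p"$MYSQL_PASSWORD" mydb
                > /backup/mydb-$(date +%F).sql
            env:
              - name: MYSQL_PASSWORD
                valueFrom:
                  secretKeyRef:
                    name: mysql-secret
                    key: password
            volumeMounts:
              - name: backup
                mountPath: /backup
          volumes:
            # Without a mounted volume the dump is lost when the Pod exits.
            - name: backup
              persistentVolumeClaim:
                claimName: mysql-backup   # assumed to exist already
          restartPolicy: OnFailure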
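
The CSI snapshot approach from the list above might look like the following sketch. It assumes the cluster has the snapshot CRDs and a CSI driver that supports snapshots; csi-snapclass is a hypothetical VolumeSnapshotClass name, and data-postgres-0 is the claim a StatefulSet named postgres would create from a template named data.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-data-snapshot           # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass # hypothetical; must exist in the cluster
  source:
    persistentVolumeClaimName: data-postgres-0
```

A snapshot taken this way captures the volume at a point in time without coordinating with the database, so it is best treated as crash-consistent and tested with a real restore.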

High Availability (HA) Patterns

A single-node database is a single point of failure (SPOF). For HA:

  • Use database-native clustering (e.g., Galera for MySQL, Patroni for PostgreSQL)
  • Combine with PodAntiAffinity to spread Pods across nodes
  • Use PersistentVolumes with replication for data durability

Example: Pod Anti-Affinity

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - postgres
      topologyKey: "kubernetes.io/hostname"

Security Considerations

Databases hold sensitive data. Secure them with:

  • Secrets for credentials (avoid hardcoding in YAML)
  • TLS encryption for database connections
  • RBAC and Network Policies to restrict access
  • Audit logging and monitoring
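
As one way to apply the Network Policies item, the sketch below admits traffic to the database Pods only on port 5432, and only from other database replicas and from Pods carrying a role: backend label. That label is a hypothetical convention for this example, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-allow-app       # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: postgres              # the policy applies to the database Pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: backend      # hypothetical label on client Pods
        - podSelector:
            matchLabels:
              app: postgres      # replicas must still reach each other
      ports:
        - protocol: TCP
          port: 5432
```

Because the policy selects the database Pods and lists Ingress, any traffic not matched by these rules is dropped.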

Monitoring and Observability

Instrument your database with:

  • Prometheus exporters (e.g., mysqld_exporter, postgres_exporter)
  • Grafana dashboards for performance metrics
  • Kubernetes Events and Logs for debugging
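
One common pattern is to run the exporter as a sidecar next to the database container. The fragment below is a sketch of that idea for postgres_exporter; the Secret name postgres-secret is hypothetical, and the image should be pinned to a specific tag in practice.

```yaml
# Fragment: an additional entry under spec.template.spec.containers of the StatefulSet
- name: postgres-exporter
  image: quay.io/prometheuscommunity/postgres-exporter   # pin a specific tag in practice
  ports:
    - name: metrics
      containerPort: 9187        # default postgres_exporter listen port
  env:
    - name: DATA_SOURCE_URI      # host:port/database, without credentials
      value: "localhost:5432/postgres?sslmode=disable"
    - name: DATA_SOURCE_USER
      value: postgres
    - name: DATA_SOURCE_PASS
      valueFrom:
        secretKeyRef:
          name: postgres-secret  # hypothetical Secret
          key: password
```

Prometheus can then scrape the metrics port on each Pod, and the resulting series can feed the Grafana dashboards mentioned above.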

When to Use External DBaaS

In some cases, a managed Database-as-a-Service (DBaaS) may be more appropriate:

  • Need for minimal operational overhead
  • High availability with SLA guarantees
  • Disaster recovery and compliance needs

Kubernetes-native mechanisms such as an ExternalName Service or a service mesh let in-cluster applications reach a DBaaS endpoint through a stable in-cluster name.
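
A minimal sketch of the ExternalName approach follows. The Service name orders-db and the hostname mydb.example.com are placeholders for whatever endpoint the DBaaS provider assigns.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: orders-db                  # hypothetical in-cluster name
spec:
  type: ExternalName
  externalName: mydb.example.com   # placeholder for the provider's endpoint
```

Applications connect to orders-db, and cluster DNS answers with a CNAME to the external host. No traffic is proxied, so ports, TLS, and authentication remain between the client and the provider.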

Conclusion

Running databases in Kubernetes requires thoughtful design around storage, networking, backups, and security. By leveraging StatefulSets, persistent volumes, and best practices like automated backups and readiness probes, you can reliably operate databases in a containerized world. Kubernetes is not just for stateless apps anymore—when done right, it’s a powerful platform for managing stateful services too.