Blog | Crunchy Data

Deploying Active-Active PostgreSQL on Kubernetes

Kubernetes is a widely used container orchestration framework. In this post, I’ll show you how to get Symmetric-DS working on a single Kubernetes instance.

I previously explored how to build an active-active PostgreSQL cluster using Symmetric-DS. The steps are essentially the same on Kubernetes:

  1. Start 2 PostgreSQL pods
  2. Create a user and a database on each pod/instance
  3. Start the primary symmetric-ds pod
  4. Add the symmetric-ds triggers and routes
  5. Open registration
  6. Start the secondary symmetric-ds pod
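
As a rough sketch of how these six steps might map onto commands, the Python snippet below only builds the command lines and prints them; it does not run anything. Every pod name, manifest file, user, database, engine name, node group, and the SQL file path is a hypothetical placeholder, and it assumes `psql` and `symadmin` are on the PATH inside the respective pods.

```python
import shlex

# All names below are hypothetical placeholders, not values from this post.
PG_PODS = ["pg-primary", "pg-secondary"]
DB_USER = "rep"
DB_NAME = "sales"


def kubectl_exec(pod, *cmd):
    """Build (but do not run) a `kubectl exec` command line."""
    return ["kubectl", "exec", pod, "--", *cmd]


def build_plan():
    """Return the six steps as an ordered list of (step, argv) pairs."""
    plan = []
    # 1. Start 2 PostgreSQL pods (manifest file names are assumed).
    for pod in PG_PODS:
        plan.append((1, ["kubectl", "apply", "-f", f"{pod}.yaml"]))
    # 2. Create a user and a database on each instance.
    for pod in PG_PODS:
        for sql in (
            f"CREATE USER {DB_USER} WITH PASSWORD 'changeme'",
            f"CREATE DATABASE {DB_NAME} OWNER {DB_USER}",
        ):
            plan.append((2, kubectl_exec(pod, "psql", "-U", "postgres", "-c", sql)))
    # 3. Start the primary symmetric-ds pod.
    plan.append((3, ["kubectl", "apply", "-f", "sym-primary.yaml"]))
    # 4. Add triggers and routes from an assumed SQL file inside the pod.
    plan.append((4, kubectl_exec(
        "pg-primary", "psql", "-U", DB_USER, "-d", DB_NAME,
        "-f", "/tmp/triggers_routes.sql")))
    # 5. Open registration for the secondary node (group and ID are assumed).
    plan.append((5, kubectl_exec(
        "sym-primary", "symadmin", "--engine", "primary",
        "open-registration", "secondary", "001")))
    # 6. Start the secondary symmetric-ds pod.
    plan.append((6, ["kubectl", "apply", "-f", "sym-secondary.yaml"]))
    return plan


if __name__ == "__main__":
    for step, argv in build_plan():
        print(f"step {step}: {shlex.join(argv)}")
```

Printing the plan first makes the ordering easy to review: registration is opened on the primary before the secondary symmetric-ds pod starts, so the secondary can register as soon as it comes up.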

However, there are some interesting nuances I discovered while building out this setup, which I will discuss below. Before diving into the details, I'd also like to credit my colleagues Patrick McLaughlin and Simon Nielly as co-authors for helping me to put together this guide.

Anyway, let's explore how to deploy an active-active PostgreSQL cluster on Kubernetes.

A Guide to Building an Active-Active PostgreSQL Cluster

One of the toughest challenges facing database engineers today is ensuring their data is always accessible so they can meet the high-availability requirements for their applications.

This problem may seem trivial in the world of applications, where one can run many instances behind geographically distributed load balancers. In the database world, where there is only one version of the data globally, it is much harder to solve.

PostgreSQL replication has advanced considerably in recent major releases, including continuous improvements to streaming replication and the addition of logical replication in PostgreSQL 10. While these capabilities represent important enhancements to PostgreSQL, enabling users to address a wide variety of clustered architectures, they do not (easily) address the use cases where the application requires access to an updatable database in two or more geographic regions - often referred to as an "active-active" cluster.

More specifically, an active-active cluster is one where the application can write to any instance in the cluster and the data will be written to all of the instances in the cluster, enabling each instance in the cluster to be used to:

  • Provide near-zero downtime on failover, as the other instance is already in a read/write state; there is no need to reconfigure it.
  • Provide near-zero-downtime upgrades from one version to another.
  • Improve latency for users in geographically distributed clusters. By providing an instance physically closer to the user, latency is reduced.
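
To make the "write anywhere, data ends up everywhere" idea concrete, here is a toy Python model. It is only an illustration under simplifying assumptions: the `Node` class and `sync` function are invented for this sketch, they are not how Symmetric-DS or PostgreSQL is implemented, and conflicting writes to the same key on different nodes are not handled at all.

```python
class Node:
    """Toy stand-in for one database instance plus its replication agent."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.outbox = []  # locally captured changes not yet sent to peers

    def write(self, key, value):
        """Application write: apply locally and capture it for replication."""
        self.rows[key] = value
        self.outbox.append((key, value))

    def apply_remote(self, key, value):
        """Apply a peer's change without re-capturing it, to avoid loops."""
        self.rows[key] = value


def sync(nodes):
    """Send every node's captured changes to all of its peers."""
    for source in nodes:
        changes, source.outbox = source.outbox, []
        for target in nodes:
            if target is not source:
                for key, value in changes:
                    target.apply_remote(key, value)


east, west = Node("east"), Node("west")
east.write("order:1", "paid")  # a user near the east instance writes there
west.write("order:2", "new")   # a user near the west instance writes there
sync([east, west])
print(east.rows == west.rows)  # both instances now hold both rows
```

Because both nodes accept writes at all times, either one can take over immediately if the other goes away, which is what the bullets above rely on.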

While there are a number of proprietary solutions that attempt to address active-active PostgreSQL requirements, this post and the series to follow provide users with potential reference architectures and configurations that enable active-active PostgreSQL using entirely open source software.

This post, of course, covers only one approach to deploying an active-active PostgreSQL cluster. There are other ways to build such a setup, and I will cover some of them in the future - stay tuned!