The Multigres Operator is a Kubernetes operator for managing distributed, sharded PostgreSQL clusters across multiple failure domains (zones or regions). It provides a unified API to define the topology of your database system, handling the complex orchestration of shards, cells (failure domains), and gateways.
- Global Cluster Management: Single source of truth (`MultigresCluster`) for the entire database topology.
- Automated Sharding: Manages `TableGroups` and `Shards` as first-class citizens.
- Failover & High Availability: Orchestrates Primary/Standby failovers across defined Cells.
- Template System: Define configuration once (`CoreTemplate`, `CellTemplate`, `ShardTemplate`) and reuse it across the cluster.
- Hierarchical Defaults: Smart override logic allowing for global defaults, namespace defaults, and granular overrides.
- Integrated Cert Management: Built-in self-signed certificate generation and rotation for validating webhooks, with optional support for `cert-manager`.
- Kubernetes v1.25+
- `cert-manager` (optional, if using external certificate management)
To install the operator with default settings:
# Install CRDs
make install
# Deploy Operator to the cluster (uses your current kubeconfig context)
make deploy

For local testing using Kind, we provide several helper commands:
| Command | Description |
|---|---|
| `make kind-deploy` | Deploy operator to local Kind cluster using self-signed certs (default). |
| `make kind-deploy-certmanager` | Deploy operator to Kind, installing cert-manager for certificate handling. |
| `make kind-deploy-no-webhook` | Deploy operator to Kind with the webhook fully disabled. |
We provide a set of samples to get you started quickly:
| Sample | Description |
|---|---|
| `config/samples/minimal.yaml` | A minimal startup cluster for testing. |
| `config/samples/templated-cluster.yaml` | A full cluster example using templates. |
| `config/samples/standard-templates.yaml` | Examples of Core, Cell, and Shard templates. |
To apply a sample:
kubectl apply -f config/samples/minimal.yaml

The Multigres Operator follows a Parent/Child architecture. You, the user, manage the Root resource (`MultigresCluster`) and its shared Templates. The operator automatically creates and reconciles all necessary child resources (Cells, TableGroups, Shards, TopoServers) to match your desired state.
[MultigresCluster] (Root CR - User Editable)
│
├── Defines [TemplateDefaults] (Cluster-wide default templates)
│
├── [GlobalTopoServer] (Child CR) ← Uses [CoreTemplate] OR inline [spec]
│
├── MultiAdmin Resources ← Uses [CoreTemplate] OR inline [spec]
│
├── [Cell] (Child CR) ← Uses [CellTemplate] OR inline [spec]
│   │
│   ├── MultiGateway Resources
│   └── [LocalTopoServer] (Child CR, optional)
│
└── [TableGroup] (Child CR)
    │
    └── [Shard] (Child CR) ← Uses [ShardTemplate] OR inline [spec]
        │
        ├── MultiOrch Resources (Deployment/Pod)
        └── Pools (StatefulSets for Postgres+MultiPooler)

[CoreTemplate] (User-editable, scoped config)
├── globalTopoServer
└── multiadmin

[CellTemplate] (User-editable, scoped config)
├── multigateway
└── localTopoServer (optional)

[ShardTemplate] (User-editable, scoped config)
├── multiorch
└── pools (postgres + multipooler)
Important:
- Only `MultigresCluster`, `CoreTemplate`, `CellTemplate`, and `ShardTemplate` are meant to be edited by users.
- Child resources (`Cell`, `TableGroup`, `Shard`, `TopoServer`) are read-only. Any manual changes to them will be immediately reverted by the operator to ensure the system stays in sync with the root configuration.
The operator uses a 4-Level Override Chain to resolve configuration for every component. This allows you to keep your MultigresCluster spec clean while maintaining full control when needed.
When determining the configuration for a component (e.g., a Shard), the operator looks for configuration in this order:
1. Inline Spec / Explicit Template Ref: Defined directly on the component in the `MultigresCluster` YAML.
2. Cluster-Level Template Default: Defined in `spec.templateDefaults` of the `MultigresCluster`.
3. Namespace-Level Default: A template of the correct kind (e.g., `ShardTemplate`) named `"default"` in the same namespace.
4. Operator Hardcoded Defaults: Fallback values built into the operator webhook.
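As a rough illustration of where the first three levels live, consider the fragment below. It is a sketch, not a complete manifest: the `apiVersion`, the `cellTemplate` key under `templateDefaults`, the namespace, and the resource names `demo` and `special-cell` are assumptions made for illustration, and required fields are omitted. See `config/samples/standard-templates.yaml` for real template bodies.

```yaml
# Level 3: a ShardTemplate named "default" in the cluster's namespace.
# Shards with no explicit template and no cluster-level default fall back to it.
apiVersion: multigres.com/v1alpha1      # assumed API group/version
kind: ShardTemplate
metadata:
  name: default
  namespace: my-namespace               # hypothetical namespace
# spec omitted in this sketch
---
apiVersion: multigres.com/v1alpha1      # assumed API group/version
kind: MultigresCluster
metadata:
  name: demo                            # hypothetical cluster name
  namespace: my-namespace
spec:
  # Level 2: cluster-wide template defaults.
  templateDefaults:
    cellTemplate: "standard-ha-cell"    # key name assumed
  cells:
    # Level 1: an explicit template reference on the component wins.
    - name: "us-east-1a"
      cellTemplate: "special-cell"      # hypothetical template name
    # No explicit reference: resolves to "standard-ha-cell" via level 2.
    - name: "us-east-1b"
```

In this sketch, any component with no match at levels 1 to 3 (for example, the core components when no `CoreTemplate` named `default` exists) would receive the operator's hardcoded defaults.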
Templates allow you to define standard configurations (e.g., "Standard High-Availability Cell"). You can then apply specific overrides on top of a template.
Example: Using a Template with Overrides
spec:
  cells:
    - name: "us-east-1a"
      cellTemplate: "standard-ha-cell" # <--- Uses the template
      overrides:                       # <--- Patches specific fields
        multigateway:
          replicas: 5                  # <--- Overrides only the replica count

Note on Overrides: When using overrides, you must provide the complete struct for the section you are overriding if it's a pointer. For specific fields like resources, it's safer to ensure you provide the full context if the merge behavior isn't granular enough for your needs (currently, the resolver performs a deep merge).
The operator includes a Mutating and Validating Webhook to enforce defaults and data integrity.
By default, the operator manages its own certificates.
- Bootstrap: On startup, it checks the directory `/var/run/secrets/webhook` for existing `tls.crt` and `tls.key` files.
- Generation: If none are found, it generates a self-signed CA and a server certificate.
- Patching: It automatically patches the `MutatingWebhookConfiguration` and `ValidatingWebhookConfiguration` in the cluster with the new CA bundle.
- Rotation: Certificates are automatically rotated (default: every 30 days) without downtime.
If you prefer to use cert-manager or another external tool, you must mount the certificates to the operator pod.
- Configuration: Mount a secret containing `tls.crt` and `tls.key` to `/var/run/secrets/webhook`.
- Behavior: The operator detects the files on disk and disables its internal rotation logic.
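If you use cert-manager, a secret of that shape could be produced with something like the following sketch. The Issuer and Certificate names are hypothetical, and the DNS names assume the default webhook service name (`multigres-operator-webhook-service`) running in the `multigres-system` namespace. How the CA bundle reaches the webhook configurations in this mode is not covered here, so treat this as a starting point only.

```yaml
# Self-signed issuer for illustration; swap in your own Issuer or ClusterIssuer.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: multigres-webhook-selfsigned    # hypothetical name
  namespace: multigres-system
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: multigres-webhook-cert          # hypothetical name
  namespace: multigres-system
spec:
  # cert-manager writes tls.crt and tls.key into this Secret.
  secretName: my-cert-manager-secret
  dnsNames:
    # Assumes the default --webhook-service-name and this namespace.
    - multigres-operator-webhook-service.multigres-system.svc
    - multigres-operator-webhook-service.multigres-system.svc.cluster.local
  issuerRef:
    kind: Issuer
    name: multigres-webhook-selfsigned
```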
Example Patch to Mount External Certs:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: multigres-operator-controller-manager
  namespace: multigres-system
spec:
  template:
    spec:
      containers:
        - name: manager
          volumeMounts:
            - mountPath: /var/run/secrets/webhook
              name: cert-volume
              readOnly: true
      volumes:
        - name: cert-volume
          secret:
            defaultMode: 420
            secretName: my-cert-manager-secret # Your secret name

You can customize the operator's behavior by passing flags to the binary (or editing the Deployment args).
| Flag | Default | Description |
|---|---|---|
| `--webhook-enable` | `true` | Enable the admission webhook server. |
| `--webhook-cert-dir` | `/var/run/secrets/webhook` | Directory to read/write webhook certificates. |
| `--webhook-service-name` | `multigres-operator-webhook-service` | Name of the Service pointing to the webhook. |
| `--webhook-service-namespace` | Current namespace | Namespace of the webhook service. |
| `--default-core-template` | `"default"` | Name of the CoreTemplate to use as namespace fallback. |
| `--default-cell-template` | `"default"` | Name of the CellTemplate to use as namespace fallback. |
| `--default-shard-template` | `"default"` | Name of the ShardTemplate to use as namespace fallback. |
| `--metrics-bind-address` | `"0"` | Address for metrics (set to `:8080` to enable). |
| `--leader-elect` | `false` | Enable leader election (recommended for HA deployments). |
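For example, a Deployment patch along these lines could turn on leader election and metrics. It is a sketch that assumes the same Deployment and container names used in the certificate example; the `--flag=value` form and the template name `standard-shard` are illustrative assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: multigres-operator-controller-manager
  namespace: multigres-system
spec:
  template:
    spec:
      containers:
        - name: manager
          args:
            - --leader-elect=true                       # enable leader election for HA
            - --metrics-bind-address=:8080              # serve metrics on port 8080
            - --default-shard-template=standard-shard   # hypothetical template name
```

If you apply this as a strategic merge patch, the `args` list is replaced as a whole rather than merged, so include any arguments the Deployment already sets.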
Please be aware of the following constraints in the current version:
- Database Limit: Only 1 database is supported per cluster. It must be named `postgres` and marked `default: true`.
- Shard Naming: Shards currently must be named `0-inf`; this is a limitation of the current implementation of Multigres.
- Naming Lengths:
  - TableGroup Names: If the combined name (`cluster-db-tg`) exceeds 28 characters, the operator automatically hashes the database and tablegroup names to ensure that the resulting child resource names (Shards, Pods, StatefulSets) stay within Kubernetes limits (63 chars).
  - Cluster Name: Recommended to be under 20 characters to ensure that even with hashing, suffixes fit comfortably.
- Immutable Fields: Some fields like `zone` and `region` in Cell definitions are immutable after creation.