4 changes: 4 additions & 0 deletions docs/reference/cluster_manifest.md
@@ -48,6 +48,10 @@ Those parameters are grouped under the `metadata` top-level key.
Labels that are set here but not listed as `inherited_labels` in the operator
parameters are ignored.

* **annotations**
A map of annotations to add to the `postgresql` resource. The operator reacts to certain annotations by triggering specific actions.
* `postgres-operator.zalando.org/action: restore-in-place`: when this annotation is present with this value, the operator triggers an automated in-place restore of the cluster. This process requires a valid `clone` section in the manifest with a target `timestamp`. See the [user guide](../user.md#automated-restore-in-place-point-in-time-recovery) for more details.

## Top-level parameters

These parameters are grouped directly under the `spec` key in the manifest.
6 changes: 6 additions & 0 deletions docs/reference/operator_parameters.md
@@ -9,6 +9,7 @@ configuration.
Variable names are underscore-separated words.

### ConfigMaps-based

The configuration is supplied in a
key-value configmap, defined by the `CONFIG_MAP_NAME` environment variable.
Non-scalar values, i.e. lists or maps, are encoded in the value strings using
@@ -25,6 +26,7 @@ operator CRD, all the CRD defaults are provided in the
[operator's default configuration manifest](https://github.com/zalando/postgres-operator/blob/master/manifests/postgresql-operator-default-configuration.yaml)

### CRD-based configuration

The configuration is stored in a custom YAML
manifest. The manifest is an instance of the custom resource definition (CRD)
called `OperatorConfiguration`. The operator registers this CRD during the
@@ -187,6 +189,9 @@ Those are top-level keys, containing both leaf keys and groups.
* **repair_period**
period between consecutive repair requests. The default is `5m`.

* **pitr_backup_retention**
retention time for the ConfigMaps that store PITR (point-in-time recovery) state. The operator cleans up state ConfigMaps older than the configured retention. The value is a [duration string](https://pkg.go.dev/time#ParseDuration), e.g. `"24h"` or `"168h"` (7 days). The default is `168h`.

* **set_memory_request_to_limit**
Set `memory_request` to `memory_limit` for all Postgres clusters (the default
value is also increased but configured `max_memory_request` can not be
@@ -934,6 +939,7 @@ key.
```yaml
teams_api_role_configuration: "log_statement:all,search_path:'data,public'"
```

The default is `"log_statement:all"`.

* **enable_team_superuser**
40 changes: 40 additions & 0 deletions docs/user.md
@@ -891,6 +891,45 @@ original UID, making it possible to retry restoring. However, it is probably
better to create a temporary clone for experimenting or finding out to which
point you should restore.

## Automated restore in place (point-in-time recovery)

The operator supports automated in-place restores, allowing you to restore a database to a specific point in time without changing connection strings on the application side. This feature orchestrates the deletion of the current cluster and the creation of a new one from a backup.

:warning: This is a destructive operation. The existing cluster's StatefulSet and pods will be deleted as part of the process. Ensure you have a reliable backup strategy and have tested the restore process in a non-production environment.

To trigger an in-place restore, you need to add a special annotation and a `clone` section to your `postgresql` manifest:

* **Annotate the manifest**: Add the `postgres-operator.zalando.org/action: restore-in-place` annotation to the `metadata` section.
* **Specify the recovery target**: Add a `clone` section to the `spec`, providing the `cluster` name and the `timestamp` for the point-in-time recovery. The `cluster` name **must** be the same as the `metadata.name` of the cluster you are restoring. The `timestamp` must be in RFC 3339 format and point to a time in the past for which you have WAL archives.

Here is an example manifest snippet:

```yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
  annotations:
    postgres-operator.zalando.org/action: restore-in-place
spec:
  # ... other cluster parameters
  clone:
    cluster: "acid-minimal-cluster"  # Must match metadata.name
    uid: "<original_UID>"
    timestamp: "2022-04-01T10:11:12+00:00"
  # ... other cluster parameters
```

When you apply this manifest, the operator will:
* See the `restore-in-place` annotation and begin the restore workflow.
* Store the restore request and the new cluster definition in a temporary `ConfigMap`.
* Delete the existing `postgresql` custom resource, which triggers the deletion of the associated StatefulSet and pods.
* Wait for the old cluster to be fully terminated.
* Create a new `postgresql` resource with a new UID but the same name.
* The new cluster will bootstrap from the latest base backup prior to the given `timestamp` and replay WAL files to recover to the specified point in time.

The process is asynchronous. You can monitor the operator logs and the state of the `postgresql` resource to follow the progress. Once the new cluster is up and running, your applications can reconnect.

## Setting up a standby cluster

Standby cluster is a [Patroni feature](https://github.com/zalando/patroni/blob/master/docs/replica_bootstrap.rst#standby-cluster)
@@ -1302,3 +1341,4 @@ As of now, the operator does not sync the pooler deployment automatically
which means that changes in the pod template are not caught. You need to
toggle `enableConnectionPooler` to set environment variables, volumes, secret
mounts and securityContext required for TLS support in the pooler pod.

3 changes: 3 additions & 0 deletions manifests/operatorconfiguration.crd.yaml
@@ -120,6 +120,9 @@ spec:
repair_period:
  type: string
  default: "5m"
pitr_backup_retention:
  type: string
  default: "168h"
set_memory_request_to_limit:
  type: boolean
  default: false
1 change: 1 addition & 0 deletions manifests/postgresql-operator-default-configuration.yaml
@@ -23,6 +23,7 @@ configuration:
  min_instances: -1
  resync_period: 30m
  repair_period: 5m
  pitr_backup_retention: 168h
  # set_memory_request_to_limit: false
  # sidecars:
  #   - image: image:123
5 changes: 3 additions & 2 deletions pkg/apis/acid.zalan.do/v1/operator_configuration_type.go
@@ -3,10 +3,10 @@ package v1
// Operator configuration CRD definition, please use snake_case for field names.

import (
	"time"

	"github.com/zalando/postgres-operator/pkg/util/config"

	"github.com/zalando/postgres-operator/pkg/spec"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -267,6 +267,7 @@ type OperatorConfigurationData struct {
	ResyncPeriod            Duration            `json:"resync_period,omitempty"`
	RepairPeriod            Duration            `json:"repair_period,omitempty"`
	MaintenanceWindows      []MaintenanceWindow `json:"maintenance_windows,omitempty"`
	PitrBackupRetention     Duration            `json:"pitr_backup_retention,omitempty"`
	SetMemoryRequestToLimit bool                `json:"set_memory_request_to_limit,omitempty"`
	ShmVolume               *bool               `json:"enable_shm_volume,omitempty"`
	SidecarImages           map[string]string   `json:"sidecar_docker_images,omitempty"` // deprecated in favour of SidecarContainers