From 3dfbf00dcd7971f551ce857c07a77541eb9f6f18 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 20:35:20 +1200 Subject: [PATCH 01/11] plan: replica redaction via dbt-masking manifests --- docs/plans/replica-redaction.md | 315 ++++++++++++++++++++++++++++++++ 1 file changed, 315 insertions(+) create mode 100644 docs/plans/replica-redaction.md diff --git a/docs/plans/replica-redaction.md b/docs/plans/replica-redaction.md new file mode 100644 index 0000000..de83a37 --- /dev/null +++ b/docs/plans/replica-redaction.md @@ -0,0 +1,315 @@ +# Replica redaction via dbt-masking manifests + +## Context + +In `~/code/work/bestool`, the `tamanu psql` command loads a per-version "redaction method" that flags which columns in a Tamanu database hold sensitive data. Today it is **display-only**: bestool fetches a dbt manifest from `https://docs.data.bes.au/tamanu/v{version}/manifest.json`, parses out `source..columns..config.meta.masking`, and renders matching cells as `"***"` in psql output. The database itself is never modified. + +We want PGRO to be able to produce a **redacted replica** — i.e. a restore whose underlying data has actually been anonymised, so any consumer (analytics tools, sandboxes, dev environments) connecting to that replica's Service sees masked data regardless of how they query it. + +The Tamanu-on-bes.au setup is one consumer, but the operator should be **generic over the source of the manifest**: any user who publishes a dbt manifest with the same `meta.masking` annotation schema can point a replica at it. PGRO knows about *the manifest schema*, not about Tamanu, not about bes.au, not about `local_system_facts`. The Tamanu deployment just becomes a particular configuration of the generic feature. + +Approach (per user decisions in planning): + +- Use the **`postgresql_anonymizer`** Postgres extension to do the masking. 
The manifest provides the *list* of columns to mask, not *how* to mask them; `anon` provides the masking functions (`anon.fake_email()`, `anon.partial(...)`, `anon.random_*`, etc.). +- Drive the redaction step **from inside the operator**, not via a Job. We have `tokio-postgres` and `reqwest` already; the work is fetch-manifest + run-some-SQL, and AGENTS.md prefers operator-driven over scripted-in-Job. +- After redaction completes, **re-enable read-only** so the analytics user can't write to a redacted replica. + +The redaction step plugs into the existing replica lifecycle alongside `persistent_schemas`: restore reaches Ready → redact → schema-migrate → switchover. + +## Manifest schema (the contract) + +PGRO consumes a dbt-shaped JSON document at an HTTP URL. The minimum shape it relies on: + +```json +{ + "sources": { + "": { + "schema": "", + "name": "", + "columns": { + "": { + "config": { "meta": { "masking": } } + } + } + } + } +} +``` + +Each source must carry explicit `schema` and `name` (the Tamanu dbt manifest always emits them — 163/163 sources in the v2.54.3 manifest). Sources missing either field are skipped with a warning. Source keys are otherwise opaque to PGRO. + +### `meta.masking` — canonical contract + +The Tamanu masking spec is documented at https://github.com/beyondessential/tamanu/tree/main/database#masking. It is deliberately implementation-agnostic — descriptions are vague about exact behaviour so implementations (bestool's display-only one, this DB-side one, future others) can vary. The bits PGRO has to honour: + +- **Short and extended form are equivalent.** `masking: name` ≡ `masking: { kind: name }`. The extended form *must* carry `kind`; it may carry additional parameters (currently only `range`). +- **Nulls are preserved.** When a column value is `NULL` it must stay `NULL` after masking. +- **Two locations.** Column-level masks live under `sources..columns..config.meta.masking`. 
Table-level masks (currently only `truncate`) live under `sources..config.meta.masking` (and `sources..meta.masking` mirrors it). PGRO reads both paths. + +#### Canonical kinds (full list per the docs) + +| kind | scope | behaviour | proposed implementation | +|---|---|---|---| +| `truncate` | **table** | Empty the entire table. | `TRUNCATE TABLE schema.table` as superuser, before the column-mask pass. Not a `SECURITY LABEL`. | +| `date` | column | Anonymise across different dates. Works on `date`/`timestamp(tz)` and on text representations like `character(10)`. | `MASKED WITH FUNCTION anon.random_date()` (or `random_date_between` if we want bounded). Wrap in `CASE WHEN IS NULL THEN NULL ELSE … END` to preserve nulls. | +| `datetime` | column | Anonymise the time-of-day while preserving the date component. Works on `timestamp(tz)` and on text representations like `character(19)`. | Compose: keep `date_trunc('day', )` and add a random interval of seconds. Specifically `date_trunc('day', ) + (floor(random() * 86400) || ' seconds')::interval` (cast as needed for text columns). Null-preserved via CASE. | +| `text` | column | Random words/sentences, approximately the same length as the original. | `anon.lorem_ipsum(characters := length())` (or `words` derived from `length()/6`). | +| `string` | column | Random printable ASCII, no spaces, approximately the same length as the original. | `anon.random_string(length())` if the function accepts a dynamic length, else fall back to a fixed length and accept the deviation. | +| `email`, `name`, `phone`, `place`, `url` | column | Fake data of the indicated shape. | `anon.fake_email()` / `anon.fake_first_name()` / a `partial(, 2, '****', 2)`-style call / `anon.fake_city()` / a constructed URL respectively. For `name`, the docs ask us to inspect whether the original contains a space and use full vs single name — implement with `CASE WHEN LIKE '% %' THEN anon.fake_name() ELSE anon.fake_first_name() END`. 
| +| `zero` | column | Keep the data length identical but replace with zeroes. Primary use: `bytea`. | Type-dispatched (see "Type-aware planning" below). For `bytea`: `repeat(E'\\x00'::bytea, length())`. For text types: `repeat('0', length())`. For numeric types: `0`. | +| `empty` | column | Delete the value without nulling: `0` for numbers, `''` for strings, `{}` for json(b), `[]` for arrays, etc. | Type-dispatched. The redaction module looks up each masked column's `data_type` in `information_schema.columns` and emits the appropriate `MASKED WITH VALUE …`. | +| `nil` | column | Null the field. The docs note it only applies to nullable columns. | `MASKED WITH VALUE NULL`. Operator skips columns where `is_nullable = 'NO'` and records the skip as a tolerated error. | +| `default` | column | Set the column to its `DEFAULT` value. The docs note it only applies to columns that have a default. | At planning time, look up `pg_get_expr(adbin, adrelid)` from `pg_attrdef` for the column. If present, emit `MASKED WITH VALUE `. If absent, tolerated error. | +| `integer` | column | Random integer. Optional `range: "L-H"` constrains the output. | `floor(random() * (H - L + 1) + L)::int` (or `anon.random_int_between(L, H)` if available). Default range `int4` if unspecified. | +| `float` | column | Random float. Optional `range: "L-H"` constrains. | `(random() * (H - L) + L)::numeric` (or `anon.random_in_numrange('[L,H]'::numrange)`). Default unbounded if unspecified. | +| `money` | column | Like `float`/`integer`, but the value is generated as a float with two decimals for `numeric` columns. | Same as `float`, then `round(, 2)`. | + +#### Range parameter parsing + +`range: "L-H"` is two numbers joined by a hyphen. Parse by splitting on the **last** `-` so floats like `1.001-1.03` decompose correctly. Both halves must parse as `f64`; parse failures are tolerated errors → fall back to the unbounded variant. 
Per the docs example (`kind: integer, range: 0-10.5`), the operator accepts a decimal range for `kind: integer` and rounds. + +#### Type-aware planning + +For `zero`, `empty`, and `default`, the operator can't decide the right `MASKED WITH VALUE` (or function) without knowing the column type / default. So before issuing any `SECURITY LABEL`, the redaction module runs a single batch query against `information_schema.columns` (and `pg_attrdef` for `default`) to resolve `data_type` and `column_default` for every (schema, table, column) tuple it's about to mask. Columns not present in the DB are dropped (tolerated error). The mapping decisions for those three kinds use this metadata. + +#### Unknown kinds + +If a future manifest version introduces a kind PGRO doesn't recognise (the spec is open-ended), the affected columns are dropped with a tolerated error and the run is reported as `partial`. Adding a new kind is a code change. + +## High-level design + +The design adds a new optional `redaction` field on `PostgresPhysicalReplicaSpec`. When set, after a new restore reaches the `Ready` phase the operator runs the redaction step before the restore is eligible for switchover. The step is tracked in status as `redactionPhase` (`pending` → `active` → `complete` / `partial` / `failed`), mirroring how `schemaMigrationPhase` works today. + +Order during a switchover cycle (new restore N replacing active A): +1. N restored from snapshot → `Ready`. +2. **Redaction** runs against N. While running, `default_transaction_read_only` is off on N (we need writes). On success the operator sets it back on via `ALTER DATABASE ... SET default_transaction_read_only = on`. +3. `persistent_schemas` migration A→N (existing behaviour). The schema migration job already runs against N as superuser, so read-only at the DB-default level doesn't block it (`default_transaction_read_only` is a per-session default, not a hard lock, and a session can override it with `SET default_transaction_read_only = off`). +4. Switchover Service → N, grace period on A, sweep. 
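
The ordering above can be read as a small decision function. The following is only a sketch of that reading, not code from the operator: `RedactionPhase`, `NextStep`, and `next_step` are hypothetical names introduced for illustration, and the real reconciler tracks more state than these four inputs.

```rust
/// Phases listed in the plan; `Failed` carries the failure message.
/// (Hypothetical type: the plan stores the phase as a string in status.)
#[derive(Debug, Clone, PartialEq)]
pub enum RedactionPhase {
    Pending,
    Active,
    Complete,
    Partial,
    Failed(String),
}

/// What the operator would do next for a new restore N (illustrative).
#[derive(Debug, PartialEq)]
pub enum NextStep {
    WaitForReady,
    Redact,
    MigrateSchemas,
    Switchover,
}

/// Pick the next step following the plan's order:
/// Ready, then redaction, then schema migration, then switchover.
pub fn next_step(
    restore_ready: bool,
    redaction_configured: bool,
    phase: Option<&RedactionPhase>,
    migration_done: bool,
) -> NextStep {
    if !restore_ready {
        return NextStep::WaitForReady;
    }
    // Only `complete` and `partial` count as settled; anything else
    // (unset, pending, active, failed) means redaction must still run.
    let settled = matches!(
        phase,
        Some(RedactionPhase::Complete | RedactionPhase::Partial)
    );
    if redaction_configured && !settled {
        return NextStep::Redact;
    }
    if !migration_done {
        return NextStep::MigrateSchemas;
    }
    NextStep::Switchover
}
```

Treating `partial` as settled matches the plan's statement that a run with tolerated errors still allows switchover; a stricter deployment might prefer to gate on `complete` only.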
+ +Redaction runs *before* schema migration so that dbt-style views in persistent schemas can be regenerated against already-redacted source tables on the next dbt run. + +## CRD changes + +`src/types/replica.rs` — add to `PostgresPhysicalReplicaSpec`: + +```rust +/// If set, apply a redaction manifest to the restored data before the +/// replica becomes eligible for switchover. Requires Postgres 18+ and +/// the postgresql_anonymizer extension (loaded via image-volume mount, +/// see plan section "Postgres version gate and extension loading"). +#[serde(default, skip_serializing_if = "Option::is_none")] +pub redaction: Option, +``` + +```rust +#[derive(Debug, Clone, Deserialize, Serialize, JsonSchema)] +#[serde(rename_all = "camelCase")] +pub struct RedactionSpec { + /// HTTP(S) URL of the dbt-style masking manifest. May contain a + /// `{version}` placeholder, in which case `version` or + /// `versionQuery` must be set. + pub manifest_url: String, + + /// Pinned version to substitute into `{version}`. Mutually exclusive + /// with `versionQuery`. + #[serde(default, skip_serializing_if = "Option::is_none")] + pub version: Option, + + /// SQL query that returns a single text column with the version + /// string. Run against the restore's main database as the operator's + /// superuser. Mutually exclusive with `version`. + /// + /// Example (Tamanu): `SELECT value FROM local_system_facts WHERE key = 'currentVersion'` + #[serde(default, skip_serializing_if = "Option::is_none")] + pub version_query: Option, + + /// If the manifest URL with the discovered/pinned version 404s, + /// retry with the major.minor.0 base version. Useful when manifests + /// are only published for minor releases. Defaults to false. + #[serde(default)] + pub version_fallback_to_base: bool, + + /// Override the OCI image used as the source of the + /// postgresql_anonymizer extension files (mounted as an image + /// volume on the restore Pod). 
Defaults to + /// `registry.gitlab.com/dalibo/postgresql_anon:latest`. + #[serde(default, skip_serializing_if = "Option::is_none")] + pub extension_image: Option, +} +``` + +Validation: if `manifestUrl` contains `{version}`, exactly one of `version` or `versionQuery` must be set; otherwise both must be unset. The operator rejects malformed spec at reconcile time (no admission webhook today). + +Status additions to `PostgresPhysicalReplicaStatus`: + +```rust +/// Phase of redaction: pending, active, complete, partial, failed. +#[serde(default, skip_serializing_if = "Option::is_none")] +pub redaction_phase: Option, + +/// Resolved manifest version used in the last redaction run. +#[serde(default, skip_serializing_if = "Option::is_none")] +pub redaction_version: Option, + +/// Number of columns redacted in the last run. +#[serde(default, skip_serializing_if = "Option::is_none")] +pub redaction_columns_applied: Option, +``` + +Update the readme CRD tables (AGENTS.md exception explicitly allows this). + +### Example (Tamanu) + +```yaml +spec: + redaction: + manifestUrl: "https://docs.data.bes.au/tamanu/v{version}/manifest.json" + versionQuery: "SELECT value FROM local_system_facts WHERE key = 'currentVersion'" + versionFallbackToBase: true +``` + +### Example (pinned) + +```yaml +spec: + redaction: + manifestUrl: "https://example.com/redactions/manifest.json" +``` + +## New module: `src/controllers/replica/redaction.rs` + +Single module exposing: + +```rust +pub async fn reconcile_redaction( + ctx: &Context, + replica: &PostgresPhysicalReplica, + restore_name: &str, +) -> Result; +``` + +Internals: + +1. **Resolve version**. + - If `spec.redaction.version` is set, use it. + - Else if `spec.redaction.versionQuery` is set, connect to the restore as superuser and run the query. Expect a single row with a single text column; error clearly on shape mismatch. + - Else if `manifestUrl` contains `{version}`, fail at validation (covered in CRD validation above). 
+ - Else (no `{version}` in URL), no version is needed. + +2. **Fetch the manifest** via `ctx.http_client` (existing `reqwest::Client`). No caching: redaction runs at most once per restore (i.e. once per scheduled cycle, on the order of hours), so the bandwidth/time saved by caching is negligible against the risk of invalidation bugs. If `versionFallbackToBase` is true and the first fetch is a 404, retry with `{major}.{minor}.0` derived from a `MAJOR.MINOR.PATCH`-shaped version (mirroring bestool's `get_base_version`); silently no-op the fallback for versions that don't match that shape. + +3. **Parse the manifest** with `serde_json::Value`. Read each source's `schema` and `name`. Then collect masks: + - **Table-level**: if `sources..config.meta.masking` (or `meta.masking` as fallback) is `truncate`, add a `TableMask::Truncate` for that source. + - **Column-level**: iterate `sources..columns.*` and collect those with a non-null `config.meta.masking`. Normalise short-form (`"name"`) to extended (`{ "kind": "name" }`). Return: + + ```rust + struct ColumnMask { + schema: String, + table: String, + column: String, + kind: String, // "email", "date", "integer", ... + range: Option<(f64, f64)>, // parsed from "L-H", last-dash split + } + + enum TableMask { Truncate { schema: String, table: String } } + ``` + + Unknown kinds and malformed ranges are kept (carrying their kind string) and rejected later, not at parse time, so the operator can count them as tolerated errors with useful context. + +4. **Type-aware planning**. Open the superuser connection (used in step 5). For every `ColumnMask`, look up `data_type`, `is_nullable`, and `column_default` via a single batch query against `information_schema.columns` LEFT JOIN `pg_attrdef`. Drop entries where the column doesn't exist (tolerated error). 
The kinds `zero`, `empty`, and `default` use this metadata to choose their `MASKED WITH VALUE …` / function expression; `nil` uses `is_nullable` to skip non-nullable columns; everything else ignores the type. + +5. **Resolve each `ColumnMask` → SECURITY LABEL fragment** per the canonical table. The function returns either: + - `MASKED WITH VALUE ` for kinds that resolve to a constant or pg-default expression (`nil`, `empty`, `default`, `zero`-on-numeric), or + - `MASKED WITH FUNCTION ` for kinds that need a function call (most fakes, `text`, `string`, `datetime`-arithmetic, `integer`/`float`/`money`). + For column kinds where "nulls preserved" matters (most of them; not `nil`/`empty`/`default` which intentionally overwrite nulls per the docs), wrap the expression in `CASE WHEN IS NULL THEN NULL ELSE END`. + +6. **Apply masking via the extension**: + - Open a tokio-postgres connection to the restore as superuser, against the main user database (use `postgres::discover_restore_database`). + - `CREATE EXTENSION IF NOT EXISTS anon CASCADE;` (pulls in pgcrypto). + - `SELECT anon.init();` (loads the fake-data tables; idempotent). + - For each `TableMask::Truncate`: `TRUNCATE TABLE {quote_ident(schema)}.{quote_ident(table)};` Tolerated errors counted. + - For each `ColumnMask` (after type-aware planning): emit the `SECURITY LABEL FOR anon ON COLUMN …` statement. Tolerated errors counted. + - `SELECT anon.anonymize_database();` — destructive in-place rewrite of all labelled columns. + - Leave the extension installed (don't `DROP EXTENSION`). The SECURITY LABELs and the fake-data tables stay around (~7 MB) so analytics consumers can call anon functions themselves if useful. Dynamic masking (`MASKED ROLE`) is **not** enabled — that would need `shared_preload_libraries = 'anon'` and a role-grant step, and is out of scope. + +7. **Re-enable read-only**: + - `ALTER DATABASE {quote_ident(dbname)} SET default_transaction_read_only = on;` — applies to all new sessions on that DB. 
+ - If the wider `spec.read_only` is true and `persistent_schemas` is also set, the existing override (line 686 of `restore/builders.rs`) forces postgresql.conf to `off` cluster-wide; the ALTER DATABASE-level setting still takes effect for non-superuser sessions because it's applied last in the GUC resolution order. The schema-migration Job runs as superuser, so it isn't blocked. Verify this assumption during implementation; if it doesn't hold, fall back to issuing `SELECT pg_reload_conf();` after rewriting postgresql.conf inline. + +Returns `RedactionOutcome { version: Option, columns_attempted, columns_failed }`. + +Errors propagate; the caller writes the status patch. + +## Reconciler wiring + +`src/controllers/replica.rs`: + +- After a new restore reaches `Ready`, before the switchover branch: + - Check the restore's PG version (already populated in `status.postgresVersion`). If `spec.redaction.is_some()` and the version is < 18, set phase `"failed: redaction requires PostgreSQL 18+"` and skip switchover. + - If `spec.redaction.is_some()` and `status.redaction_phase != Some("complete")` and `!= Some("partial")`: + - Set phase `"active"` in status. + - Call `redaction::reconcile_redaction(ctx, replica, &new_restore_name)`. + - On Ok: set phase `"complete"` or `"partial"` (depending on per-statement error count) and store `redaction_version` + `redaction_columns_applied`. + - On Err: set phase `"failed: {msg}"` and return early so the reconciler retries. + - The new restore is not eligible to become the switchover target until phase is `complete` or `partial`. +- Schema migration (`reconcile_schema_migration`) gates on redaction completing first — extend its early-return check so it doesn't kick off until `redaction_phase` is settled (when redaction is configured). +- On every new restore created by the schedule, `redaction_phase` resets to `None` (so the next restore re-runs redaction). 
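
The wiring rules in this section (version gate, phase derived from the outcome, switchover eligibility) can be sketched as three small pure functions. This is an illustration under assumptions: `Outcome` here is a hypothetical stand-in with the `columns_attempted` / `columns_failed` counters the plan mentions, and the function names are not taken from the operator.

```rust
/// Hypothetical stand-in for the redaction outcome counters.
pub struct Outcome {
    pub columns_attempted: u32,
    pub columns_failed: u32,
}

/// Version gate: returns the failure phase string for PG < 18,
/// or `None` when redaction may proceed.
pub fn version_gate(pg_major: u32) -> Option<String> {
    if pg_major < 18 {
        Some("failed: redaction requires PostgreSQL 18+".to_string())
    } else {
        None
    }
}

/// Map the result of a redaction run to the status phase string:
/// no per-statement failures is `complete`, some failures is `partial`,
/// and a hard error becomes `failed: {msg}`.
pub fn phase_from_result(result: Result<Outcome, String>) -> String {
    match result {
        Ok(outcome) if outcome.columns_failed == 0 => "complete".to_string(),
        Ok(_) => "partial".to_string(),
        Err(msg) => format!("failed: {msg}"),
    }
}

/// A restore may become the switchover target only once the phase
/// has settled as `complete` or `partial`.
pub fn eligible_for_switchover(phase: Option<&str>) -> bool {
    matches!(phase, Some("complete") | Some("partial"))
}
```

Keeping these decisions free of I/O would make them straightforward to cover in unit tests, independent of the Kubernetes and Postgres connections the reconciler needs.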
+ +## Postgres version gate and extension loading + +Redaction is **PG 18+ only**. PG 18 introduces the `extension_control_path` GUC, which, together with the long-standing `dynamic_library_path`, lets us mount the extension files via a Kubernetes image volume instead of having to ship a custom Postgres image with anon pre-baked. + +Two enforcement points: + +1. **CRD-level rejection at reconcile time**: when `spec.redaction` is set and the restore's `status.postgresVersion` resolves to anything < 18 (or the cluster's discovered PG version from the snapshot is < 18), set `status.redactionPhase = "failed: redaction requires PostgreSQL 18+"` and refuse the switchover. Don't try to silently bump versions or fall back. + +2. **Extension availability**: when redaction is configured, the operator mounts the postgresql_anonymizer extension files into the restore Pod via a Kubernetes [image volume](https://kubernetes.io/docs/concepts/storage/volumes/#image) (Kubernetes 1.34 is in use, so the feature is GA-available). The restore Pod builder: + + - Adds a volume `image: ` mounted at `/extensions/anon`. Default image: `registry.gitlab.com/dalibo/postgresql_anon:latest`. + - Appends to the generated postgresql.conf: + ``` + extension_control_path = '$system:/extensions/anon/share' + dynamic_library_path = '$libdir:/extensions/anon/lib' + ``` + `extension_control_path` was introduced in PG 18, which is why redaction is gated to PG 18+. + +The redaction reconciler then runs `CREATE EXTENSION anon CASCADE` once Postgres is up; the extension files are already on disk thanks to the volume mount. + +## Tests + +- Unit tests in `redaction.rs`: + - `parse_manifest` round-trip: short-form (`"name"`) and extended-form (`{"kind":"integer","range":"20-50"}`) both normalised to the same `ColumnMask`; table-level `truncate` recognised; missing-schema/missing-name source skipped; unknown kind preserved verbatim. + - `base_version` fallback derivation (mirror bestool's `get_base_version` cases). 
+ - `range` parsing: last-dash split handles `1.001-1.03`, integer rounding for `0-10.5`, parse failures fall back to unbounded. + - Fragment building for each canonical kind, including the type-dispatched ones (`zero` / `empty` / `default` against fixture `data_type` and `column_default` lookups), the `name` space-detection CASE, and the null-preserving CASE wrappers. +- Integration test under `tests/`: **deferred to a follow-up**. The existing kopia-repo fixture snapshots PG 16; redaction requires PG 18+. Landing the test means adding a `setup-kopia-repo-pg18.yaml` fixture, a sample-manifest HTTP server fixture, pre-pulling `postgres:18` and the anon extension image onto the kind node, and a new matrix entry in `.github/workflows/integration.yml`. Each step is straightforward but the combined surface is large enough to be its own change. + +## Files to touch + +| File | Change | +|---|---| +| `src/types/replica.rs` | Add `redaction` to spec (`RedactionSpec` struct); `redaction_phase`/`redaction_version`/`redaction_columns_applied` to status. | +| `src/controllers/replica/redaction.rs` (new) | Whole module: manifest fetch/parse, mask-instruction parsing + registry, SQL application, PG-18 version check. | +| `src/controllers/replica.rs` | Wire `reconcile_redaction` into the post-Ready, pre-switchover branch; gate switchover on redaction phase; reset phase on new restore. PG-version gate. | +| `src/controllers/replica/schema_migration.rs` | Make `reconcile_schema_migration` wait for redaction-complete when redaction is configured. | +| `src/controllers/restore/builders.rs` | When `spec.redaction.is_some()`, inject the postgresql_anonymizer image volume into the restore Pod and append `extension_control_path` / `dynamic_library_path` to postgresql.conf. Refuse to build the restore Pod with PG < 18 when redaction is set. | +| `src/controllers/postgres.rs` | No change: `connect_to_restore`, `discover_restore_database`, `quote_ident` already cover what we need. 
| +| `src/context.rs` | No change. | +| `Cargo.toml` | No new deps (uses existing `tokio-postgres`, `reqwest`, `serde_json`). | +| `README.md` | Update CRD tables only (AGENTS.md explicit exception). | +| `.github/workflows/integration.yml` | Add matrix entry for the new integration test file. | + +## Verification + +- `cargo clippy` and `cargo fmt` clean per AGENTS.md. +- Unit tests pass. +- End-to-end against a real cluster: create a `PostgresPhysicalReplica` with the Tamanu `redaction:` example above, observe the new restore reaches `Ready`, then `redactionPhase` transitions `active` → `complete`, then schema migration runs, then switchover. Connect as the analytics user and confirm: + - The flagged columns return masked values. + - `SELECT setting FROM pg_settings WHERE name = 'default_transaction_read_only'` returns `on` for a fresh analytics session. + - An attempted `INSERT` as analytics user is rejected. + +## Open items / follow-ups + +- **Default extension image tag**: the plan defaults `extensionImage` to `registry.gitlab.com/dalibo/postgresql_anon:latest`. `:latest` is brittle; users who want stability should pin a tag in their spec. Worth revisiting the default to a pinned digest once we know which dalibo build works against the Tamanu manifest in practice. +- **New canonical kinds**: the Tamanu masking spec is intentionally open-ended. If a future manifest introduces a new kind, redaction will report it as a tolerated error and complete as `partial`; adding support is a code change. +- **anon function names**: the SQL fragments above use plausible-but-not-verified names from postgresql_anonymizer. During implementation, validate each against the dalibo docs (or `\df anon.*` in an installed instance) and adjust. Particular ones to double-check: `random_in_int4range` vs `random_int_between`, whether `random_string` accepts a dynamic length, whether `lorem_ipsum` has a `characters :=` parameter, whether `random_date_between` exists. 
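
Two of the helper rules this plan describes, the last-dash split for `range` and the `MAJOR.MINOR.0` fallback, are small enough to sketch directly. The code below is a minimal illustration under assumptions, not the module's implementation: the function names and signatures are chosen for the example, and the fallback's shape check (exactly three all-digit components) is a guess at what "`MAJOR.MINOR.PATCH`-shaped" means rather than a port of bestool's `get_base_version`.

```rust
/// Parse a `range: "L-H"` value by splitting on the LAST hyphen, so
/// that decimals such as "1.001-1.03" decompose correctly.
/// Returns `None` on any parse failure; per the plan, the caller would
/// then fall back to the unbounded variant and count a tolerated error.
pub fn parse_range(raw: &str) -> Option<(f64, f64)> {
    let (low, high) = raw.rsplit_once('-')?;
    let low: f64 = low.trim().parse().ok()?;
    let high: f64 = high.trim().parse().ok()?;
    Some((low, high))
}

/// Derive "MAJOR.MINOR.0" from a "MAJOR.MINOR.PATCH" version.
/// Anything that is not three dot-separated, all-digit components
/// yields `None`, which makes the fallback a silent no-op.
pub fn base_version(version: &str) -> Option<String> {
    let parts: Vec<&str> = version.split('.').collect();
    if parts.len() != 3 {
        return None;
    }
    let all_numeric = parts
        .iter()
        .all(|part| !part.is_empty() && part.bytes().all(|b| b.is_ascii_digit()));
    if !all_numeric {
        return None;
    }
    Some(format!("{}.{}.0", parts[0], parts[1]))
}
```

One consequence of the last-dash rule worth noting: a negative lower bound such as `-5-10` parses, but a negative upper bound such as `5--1` does not, because the split leaves a trailing hyphen on the left half.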
From 8de2a4715795b781cd0a13ad2a8f68222b675db3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 20:37:09 +1200 Subject: [PATCH 02/11] feat(replica): add redaction CRD spec and status fields MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Introduce `spec.redaction` on PostgresPhysicalReplica with manifest URL, version discovery (literal or SQL query), base-version fallback, and extension image override. Add status fields tracking redaction phase, resolved version, and column count. No behaviour yet — wiring follows in subsequent commits. --- src/controllers/replica/scheduling.rs | 1 + src/controllers/replica/schema_migration.rs | 1 + src/controllers/restore/tests.rs | 2 + src/types/replica.rs | 57 +++++++++++++++++++++ tests/helpers.rs | 1 + 5 files changed, 62 insertions(+) diff --git a/src/controllers/replica/scheduling.rs b/src/controllers/replica/scheduling.rs index db0beb1..9bb82ca 100644 --- a/src/controllers/replica/scheduling.rs +++ b/src/controllers/replica/scheduling.rs @@ -188,6 +188,7 @@ mod tests { ), persistent_schemas: None, + redaction: None, }, status: Some(PostgresPhysicalReplicaStatus { next_scheduled_restore: next_scheduled.map(Time), diff --git a/src/controllers/replica/schema_migration.rs b/src/controllers/replica/schema_migration.rs index a0b4f87..568852f 100644 --- a/src/controllers/replica/schema_migration.rs +++ b/src/controllers/replica/schema_migration.rs @@ -285,6 +285,7 @@ mod tests { ), persistent_schemas: Some(schemas.into_iter().map(String::from).collect()), + redaction: None, }, status: None, } diff --git a/src/controllers/restore/tests.rs b/src/controllers/restore/tests.rs index a710b60..36365a6 100644 --- a/src/controllers/restore/tests.rs +++ b/src/controllers/restore/tests.rs @@ -39,6 +39,7 @@ fn deployment_uses_affinity_not_node_selector() { notifications: vec![], persistent_schemas: None, + redaction: None, storage_size_maximum: 
Quantity("2Ti".to_string()), }, ); @@ -123,6 +124,7 @@ fn test_restore_and_replica() -> (PostgresPhysicalRestore, PostgresPhysicalRepli notifications: vec![], persistent_schemas: None, + redaction: None, storage_size_maximum: Quantity("2Ti".to_string()), }, ); diff --git a/src/types/replica.rs b/src/types/replica.rs index 3886a6e..eefc46d 100644 --- a/src/types/replica.rs +++ b/src/types/replica.rs @@ -101,8 +101,52 @@ pub struct PostgresPhysicalReplicaSpec { /// computed size exceeds this limit. Defaults to 2Ti. #[serde(default = "default_storage_size_maximum")] pub storage_size_maximum: Quantity, + + /// If set, apply a redaction manifest to the restored data before the + /// replica becomes eligible for switchover. Requires Postgres 18+ and + /// the postgresql_anonymizer extension (loaded via image-volume mount + /// on the restore Pod). + #[serde(default, skip_serializing_if = "Option::is_none")] + pub redaction: Option, } +#[derive(Debug, Clone, Deserialize, Serialize, JsonSchema)] +#[serde(rename_all = "camelCase")] +pub struct RedactionSpec { + /// HTTP(S) URL of the dbt-style masking manifest. May contain a + /// `{version}` placeholder, in which case `version` or `versionQuery` + /// must be set. + pub manifest_url: String, + + /// Pinned version to substitute into `{version}`. Mutually exclusive + /// with `versionQuery`. + #[serde(default, skip_serializing_if = "Option::is_none")] + pub version: Option, + + /// SQL query that returns a single text column with the version string. + /// Run against the restore's main database as the operator's superuser. + /// Mutually exclusive with `version`. + /// + /// Example (Tamanu): + /// `SELECT value FROM local_system_facts WHERE key = 'currentVersion'` + #[serde(default, skip_serializing_if = "Option::is_none")] + pub version_query: Option, + + /// If the manifest URL with the discovered/pinned version 404s, retry + /// with the major.minor.0 base version. 
+ #[serde(default)] + pub version_fallback_to_base: bool, + + /// Override the OCI image used as the source of the + /// postgresql_anonymizer extension files (mounted as an image volume + /// on the restore Pod). Defaults to + /// `registry.gitlab.com/dalibo/postgresql_anon:latest`. + #[serde(default, skip_serializing_if = "Option::is_none")] + pub extension_image: Option, +} + +pub const DEFAULT_ANON_IMAGE: &str = "registry.gitlab.com/dalibo/postgresql_anon:latest"; + fn default_storage_size_maximum() -> Quantity { Quantity("2Ti".to_string()) } @@ -296,6 +340,19 @@ pub struct PostgresPhysicalReplicaStatus { /// cleared (e.g. by a spec change or manual intervention). #[serde(default, skip_serializing_if = "Option::is_none")] pub consecutive_restore_failures: Option, + + /// Phase of redaction for the current restore: + /// pending, active, complete, partial, failed. + #[serde(default, skip_serializing_if = "Option::is_none")] + pub redaction_phase: Option, + + /// Resolved manifest version used by the last redaction run. + #[serde(default, skip_serializing_if = "Option::is_none")] + pub redaction_version: Option, + + /// Number of columns redacted in the last run. 
+ #[serde(default, skip_serializing_if = "Option::is_none")] + pub redaction_columns_applied: Option, } #[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, PartialEq)] diff --git a/tests/helpers.rs b/tests/helpers.rs index 2c919cb..b5f680e 100644 --- a/tests/helpers.rs +++ b/tests/helpers.rs @@ -159,6 +159,7 @@ pub fn build_replica(name: &str, secret_ref: &str, opts: ReplicaOpts) -> Postgre postgres_extra_config: None, notifications: vec![], persistent_schemas: None, + redaction: None, storage_size_maximum: Quantity("2Ti".to_string()), }, ) From 7eee0ccf460f2b730298a2ea4043de4f1429212d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 20:41:07 +1200 Subject: [PATCH 03/11] feat(replica/redaction): add manifest parser, mask registry, and apply layer Introduces the redaction module: - manifest.rs: parses Tamanu/dbt manifests into ColumnMask / TableMask - mask.rs: 13-kind registry mapping each canonical mask to a SECURITY LABEL fragment for postgresql_anonymizer; type-dispatched zero/empty/ default/nil; null-preserving CASE wrappers - apply.rs: applies the parsed manifest against a live restore DB (CREATE EXTENSION, TRUNCATE for table-level, SECURITY LABEL per column, anon.anonymize_database, ALTER DATABASE SET read-only) - redaction.rs: orchestrates fetch -> parse -> apply with version resolution (literal/SQL query) and base-version fallback Reconciler wiring follows in a subsequent commit. 
--- src/controllers/replica.rs | 1 + src/controllers/replica/redaction.rs | 251 ++++++++++++ src/controllers/replica/redaction/apply.rs | 252 ++++++++++++ src/controllers/replica/redaction/manifest.rs | 291 ++++++++++++++ src/controllers/replica/redaction/mask.rs | 360 ++++++++++++++++++ src/error.rs | 3 + 6 files changed, 1158 insertions(+) create mode 100644 src/controllers/replica/redaction.rs create mode 100644 src/controllers/replica/redaction/apply.rs create mode 100644 src/controllers/replica/redaction/manifest.rs create mode 100644 src/controllers/replica/redaction/mask.rs diff --git a/src/controllers/replica.rs b/src/controllers/replica.rs index 8bf9a8c..21254b8 100644 --- a/src/controllers/replica.rs +++ b/src/controllers/replica.rs @@ -36,6 +36,7 @@ use crate::{ }; use scheduling::ScheduleDecision; +mod redaction; mod resources; mod scheduling; mod schema_migration; diff --git a/src/controllers/replica/redaction.rs b/src/controllers/replica/redaction.rs new file mode 100644 index 0000000..3bf7079 --- /dev/null +++ b/src/controllers/replica/redaction.rs @@ -0,0 +1,251 @@ +//! Replica redaction: fetch a Tamanu/dbt masking manifest and apply it to +//! a freshly-restored Postgres database using the `postgresql_anonymizer` +//! extension. +//! +//! See `docs/plans/replica-redaction.md` for the full design. + +use k8s_openapi::api::core::v1::Secret; +use kube::{Api, ResourceExt as _}; +use tracing::{debug, info, warn}; + +use crate::context::Context; +use crate::controllers::postgres::{ + self, PgConnection, discover_restore_database, read_secret_field, +}; +use crate::error::{Error, Result}; +use crate::types::{PostgresPhysicalReplica, RedactionSpec}; + +use self::manifest::{Manifest, base_version, parse_manifest}; + +pub use self::apply::Outcome; + +mod apply; +pub mod manifest; +pub mod mask; + +const VERSION_PLACEHOLDER: &str = "{version}"; + +/// Run the full redaction step against the given restore. 
+/// +/// Returns the resolved manifest version (if any) and the apply +/// [`Outcome`]. Errors abort the step and let the reconciler retry on +/// the next pass; per-statement issues during apply are tolerated and +/// surface as `outcome.is_partial()`. +pub async fn reconcile_redaction( + ctx: &Context, + replica: &PostgresPhysicalReplica, + restore_name: &str, +) -> Result<(Option<String>, Outcome)> { + let spec = replica + .spec + .redaction + .as_ref() + .expect("reconcile_redaction called with no spec.redaction"); + + validate_spec(spec)?; + + let namespace = replica.namespace().expect("replica is namespaced"); + + let creds_name = replica.creds_secret_name(); + let secrets: Api<Secret> = Api::namespaced(ctx.client.clone(), &namespace); + let creds = secrets.get(&creds_name).await?; + let user = read_secret_field(&creds, "username")?; + let password = read_secret_field(&creds, "password")?; + + let dbname = discover_restore_database( + &ctx.client, + &namespace, + restore_name, + &user, + &password, + ctx.use_port_forward(), + ) + .await?; + + let conn = postgres::connect_to_restore( + &ctx.client, + &namespace, + restore_name, + &dbname, + &user, + &password, + ctx.use_port_forward(), + ) + .await?; + + let version = resolve_version(spec, &conn).await?; + let resolved_url = resolve_url(spec, version.as_deref())?; + + info!( + replica = %replica.name_any(), + restore = %restore_name, + url = %resolved_url, + "fetching redaction manifest" + ); + + let manifest = fetch_manifest(ctx, spec, version.as_deref(), &resolved_url).await?; + + info!( + columns = manifest.columns.len(), + tables = manifest.tables.len(), + "manifest parsed" + ); + + let outcome = apply::apply(&conn, &manifest, &dbname).await?; + + Ok((version, outcome)) +} + +fn validate_spec(spec: &RedactionSpec) -> Result<()> { + let templated = spec.manifest_url.contains(VERSION_PLACEHOLDER); + let has_literal = spec.version.is_some(); + let has_query = spec.version_query.is_some(); + + if has_literal && has_query { +
return Err(Error::Redaction( + "redaction spec: `version` and `versionQuery` are mutually exclusive".into(), + )); + } + if templated && !(has_literal || has_query) { + return Err(Error::Redaction( + "redaction spec: `manifestUrl` contains `{version}` but no `version` or `versionQuery` provided".into(), + )); + } + if !templated && (has_literal || has_query) { + return Err(Error::Redaction( + "redaction spec: `version`/`versionQuery` set but `manifestUrl` has no `{version}` placeholder".into(), + )); + } + Ok(()) +} + +async fn resolve_version(spec: &RedactionSpec, conn: &PgConnection) -> Result<Option<String>> { + if let Some(v) = spec.version.clone() { + return Ok(Some(v)); + } + let Some(query) = spec.version_query.as_deref() else { + return Ok(None); + }; + + let rows = conn + .client + .simple_query(query) + .await + .map_err(|e| Error::Redaction(format!("versionQuery failed: {e}")))?; + + for msg in rows { + if let tokio_postgres::SimpleQueryMessage::Row(row) = msg { + let value = row + .get(0) + .ok_or_else(|| Error::Redaction("versionQuery returned no columns".into()))?; + return Ok(Some(value.to_string())); + } + } + Err(Error::Redaction("versionQuery returned no rows".into())) +} + +fn resolve_url(spec: &RedactionSpec, version: Option<&str>) -> Result<String> { + match version { + Some(v) => Ok(spec.manifest_url.replace(VERSION_PLACEHOLDER, v)), + None => Ok(spec.manifest_url.clone()), + } +} + +async fn fetch_manifest( + ctx: &Context, + spec: &RedactionSpec, + version: Option<&str>, + url: &str, +) -> Result<Manifest> { + let resp = ctx.http_client.get(url).send().await?; + let status = resp.status(); + + if status == reqwest::StatusCode::NOT_FOUND + && spec.version_fallback_to_base + && let Some(v) = version + && let Some(base) = base_version(v) + { + let base_url = spec.manifest_url.replace(VERSION_PLACEHOLDER, &base); + warn!( + version = %v, + base = %base, + "manifest 404, retrying with base version" + ); + debug!(url = %base_url, "fetching redaction manifest (base)"); + let
base_resp = ctx.http_client.get(&base_url).send().await?; + let base_resp = base_resp.error_for_status()?; + let body = base_resp.text().await?; + return parse_manifest(&body).map_err(Into::into); + } + + let resp = resp.error_for_status()?; + let body = resp.text().await?; + parse_manifest(&body).map_err(Into::into) +} + +#[cfg(test)] +mod tests { + use super::*; + + fn spec(url: &str, ver: Option<&str>, vq: Option<&str>) -> RedactionSpec { + RedactionSpec { + manifest_url: url.into(), + version: ver.map(str::to_string), + version_query: vq.map(str::to_string), + version_fallback_to_base: false, + extension_image: None, + } + } + + #[test] + fn validate_accepts_static_url() { + assert!(validate_spec(&spec("https://x/m.json", None, None)).is_ok()); + } + + #[test] + fn validate_accepts_templated_with_literal_version() { + assert!(validate_spec(&spec("https://x/v{version}.json", Some("1.0.0"), None)).is_ok()); + } + + #[test] + fn validate_accepts_templated_with_query() { + assert!(validate_spec(&spec("https://x/v{version}.json", None, Some("SELECT 1"))).is_ok()); + } + + #[test] + fn validate_rejects_templated_without_version() { + assert!(validate_spec(&spec("https://x/v{version}.json", None, None)).is_err()); + } + + #[test] + fn validate_rejects_static_with_version() { + assert!(validate_spec(&spec("https://x/m.json", Some("1.0.0"), None)).is_err()); + } + + #[test] + fn validate_rejects_both_version_and_query() { + assert!( + validate_spec(&spec( + "https://x/v{version}.json", + Some("1.0.0"), + Some("SELECT 1") + )) + .is_err() + ); + } + + #[test] + fn resolve_url_substitutes_version() { + let s = spec("https://x/v{version}/m.json", Some("2.41.0"), None); + assert_eq!( + resolve_url(&s, Some("2.41.0")).unwrap(), + "https://x/v2.41.0/m.json" + ); + } + + #[test] + fn resolve_url_passes_through_when_no_version() { + let s = spec("https://x/m.json", None, None); + assert_eq!(resolve_url(&s, None).unwrap(), "https://x/m.json"); + } +} diff --git 
a/src/controllers/replica/redaction/apply.rs b/src/controllers/replica/redaction/apply.rs new file mode 100644 index 0000000..db41681 --- /dev/null +++ b/src/controllers/replica/redaction/apply.rs @@ -0,0 +1,252 @@ +//! Apply parsed [`Manifest`] entries against a live restore database. +//! +//! This is the only file in the `redaction` module that talks to a real +//! Postgres. It is invoked by the reconciler after a restore reaches the +//! `Ready` phase and before the switchover branch. + +use std::collections::HashMap; + +use tokio_postgres::types::Type; +use tracing::{info, warn}; + +use super::manifest::Manifest; +use super::mask::{ColumnInfo, ColumnMask, fragment_for}; +use crate::controllers::postgres::{PgConnection, quote_ident}; +use crate::error::{Error, Result}; + +#[derive(Debug, Default)] +pub struct Outcome { + pub columns_attempted: u32, + pub columns_failed: u32, + pub tables_attempted: u32, + pub tables_failed: u32, +} + +impl Outcome { + pub fn is_partial(&self) -> bool { + self.columns_failed > 0 || self.tables_failed > 0 + } +} + +/// Apply a manifest against the live database that `conn` is attached to. +/// +/// The connection must be made as a superuser (CREATE EXTENSION, SECURITY +/// LABEL, TRUNCATE and `anon.anonymize_database()` all require it). 
+pub async fn apply(conn: &PgConnection, manifest: &Manifest, dbname: &str) -> Result<Outcome> { + let mut outcome = Outcome::default(); + + conn.client + .simple_query("CREATE EXTENSION IF NOT EXISTS anon CASCADE") + .await + .map_err(|e| Error::Redaction(format!("CREATE EXTENSION anon failed: {e}")))?; + + conn.client + .simple_query("SELECT anon.init()") + .await + .map_err(|e| Error::Redaction(format!("anon.init() failed: {e}")))?; + + for table in &manifest.tables { + outcome.tables_attempted += 1; + if table.kind != "truncate" { + warn!( + schema = %table.schema, + table = %table.table, + kind = %table.kind, + "unsupported table-level mask kind, skipping" + ); + outcome.tables_failed += 1; + continue; + } + let stmt = format!( + "TRUNCATE TABLE {}.{} CASCADE", + quote_ident(&table.schema), + quote_ident(&table.table) + ); + if let Err(e) = conn.client.simple_query(&stmt).await { + warn!( + schema = %table.schema, + table = %table.table, + error = %e, + "table truncate failed, continuing" + ); + outcome.tables_failed += 1; + } + } + + let column_infos = lookup_column_infos(conn, &manifest.columns).await?; + + for mask in &manifest.columns { + outcome.columns_attempted += 1; + let info = column_infos.get(&col_key(mask)); + if info.is_none() { + warn!( + schema = %mask.schema, + table = %mask.table, + column = %mask.column, + "column not present in restore, skipping" + ); + outcome.columns_failed += 1; + continue; + } + let fragment = match fragment_for(mask, info) { + Ok(f) => f, + Err(reason) => { + warn!( + schema = %mask.schema, + table = %mask.table, + column = %mask.column, + kind = %mask.kind, + %reason, + "could not build mask fragment, skipping" + ); + outcome.columns_failed += 1; + continue; + } + }; + + let label = format!( + "SECURITY LABEL FOR anon ON COLUMN {}.{}.{} IS {}", + quote_ident(&mask.schema), + quote_ident(&mask.table), + quote_ident(&mask.column), + quote_sql_literal(&fragment.render()), + ); + if let Err(e) =
conn.client.simple_query(&label).await { + warn!( + schema = %mask.schema, + table = %mask.table, + column = %mask.column, + error = %e, + "SECURITY LABEL failed, continuing" + ); + outcome.columns_failed += 1; + } + } + + info!( + columns = manifest.columns.len(), + tables = manifest.tables.len(), + failed_columns = outcome.columns_failed, + failed_tables = outcome.tables_failed, + "running anon.anonymize_database()" + ); + + conn.client + .simple_query("SELECT anon.anonymize_database()") + .await + .map_err(|e| Error::Redaction(format!("anon.anonymize_database() failed: {e}")))?; + + let alter = format!( + "ALTER DATABASE {} SET default_transaction_read_only = on", + quote_ident(dbname), + ); + conn.client + .simple_query(&alter) + .await + .map_err(|e| Error::Redaction(format!("re-enabling read-only failed: {e}")))?; + + Ok(outcome) +} + +/// Key used to join `ColumnMask` with the `information_schema` results. +fn col_key(m: &ColumnMask) -> (String, String, String) { + (m.schema.clone(), m.table.clone(), m.column.clone()) +} + +/// Look up `data_type`, `is_nullable`, and `column_default` for every +/// masked column in a single batch query. Columns absent from the +/// restore's schema simply don't appear in the returned map. +async fn lookup_column_infos( + conn: &PgConnection, + masks: &[ColumnMask], +) -> Result<HashMap<(String, String, String), ColumnInfo>> { + let mut out = HashMap::new(); + if masks.is_empty() { + return Ok(out); + } + + let schemas: Vec<String> = masks.iter().map(|m| m.schema.clone()).collect(); + let tables: Vec<String> = masks.iter().map(|m| m.table.clone()).collect(); + let columns: Vec<String> = masks.iter().map(|m| m.column.clone()).collect(); + + let stmt = " + SELECT c.table_schema, c.table_name, c.column_name, + c.data_type, c.is_nullable, + pg_get_expr(d.adbin, d.adrelid) AS column_default + FROM information_schema.columns c + LEFT JOIN pg_catalog.pg_attribute a + ON a.attrelid = (quote_ident(c.table_schema) || '.'
|| quote_ident(c.table_name))::regclass + AND a.attname = c.column_name + AND NOT a.attisdropped + LEFT JOIN pg_catalog.pg_attrdef d + ON d.adrelid = a.attrelid AND d.adnum = a.attnum + WHERE (c.table_schema, c.table_name, c.column_name) + IN (SELECT s, t, col + FROM UNNEST($1::text[], $2::text[], $3::text[]) + AS u(s, t, col)) + "; + + let rows = conn + .client + .query_typed( + stmt, + &[ + (&schemas, Type::TEXT_ARRAY), + (&tables, Type::TEXT_ARRAY), + (&columns, Type::TEXT_ARRAY), + ], + ) + .await?; + + for row in rows { + let schema: String = row.get("table_schema"); + let table: String = row.get("table_name"); + let column: String = row.get("column_name"); + let data_type: String = row.get("data_type"); + let nullable: String = row.get("is_nullable"); + let default: Option<String> = row.get("column_default"); + + out.insert( + (schema, table, column), + ColumnInfo { + data_type, + is_nullable: nullable == "YES", + column_default: default, + }, + ); + } + + Ok(out) +} + +/// Quote a string for inclusion as a SQL literal (single-quoted). +fn quote_sql_literal(s: &str) -> String { + let escaped = s.replace('\'', "''"); + format!("'{escaped}'") +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn quote_sql_literal_escapes_single_quotes() { + assert_eq!(quote_sql_literal("ab'c"), "'ab''c'"); + } + + #[test] + fn quote_sql_literal_wraps_normal_text() { + assert_eq!(quote_sql_literal("hello"), "'hello'"); + } + + #[test] + fn outcome_is_partial_when_anything_failed() { + let mut o = Outcome::default(); + assert!(!o.is_partial()); + o.columns_failed = 1; + assert!(o.is_partial()); + o.columns_failed = 0; + o.tables_failed = 1; + assert!(o.is_partial()); + } +} diff --git a/src/controllers/replica/redaction/manifest.rs b/src/controllers/replica/redaction/manifest.rs new file mode 100644 index 0000000..45f72bb --- /dev/null +++ b/src/controllers/replica/redaction/manifest.rs @@ -0,0 +1,291 @@ +//! Parse a Tamanu/dbt manifest document into [`ColumnMask`] and +//!
[`TableMask`] entries. The shape of the document is defined at +//! . + +use serde_json::Value; +use tracing::warn; + +use super::mask::{ColumnMask, TableMask, parse_range}; + +#[derive(Debug, Default)] +pub struct Manifest { + pub columns: Vec<ColumnMask>, + pub tables: Vec<TableMask>, +} + +/// Parse a manifest JSON string. Sources missing `schema` or `name` are +/// skipped (warning logged). Column entries with unrecognised mask shapes +/// are kept (carrying the verbatim kind string) so the apply phase can +/// count them as tolerated errors with useful context. +pub fn parse_manifest(json: &str) -> Result<Manifest, serde_json::Error> { + let doc: Value = serde_json::from_str(json)?; + let mut out = Manifest::default(); + + let Some(sources) = doc.get("sources").and_then(Value::as_object) else { + return Ok(out); + }; + + for (source_id, source) in sources { + let Some(schema) = source.get("schema").and_then(Value::as_str) else { + warn!( + source = source_id, + "manifest source has no `schema`, skipping" + ); + continue; + }; + let Some(name) = source.get("name").and_then(Value::as_str) else { + warn!( + source = source_id, + "manifest source has no `name`, skipping" + ); + continue; + }; + + if let Some(mask) = meta_masking(source) + && let Some(kind) = mask_kind(&mask) + { + out.tables.push(TableMask { + schema: schema.into(), + table: name.into(), + kind: kind.into(), + }); + } + + let Some(columns) = source.get("columns").and_then(Value::as_object) else { + continue; + }; + + for (col_name, col) in columns { + let Some(mask) = meta_masking(col) else { + continue; + }; + let Some(kind) = mask_kind(&mask) else { + warn!( + source = source_id, + column = col_name, + "manifest masking has no `kind`, skipping" + ); + continue; + }; + + let range = mask + .as_object() + .and_then(|o| o.get("range")) + .and_then(Value::as_str) + .and_then(parse_range); + + out.columns.push(ColumnMask { + schema: schema.into(), + table: name.into(), + column: col_name.into(), + kind: kind.into(), + range, + }); + } + } + + Ok(out)
+} + +/// Read `.config.meta.masking`, falling back to `.meta.masking`. +fn meta_masking(node: &Value) -> Option<Value> { + if let Some(v) = node + .get("config") + .and_then(|c| c.get("meta")) + .and_then(|m| m.get("masking")) + { + return Some(v.clone()); + } + node.get("meta").and_then(|m| m.get("masking")).cloned() +} + +/// Short-form (`"name"`) vs extended-form (`{"kind":"name", …}`) both +/// reduce to a single kind string. +fn mask_kind(v: &Value) -> Option<&str> { + match v { + Value::String(s) => Some(s.as_str()), + Value::Object(o) => o.get("kind").and_then(Value::as_str), + _ => None, + } +} + +/// Derive a base version (`major.minor.0`) from a `MAJOR.MINOR.PATCH` +/// version. Returns `None` if the input doesn't match that shape, or if +/// the patch is already `0`. +pub fn base_version(v: &str) -> Option<String> { + let parts: Vec<&str> = v.split('.').collect(); + if parts.len() != 3 { + return None; + } + let minor: u32 = parts[1].parse().ok()?; + let patch: u32 = parts[2].parse().ok()?; + if patch == 0 { + return None; + } + Some(format!("{}.{}.0", parts[0], minor)) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn parses_short_form_string() { + let m = parse_manifest( + r#"{ + "sources": { + "any.id": { + "schema": "public", + "name": "users", + "columns": { + "email": {"config": {"meta": {"masking": "email"}}} + } + } + } + }"#, + ) + .unwrap(); + + assert_eq!(m.columns.len(), 1); + assert_eq!(m.columns[0].schema, "public"); + assert_eq!(m.columns[0].table, "users"); + assert_eq!(m.columns[0].column, "email"); + assert_eq!(m.columns[0].kind, "email"); + assert_eq!(m.columns[0].range, None); + } + + #[test] + fn parses_extended_form_with_range() { + let m = parse_manifest( + r#"{ + "sources": { + "x": { + "schema": "public", + "name": "vitals", + "columns": { + "heart_rate": {"config":{"meta":{"masking":{"kind":"float","range":"60-200"}}}} + } + } + } + }"#, + ) + .unwrap(); + + assert_eq!(m.columns[0].kind, "float"); +
assert_eq!(m.columns[0].range, Some((60.0, 200.0))); + } + + #[test] + fn parses_extended_form_with_float_range() { + let m = parse_manifest( + r#"{ + "sources": { + "x": { + "schema": "public", + "name": "vitals", + "columns": { + "urine_sg": {"config":{"meta":{"masking":{"kind":"float","range":"1.001-1.03"}}}} + } + } + } + }"#, + ) + .unwrap(); + + assert_eq!(m.columns[0].range, Some((1.001, 1.03))); + } + + #[test] + fn parses_table_level_truncate() { + let m = parse_manifest( + r#"{ + "sources": { + "x": { + "schema": "public", + "name": "sync_lookup", + "config": {"meta": {"masking": "truncate"}}, + "columns": {} + } + } + }"#, + ) + .unwrap(); + + assert_eq!(m.tables.len(), 1); + assert_eq!(m.tables[0].schema, "public"); + assert_eq!(m.tables[0].table, "sync_lookup"); + assert_eq!(m.tables[0].kind, "truncate"); + } + + #[test] + fn parses_table_level_truncate_via_meta_fallback() { + let m = parse_manifest( + r#"{ + "sources": { + "x": { + "schema": "public", + "name": "t", + "meta": {"masking": "truncate"}, + "columns": {} + } + } + }"#, + ) + .unwrap(); + + assert_eq!(m.tables.len(), 1); + assert_eq!(m.tables[0].table, "t"); + } + + #[test] + fn skips_source_missing_schema_or_name() { + let m = parse_manifest( + r#"{ + "sources": { + "a": {"name": "t", "columns": {"c": {"config":{"meta":{"masking":"email"}}}}}, + "b": {"schema": "s", "columns": {"c": {"config":{"meta":{"masking":"email"}}}}} + } + }"#, + ) + .unwrap(); + + assert_eq!(m.columns.len(), 0); + } + + #[test] + fn keeps_unknown_kind_verbatim() { + let m = parse_manifest( + r#"{ + "sources": { + "x": { + "schema": "public", + "name": "t", + "columns": { + "c": {"config":{"meta":{"masking":"brand_new"}}} + } + } + } + }"#, + ) + .unwrap(); + + assert_eq!(m.columns[0].kind, "brand_new"); + } + + #[test] + fn base_version_strips_patch() { + assert_eq!(base_version("2.41.7"), Some("2.41.0".to_string())); + } + + #[test] + fn base_version_returns_none_when_patch_is_zero() { + 
assert_eq!(base_version("2.41.0"), None); + } + + #[test] + fn base_version_returns_none_on_bad_shape() { + assert_eq!(base_version("2.41"), None); + assert_eq!(base_version("not-a-version"), None); + assert_eq!(base_version("2.x.7"), None); + } +} diff --git a/src/controllers/replica/redaction/mask.rs b/src/controllers/replica/redaction/mask.rs new file mode 100644 index 0000000..13cc536 --- /dev/null +++ b/src/controllers/replica/redaction/mask.rs @@ -0,0 +1,360 @@ +//! Mask types parsed out of a Tamanu/dbt manifest, and the registry that +//! turns them into `SECURITY LABEL` fragments for postgresql_anonymizer. +//! +//! The canonical contract for `meta.masking` is documented at +//! . + +use crate::controllers::postgres::quote_ident; + +#[derive(Debug, Clone, PartialEq)] +pub struct ColumnMask { + pub schema: String, + pub table: String, + pub column: String, + pub kind: String, + pub range: Option<(f64, f64)>, +} + +#[derive(Debug, Clone, PartialEq)] +pub struct TableMask { + pub schema: String, + pub table: String, + pub kind: String, +} + +/// Resolved column metadata used by the type-dispatched kinds +/// (`zero`, `empty`, `default`, `nil`). +#[derive(Debug, Clone)] +pub struct ColumnInfo { + pub data_type: String, + pub is_nullable: bool, + pub column_default: Option<String>, +} + +/// The SQL right-hand side of a `SECURITY LABEL … IS ''`. +#[derive(Debug, Clone, PartialEq)] +pub enum Fragment { + Function(String), + Value(String), +} + +impl Fragment { + pub fn render(&self) -> String { + match self { + Self::Function(expr) => format!("MASKED WITH FUNCTION {expr}"), + Self::Value(expr) => format!("MASKED WITH VALUE {expr}"), + } + } +} + +/// Parse `"L-H"` (e.g. `"20-50"`, `"1.001-1.03"`) into a pair of `f64`s, +/// splitting on the **last** `-` so a negative lower bound (e.g. `"-5-5"`) +/// decomposes correctly. Returns `None` on parse failure.
+pub fn parse_range(s: &str) -> Option<(f64, f64)> { + let (lo, hi) = s.rsplit_once('-')?; + let lo: f64 = lo.parse().ok()?; + let hi: f64 = hi.parse().ok()?; + Some((lo, hi)) +} + +/// Build the `Fragment` for a column mask. `info` is only consulted for +/// kinds that need column-type knowledge (`zero`, `empty`, `default`, +/// `nil`); for other kinds it can be `None` (used by unit tests). +/// +/// Returns `Err` with a short diagnostic when the kind is unsupported or +/// when type-dependent kinds are missing required `info`. +pub fn fragment_for(mask: &ColumnMask, info: Option<&ColumnInfo>) -> Result<Fragment, String> { + let col = quote_ident(&mask.column); + + match mask.kind.as_str() { + "date" => Ok(Fragment::Function(null_pres( + &col, + "anon.random_date()".into(), + ))), + + "datetime" => Ok(Fragment::Function(null_pres( + &col, + format!("date_trunc('day', {col}) + (floor(random() * 86400) || ' seconds')::interval"), + ))), + + "text" => Ok(Fragment::Function(null_pres( + &col, + format!("anon.lorem_ipsum(characters := length({col}))"), + ))), + + "string" => Ok(Fragment::Function(null_pres( + &col, + format!("anon.random_string(length({col}))"), + ))), + + "email" => Ok(Fragment::Function(null_pres( + &col, + "anon.fake_email()".into(), + ))), + + "name" => Ok(Fragment::Function(null_pres( + &col, + format!( + "CASE WHEN {col} LIKE '% %' THEN anon.fake_name() ELSE anon.fake_first_name() END" + ), + ))), + + "phone" => Ok(Fragment::Function(null_pres( + &col, + format!("anon.partial({col}, 2, '****', 2)"), + ))), + + "place" => Ok(Fragment::Function(null_pres( + &col, + "anon.fake_city()".into(), + ))), + + "url" => Ok(Fragment::Function(null_pres( + &col, + "'https://example.invalid/' || anon.random_string(8)".into(), + ))), + + "integer" => { + let (lo, hi) = mask.range.unwrap_or((i32::MIN as f64, i32::MAX as f64)); + Ok(Fragment::Function(null_pres( + &col, + format!("(floor(random() * ({hi} - {lo} + 1)) + {lo})::int"), + ))) + } + + "float" => { + let (lo, hi) =
mask.range.unwrap_or((0.0, 1.0)); + Ok(Fragment::Function(null_pres( + &col, + format!("(random() * ({hi} - {lo}) + {lo})::numeric"), + ))) + } + + "money" => { + let (lo, hi) = mask.range.unwrap_or((0.0, 10_000.0)); + Ok(Fragment::Function(null_pres( + &col, + format!("round((random() * ({hi} - {lo}) + {lo})::numeric, 2)"), + ))) + } + + "zero" => { + let info = info.ok_or_else(|| "zero mask needs column type".to_string())?; + match data_type_family(&info.data_type) { + DataTypeFamily::Bytea => Ok(Fragment::Function(format!( + "repeat(E'\\x00'::bytea, length({col}))" + ))), + DataTypeFamily::Text => { + Ok(Fragment::Function(format!("repeat('0', length({col}))"))) + } + DataTypeFamily::Numeric => Ok(Fragment::Value("0".into())), + DataTypeFamily::Other => { + Err(format!("zero mask unsupported for type {}", info.data_type)) + } + } + } + + "empty" => { + let info = info.ok_or_else(|| "empty mask needs column type".to_string())?; + match data_type_family(&info.data_type) { + DataTypeFamily::Numeric => Ok(Fragment::Value("0".into())), + DataTypeFamily::Text => Ok(Fragment::Value("''".into())), + DataTypeFamily::Bytea => Ok(Fragment::Value("E'\\\\x'::bytea".into())), + DataTypeFamily::Other => match info.data_type.as_str() { + "json" | "jsonb" => Ok(Fragment::Value(format!("'{{}}'::{}", info.data_type))), + "ARRAY" => Ok(Fragment::Value("'{}'".into())), + _ => Err(format!( + "empty mask unsupported for type {}", + info.data_type + )), + }, + } + } + + "nil" => { + let info = info.ok_or_else(|| "nil mask needs column type".to_string())?; + if !info.is_nullable { + return Err("nil mask on non-nullable column".into()); + } + Ok(Fragment::Value("NULL".into())) + } + + "default" => { + let info = info.ok_or_else(|| "default mask needs column type".to_string())?; + match info.column_default.as_deref() { + Some(d) => Ok(Fragment::Value(d.into())), + None => Err("default mask on column without default".into()), + } + } + + other => Err(format!("unknown mask kind: {other}")), 
+ } +} + +/// Wrap an expression in a null-preserving CASE. +fn null_pres(col: &str, expr: String) -> String { + format!("CASE WHEN {col} IS NULL THEN NULL ELSE {expr} END") +} + +enum DataTypeFamily { + Numeric, + Text, + Bytea, + Other, +} + +/// Group `information_schema.columns.data_type` strings into the families +/// that determine how `zero`/`empty` are realised. +fn data_type_family(s: &str) -> DataTypeFamily { + match s { + "smallint" | "integer" | "bigint" | "real" | "double precision" | "numeric" | "decimal" => { + DataTypeFamily::Numeric + } + "character varying" | "character" | "text" | "citext" => DataTypeFamily::Text, + "bytea" => DataTypeFamily::Bytea, + _ => DataTypeFamily::Other, + } +} + +#[cfg(test)] +mod tests { + use super::*; + + fn cm(kind: &str, range: Option<(f64, f64)>) -> ColumnMask { + ColumnMask { + schema: "public".into(), + table: "t".into(), + column: "c".into(), + kind: kind.into(), + range, + } + } + + fn info(data_type: &str, nullable: bool, default: Option<&str>) -> ColumnInfo { + ColumnInfo { + data_type: data_type.into(), + is_nullable: nullable, + column_default: default.map(str::to_string), + } + } + + #[test] + fn range_splits_on_last_dash_for_floats() { + assert_eq!(parse_range("1.001-1.03"), Some((1.001, 1.03))); + assert_eq!(parse_range("20-50"), Some((20.0, 50.0))); + assert_eq!(parse_range("0-10.5"), Some((0.0, 10.5))); + } + + #[test] + fn range_handles_negative_lo() { + assert_eq!(parse_range("-5-5"), Some((-5.0, 5.0))); + } + + #[test] + fn range_returns_none_on_garbage() { + assert!(parse_range("nope").is_none()); + assert!(parse_range("1-x").is_none()); + assert!(parse_range("1.2").is_none()); + } + + #[test] + fn fragment_email_is_null_preserving() { + let f = fragment_for(&cm("email", None), None).unwrap(); + let rendered = f.render(); + assert!(rendered.contains("MASKED WITH FUNCTION")); + assert!(rendered.contains("CASE WHEN")); + assert!(rendered.contains("anon.fake_email()")); + } + + #[test] + fn 
fragment_name_detects_space() { + let f = fragment_for(&cm("name", None), None).unwrap(); + let rendered = f.render(); + assert!(rendered.contains("LIKE '% %'")); + assert!(rendered.contains("fake_name()")); + assert!(rendered.contains("fake_first_name()")); + } + + #[test] + fn fragment_integer_uses_range() { + let f = fragment_for(&cm("integer", Some((20.0, 50.0))), None).unwrap(); + let rendered = f.render(); + assert!(rendered.contains("50 - 20")); + assert!(rendered.contains("::int")); + } + + #[test] + fn fragment_money_rounds_to_two_decimals() { + let f = fragment_for(&cm("money", Some((0.0, 100.0))), None).unwrap(); + assert!(f.render().contains("round(")); + } + + #[test] + fn fragment_zero_for_bytea_repeats() { + let f = fragment_for(&cm("zero", None), Some(&info("bytea", true, None))).unwrap(); + assert!(matches!(f, Fragment::Function(ref s) if s.contains("repeat(E'\\x00'::bytea"))); + } + + #[test] + fn fragment_zero_for_text_repeats_digit() { + let f = fragment_for(&cm("zero", None), Some(&info("text", true, None))).unwrap(); + assert!(matches!(f, Fragment::Function(ref s) if s.contains("repeat('0',"))); + } + + #[test] + fn fragment_zero_for_numeric_is_value_zero() { + let f = fragment_for(&cm("zero", None), Some(&info("integer", true, None))).unwrap(); + assert_eq!(f, Fragment::Value("0".into())); + } + + #[test] + fn fragment_empty_dispatches_on_type() { + assert_eq!( + fragment_for(&cm("empty", None), Some(&info("integer", true, None))).unwrap(), + Fragment::Value("0".into()) + ); + assert_eq!( + fragment_for(&cm("empty", None), Some(&info("text", true, None))).unwrap(), + Fragment::Value("''".into()) + ); + assert_eq!( + fragment_for(&cm("empty", None), Some(&info("jsonb", true, None))).unwrap(), + Fragment::Value("'{}'::jsonb".into()) + ); + } + + #[test] + fn fragment_nil_requires_nullable() { + assert!(fragment_for(&cm("nil", None), Some(&info("text", false, None))).is_err()); + assert_eq!( + fragment_for(&cm("nil", None), Some(&info("text", 
true, None))).unwrap(), + Fragment::Value("NULL".into()) + ); + } + + #[test] + fn fragment_default_requires_default_expression() { + assert!(fragment_for(&cm("default", None), Some(&info("text", true, None))).is_err()); + assert_eq!( + fragment_for( + &cm("default", None), + Some(&info("text", true, Some("'hello'::text"))), + ) + .unwrap(), + Fragment::Value("'hello'::text".into()) + ); + } + + #[test] + fn fragment_unknown_kind_errors() { + assert!(fragment_for(&cm("brand_new_kind", None), None).is_err()); + } + + #[test] + fn rendered_value_keeps_null_marker() { + assert_eq!( + Fragment::Value("NULL".into()).render(), + "MASKED WITH VALUE NULL" + ); + } +} diff --git a/src/error.rs b/src/error.rs index bcbfb7e..32e05ca 100644 --- a/src/error.rs +++ b/src/error.rs @@ -43,6 +43,9 @@ pub enum Error { #[error("Database connection error: {0}")] Postgres(#[from] tokio_postgres::Error), + + #[error("Redaction error: {0}")] + Redaction(String), } pub type Result = std::result::Result; From 18db8b0114fe33c438d6a7f03aff51a266ee7510 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 20:44:47 +1200 Subject: [PATCH 04/11] feat(restore/builders): mount postgresql_anonymizer image volume when redaction is set When spec.redaction is configured on the replica, the restore Pod gets: - an image volume sourcing the postgresql_anonymizer extension files (defaulting to registry.gitlab.com/dalibo/postgresql_anon:latest) - a read-only mount at /extensions/anon - extension_control_path / dynamic_library_path GUCs in postgresql.conf pointing at the mounted paths (PG 18+ features) PG version < 18 is rejected at build time. Read-only enforcement is also deferred (effective_read_only=false) so the redaction step can write; the redaction module re-enables it at the database level when done. 
--- src/controllers/restore/builders.rs | 157 +++++++++++++++++++--------- src/controllers/restore/tests.rs | 148 ++++++++++++++++++++++++++ 2 files changed, 255 insertions(+), 50 deletions(-) diff --git a/src/controllers/restore/builders.rs b/src/controllers/restore/builders.rs index ef06f89..cdfef23 100644 --- a/src/controllers/restore/builders.rs +++ b/src/controllers/restore/builders.rs @@ -658,6 +658,15 @@ pub fn build_deployment( .cloned() .ok_or_else(|| Error::MissingField("status.postgresVersion".to_string()))?; + if replica.spec.redaction.is_some() { + let major: i32 = pg_version.parse().unwrap_or(0); + if major < 18 { + return Err(Error::Redaction(format!( + "redaction requires PostgreSQL 18+, restore is PG {pg_version}", + ))); + } + } + let pg_image = format!("postgres:{pg_version}"); let locale_script = r#"set -ex @@ -682,19 +691,37 @@ cp -a /usr/lib/locale/* /locale-data/ "# .to_string(); - // persistent_schemas needs write access to receive the migrated data - let effective_read_only = replica.spec.read_only && replica.spec.persistent_schemas.is_none(); + // persistent_schemas and redaction both need write access during their + // post-restore step. Redaction re-enables `default_transaction_read_only` + // at the database level itself after it's done. + let effective_read_only = replica.spec.read_only + && replica.spec.persistent_schemas.is_none() + && replica.spec.redaction.is_none(); let read_only = effective_read_only.to_string(); - let extra_config_block = if let Some(ref extra) = replica.spec.postgres_extra_config { + let mut extra_config = String::new(); + if let Some(ref extra) = replica.spec.postgres_extra_config { + extra_config.push_str(extra); + extra_config.push('\n'); + } + if replica.spec.redaction.is_some() { + // PG 18+ uses these path GUCs to load extensions whose files live + // outside the system extension directories. The dalibo + // postgresql_anonymizer image is mounted at /extensions/anon by + // the Pod builder below. 
+ extra_config.push_str( + "extension_control_path = '$system:/extensions/anon/share/extension'\n\ + dynamic_library_path = '$libdir:/extensions/anon/lib/postgresql/18/lib'\n", + ); + } + let extra_config_block = if extra_config.is_empty() { + String::new() + } else { format!( r#"echo "Appending extra postgresql.conf settings..." cat >> "$PGDATA/postgresql.conf" << 'EXTRACONFEOF' -{extra} -EXTRACONFEOF"# +{extra_config}EXTRACONFEOF"# ) - } else { - String::new() }; let init_script = format!( @@ -1098,23 +1125,34 @@ exec postgres -D /pgdata/pgdata ${PGRO_LOG_LEVEL:+-c log_min_messages=$PGRO_LOG_ protocol: Some("TCP".to_string()), ..Default::default() }]), - volume_mounts: Some(vec![ - VolumeMount { - name: "pgdata".to_string(), - mount_path: "/pgdata".to_string(), - ..Default::default() - }, - VolumeMount { - name: "locale-data".to_string(), - mount_path: "/usr/lib/locale".to_string(), - ..Default::default() - }, - VolumeMount { - name: "dshm".to_string(), - mount_path: "/dev/shm".to_string(), - ..Default::default() - }, - ]), + volume_mounts: Some({ + let mut mounts = vec![ + VolumeMount { + name: "pgdata".to_string(), + mount_path: "/pgdata".to_string(), + ..Default::default() + }, + VolumeMount { + name: "locale-data".to_string(), + mount_path: "/usr/lib/locale".to_string(), + ..Default::default() + }, + VolumeMount { + name: "dshm".to_string(), + mount_path: "/dev/shm".to_string(), + ..Default::default() + }, + ]; + if replica.spec.redaction.is_some() { + mounts.push(VolumeMount { + name: "anon-extension".to_string(), + mount_path: "/extensions/anon".to_string(), + read_only: Some(true), + ..Default::default() + }); + } + mounts + }), readiness_probe: Some(Probe { exec: Some(ExecAction { command: Some(vec![ @@ -1148,35 +1186,54 @@ exec postgres -D /pgdata/pgdata ${PGRO_LOG_LEVEL:+-c log_min_messages=$PGRO_LOG_ resources: replica.spec.resources.clone(), ..Default::default() }], - volumes: Some(vec![ - Volume { - name: "pgdata".to_string(), - 
persistent_volume_claim: Some( - k8s_openapi::api::core::v1::PersistentVolumeClaimVolumeSource { - claim_name: pvc_name, - read_only: Some(false), - }, - ), - ..Default::default() - }, - Volume { - name: "locale-data".to_string(), - empty_dir: Some( - k8s_openapi::api::core::v1::EmptyDirVolumeSource::default(), - ), - ..Default::default() - }, - Volume { - name: "dshm".to_string(), - empty_dir: Some( - k8s_openapi::api::core::v1::EmptyDirVolumeSource { + volumes: Some({ + let mut volumes = vec![ + Volume { + name: "pgdata".to_string(), + persistent_volume_claim: Some( + k8s_openapi::api::core::v1::PersistentVolumeClaimVolumeSource { + claim_name: pvc_name, + read_only: Some(false), + }, + ), + ..Default::default() + }, + Volume { + name: "locale-data".to_string(), + empty_dir: Some( + k8s_openapi::api::core::v1::EmptyDirVolumeSource::default(), + ), + ..Default::default() + }, + Volume { + name: "dshm".to_string(), + empty_dir: Some( + k8s_openapi::api::core::v1::EmptyDirVolumeSource { medium: Some("Memory".to_string()), size_limit: Some(shm_size), }, - ), - ..Default::default() - }, - ]), + ), + ..Default::default() + }, + ]; + if let Some(ref redaction) = replica.spec.redaction { + let image = redaction + .extension_image + .clone() + .unwrap_or_else(|| crate::types::DEFAULT_ANON_IMAGE.to_string()); + volumes.push(Volume { + name: "anon-extension".to_string(), + image: Some( + k8s_openapi::api::core::v1::ImageVolumeSource { + reference: Some(image), + pull_policy: None, + }, + ), + ..Default::default() + }); + } + volumes + }), affinity: replica.spec.affinity.clone(), tolerations: Some(replica.spec.tolerations.clone()), ..Default::default() diff --git a/src/controllers/restore/tests.rs b/src/controllers/restore/tests.rs index 36365a6..c5e99be 100644 --- a/src/controllers/restore/tests.rs +++ b/src/controllers/restore/tests.rs @@ -646,3 +646,151 @@ fn deployment_shared_buffers_with_custom_resources() { "init script must set shared_buffers for 2Gi request" ); } + 
+#[test] +fn deployment_without_redaction_has_no_anon_volume() { + let (mut restore, replica) = test_restore_and_replica(); + restore.status = Some(PostgresPhysicalRestoreStatus { + postgres_version: Some("18".to_string()), + ..Default::default() + }); + let deploy = build_deployment(&restore, "test-restore", "default", &replica).unwrap(); + let pod = deploy + .spec + .as_ref() + .unwrap() + .template + .spec + .as_ref() + .unwrap(); + let volume_names: Vec<&str> = pod + .volumes + .as_ref() + .unwrap() + .iter() + .map(|v| v.name.as_str()) + .collect(); + assert!(!volume_names.contains(&"anon-extension")); +} + +#[test] +fn deployment_with_redaction_mounts_anon_image_volume() { + let (mut restore, mut replica) = test_restore_and_replica(); + restore.status = Some(PostgresPhysicalRestoreStatus { + postgres_version: Some("18".to_string()), + ..Default::default() + }); + replica.spec.redaction = Some(RedactionSpec { + manifest_url: "https://example.com/m.json".into(), + version: None, + version_query: None, + version_fallback_to_base: false, + extension_image: None, + }); + + let deploy = build_deployment(&restore, "test-restore", "default", &replica).unwrap(); + let pod = deploy + .spec + .as_ref() + .unwrap() + .template + .spec + .as_ref() + .unwrap(); + + let anon_volume = pod + .volumes + .as_ref() + .unwrap() + .iter() + .find(|v| v.name == "anon-extension") + .expect("anon-extension volume must be present"); + let image = anon_volume.image.as_ref().expect("must be an image volume"); + assert_eq!(image.reference.as_deref(), Some(DEFAULT_ANON_IMAGE)); + + let postgres = &pod.containers[0]; + assert!( + postgres + .volume_mounts + .as_ref() + .unwrap() + .iter() + .any(|m| m.name == "anon-extension" && m.mount_path == "/extensions/anon"), + "postgres container must mount anon-extension at /extensions/anon" + ); + + let setup_auth = deploy_init_setup_auth_script(&deploy); + assert!( + setup_auth.contains("extension_control_path"), + "init script must append 
extension_control_path GUC" + ); +} + +#[test] +fn deployment_with_redaction_rejects_pg17() { + let (mut restore, mut replica) = test_restore_and_replica(); + restore.status = Some(PostgresPhysicalRestoreStatus { + postgres_version: Some("17".to_string()), + ..Default::default() + }); + replica.spec.redaction = Some(RedactionSpec { + manifest_url: "https://example.com/m.json".into(), + version: None, + version_query: None, + version_fallback_to_base: false, + extension_image: None, + }); + + let err = build_deployment(&restore, "test-restore", "default", &replica).unwrap_err(); + let msg = format!("{err}"); + assert!( + msg.contains("PostgreSQL 18"), + "error should mention PG 18+ requirement, got: {msg}" + ); +} + +#[test] +fn deployment_with_redaction_forces_writable() { + let (mut restore, mut replica) = test_restore_and_replica(); + restore.status = Some(PostgresPhysicalRestoreStatus { + postgres_version: Some("18".to_string()), + ..Default::default() + }); + replica.spec.read_only = true; + replica.spec.redaction = Some(RedactionSpec { + manifest_url: "https://example.com/m.json".into(), + version: None, + version_query: None, + version_fallback_to_base: false, + extension_image: None, + }); + + let deploy = build_deployment(&restore, "test-restore", "default", &replica).unwrap(); + let script = deploy_init_setup_auth_script(&deploy); + // The init script uses `if [ "" = "true" ]` and we want + // that variable substituted to "false" when redaction is set so the + // conditional doesn't fire at runtime. 
+ assert!( + script.contains("if [ \"false\" = \"true\" ]"), + "redaction must defer read-only by substituting read_only=false into the init script" + ); +} + +fn deploy_init_setup_auth_script(deploy: &k8s_openapi::api::apps::v1::Deployment) -> String { + let pod = deploy + .spec + .as_ref() + .unwrap() + .template + .spec + .as_ref() + .unwrap(); + let setup_auth = pod + .init_containers + .as_ref() + .unwrap() + .iter() + .find(|c| c.name == "setup-auth") + .unwrap(); + setup_auth.args.as_ref().unwrap()[0].clone() +} From 38b3491b224ebb0260d3c6bcdb75f9ea282d59d9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 20:48:53 +1200 Subject: [PATCH 05/11] feat(replica): wire redaction step into reconcile + sweep gates MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Drives spec.redaction through the reconcile loop: - redaction runs against the switching restore before schema_migration so persistent_schemas dbt views regenerate against redacted source data - redactionPhase tracks the run (active -> complete/partial/failed), with failed:* sticky to avoid retry loops on broken manifests - stale-restore sweep waits on redaction settling, same gate as schema migration - redactionPhase/Version/ColumnsApplied get reset along with the schema fields when the sweep removes the previous restore - on success, when spec.read_only is true: ALTER DATABASE … SET default_transaction_read_only = on, demote analytics to NOSUPERUSER, and GRANT pg_read_all_data — matching the role posture the init script applies when effective_read_only is true --- src/controllers/replica.rs | 27 ++++ src/controllers/replica/redaction.rs | 148 ++++++++++++++++++++- src/controllers/replica/redaction/apply.rs | 41 +++++- 3 files changed, 208 insertions(+), 8 deletions(-) diff --git a/src/controllers/replica.rs b/src/controllers/replica.rs index 21254b8..b42a576 100644 --- a/src/controllers/replica.rs +++ 
b/src/controllers/replica.rs @@ -163,6 +163,19 @@ pub async fn reconcile(replica: Arc, ctx: Arc) ) }); + // Handle redaction before schema migration: if redaction is set, + // it rewrites the data in place, and any persistent_schemas migration + // pulls from the (already-redacted) source tables. + if replica.spec.redaction.is_some() + && let Some(switching) = switching_restore + { + let redaction_settled = + redaction::reconcile_redaction_step(&ctx, &replica, switching).await?; + if !redaction_settled { + return Ok(Action::requeue(Duration::from_secs(30))); + } + } + // Handle schema migration for persistent_schemas configuration if replica.spec.persistent_schemas.is_some() && let Some(switching) = switching_restore @@ -307,6 +320,16 @@ pub async fn reconcile(replica: Arc, ctx: Arc) true }; + let redaction_settled = if replica.spec.redaction.is_some() { + let phase = replica + .status + .as_ref() + .and_then(|s| s.redaction_phase.as_deref()); + matches!(phase, None | Some("complete") | Some("partial")) + } else { + true + }; + let grace_period = SignedDuration::try_from(replica.spec.switchover_grace_period.0).unwrap_or_default(); let last_completed = replica @@ -326,6 +349,7 @@ pub async fn reconcile(replica: Arc, ctx: Arc) }); if migration_complete + && redaction_settled && has_matching_current && let Some(completed_at) = last_completed && now.duration_since(completed_at.0) > grace_period @@ -369,6 +393,9 @@ pub async fn reconcile(replica: Arc, ctx: Arc) "previousRestore": null, "schemaMigrationJob": null, "schemaMigrationPhase": null, + "redactionPhase": null, + "redactionVersion": null, + "redactionColumnsApplied": null, } }); replicas diff --git a/src/controllers/replica/redaction.rs b/src/controllers/replica/redaction.rs index 3bf7079..1f626c6 100644 --- a/src/controllers/replica/redaction.rs +++ b/src/controllers/replica/redaction.rs @@ -5,7 +5,10 @@ //! See `docs/plans/replica-redaction.md` for the full design. 
use k8s_openapi::api::core::v1::Secret; -use kube::{Api, ResourceExt as _}; +use kube::{ + Api, ResourceExt as _, + api::{Patch, PatchParams}, +}; use tracing::{debug, info, warn}; use crate::context::Context; @@ -13,7 +16,7 @@ use crate::controllers::postgres::{ self, PgConnection, discover_restore_database, read_secret_field, }; use crate::error::{Error, Result}; -use crate::types::{PostgresPhysicalReplica, RedactionSpec}; +use crate::types::{PostgresPhysicalReplica, PostgresPhysicalRestore, RedactionSpec}; use self::manifest::{Manifest, base_version, parse_manifest}; @@ -25,6 +28,137 @@ pub mod mask; const VERSION_PLACEHOLDER: &str = "{version}"; +/// Reconciler entry point: runs redaction against `switching` if the +/// replica has a redaction spec and the current `redactionPhase` is not +/// already `complete` / `partial` / `failed: …`. Returns `true` when the +/// redaction finished as `complete` or `partial`. Returns `false` when +/// it failed or more work is pending; in both cases the controller +/// requeues without switching over. +pub async fn reconcile_redaction_step( + ctx: &Context, + replica: &PostgresPhysicalReplica, + switching: &PostgresPhysicalRestore, +) -> Result<bool> { + let replica_name = replica.name_any(); + let namespace = replica.namespace().expect("replica is namespaced"); + let phase = replica + .status + .as_ref() + .and_then(|s| s.redaction_phase.as_deref()); + + match phase { + Some("complete") | Some("partial") => return Ok(true), + // `failed: …` is sticky: don't auto-retry. The user clears the + // phase by triggering a new restore (the sweep resets it) or + // editing status manually. Returning `false` stops this + // reconcile before the switchover: redaction is not healthy, + // so a restore that was not redacted must not be promoted + // while the phase is failed. 
+ Some(p) if p.starts_with("failed:") => return Ok(false), + _ => {} + } + + let pg_version = switching + .status + .as_ref() + .and_then(|s| s.postgres_version.as_deref()); + let major: i32 = pg_version.and_then(|v| v.parse().ok()).unwrap_or(0); + if major < 18 { + let msg = format!( + "failed: redaction requires PostgreSQL 18+, restore is PG {}", + pg_version.unwrap_or("unknown") + ); + warn!(replica = %replica_name, version = pg_version, %msg); + patch_phase_only(ctx, &replica_name, &namespace, &msg).await?; + return Ok(false); + } + + if phase != Some("active") { + patch_phase_only(ctx, &replica_name, &namespace, "active").await?; + } + + let switching_name = switching.name_any(); + match reconcile_redaction(ctx, replica, &switching_name).await { + Ok((version, outcome)) => { + let phase = if outcome.is_partial() { + "partial" + } else { + "complete" + }; + info!( + replica = %replica_name, + restore = %switching_name, + phase, + columns_attempted = outcome.columns_attempted, + columns_failed = outcome.columns_failed, + tables_attempted = outcome.tables_attempted, + tables_failed = outcome.tables_failed, + "redaction finished" + ); + patch_settled( + ctx, + &replica_name, + &namespace, + phase, + version.as_deref(), + outcome.columns_attempted, + ) + .await?; + Ok(true) + } + Err(e) => { + let msg = format!("failed: {e}"); + warn!(replica = %replica_name, error = %e, "redaction failed"); + patch_phase_only(ctx, &replica_name, &namespace, &msg).await?; + Ok(false) + } + } +} + +async fn patch_phase_only( + ctx: &Context, + replica_name: &str, + namespace: &str, + phase: &str, +) -> Result<()> { + let replicas: Api<PostgresPhysicalReplica> = Api::namespaced(ctx.client.clone(), namespace); + let patch = serde_json::json!({ "status": { "redactionPhase": phase } }); + replicas + .patch_status( + replica_name, + &PatchParams::apply("postgres-restore-operator"), + &Patch::Merge(&patch), + ) + .await?; + Ok(()) +} + +async fn patch_settled( + ctx: &Context, + 
replica_name: &str, + namespace: 
&str, + phase: &str, + version: Option<&str>, + columns_applied: u32, +) -> Result<()> { + let replicas: Api<PostgresPhysicalReplica> = Api::namespaced(ctx.client.clone(), namespace); + let patch = serde_json::json!({ + "status": { + "redactionPhase": phase, + "redactionVersion": version, + "redactionColumnsApplied": columns_applied, + } + }); + replicas + .patch_status( + replica_name, + &PatchParams::apply("postgres-restore-operator"), + &Patch::Merge(&patch), + ) + .await?; + Ok(()) +} + /// Run the full redaction step against the given restore. /// /// Returns the resolved manifest version (if any) and the apply @@ -91,7 +225,15 @@ pub async fn reconcile_redaction( "manifest parsed" ); - let outcome = apply::apply(&conn, &manifest, &dbname).await?; + let outcome = apply::apply(&conn, &manifest).await?; + + if replica.spec.read_only { + debug!( + replica = %replica.name_any(), + "re-enabling read-only on redacted database" + ); + apply::enforce_read_only(&conn, &dbname, &replica.spec.analytics_username).await?; + } Ok((version, outcome)) } diff --git a/src/controllers/replica/redaction/apply.rs b/src/controllers/replica/redaction/apply.rs index db41681..e4e96da 100644 --- a/src/controllers/replica/redaction/apply.rs +++ b/src/controllers/replica/redaction/apply.rs @@ -32,7 +32,7 @@ impl Outcome { /// /// The connection must be made as a superuser (CREATE EXTENSION, SECURITY /// LABEL, TRUNCATE and `anon.anonymize_database()` all require it). 
-pub async fn apply(conn: &PgConnection, manifest: &Manifest, dbname: &str) -> Result<Outcome> { +pub async fn apply(conn: &PgConnection, manifest: &Manifest) -> Result<Outcome> { let mut outcome = Outcome::default(); conn.client @@ -136,16 +136,47 @@ pub async fn apply(conn: &PgConnection, manifest: &Manifest, dbname: &str) -> Re .await .map_err(|e| Error::Redaction(format!("anon.anonymize_database() failed: {e}")))?; - let alter = format!( + Ok(outcome) +} + +/// Lock the freshly-redacted database back to read-only by: +/// - setting the DB-level `default_transaction_read_only` GUC, and +/// - demoting the analytics user back to NOSUPERUSER + granting +/// `pg_read_all_data` (matching the role posture the restore init +/// script applies when `effective_read_only` is true). +pub async fn enforce_read_only( + conn: &PgConnection, + dbname: &str, + analytics_user: &str, +) -> Result<()> { + let alter_db = format!( "ALTER DATABASE {} SET default_transaction_read_only = on", quote_ident(dbname), ); conn.client - .simple_query(&alter) + .simple_query(&alter_db) .await - .map_err(|e| Error::Redaction(format!("re-enabling read-only failed: {e}")))?; + .map_err(|e| Error::Redaction(format!("ALTER DATABASE for read-only failed: {e}")))?; - Ok(outcome) + let demote = format!( + "ALTER ROLE {user} WITH NOSUPERUSER", + user = quote_ident(analytics_user), + ); + conn.client + .simple_query(&demote) + .await + .map_err(|e| Error::Redaction(format!("demoting analytics user failed: {e}")))?; + + let grant = format!( + "GRANT pg_read_all_data TO {user}", + user = quote_ident(analytics_user), + ); + conn.client + .simple_query(&grant) + .await + .map_err(|e| Error::Redaction(format!("granting pg_read_all_data failed: {e}")))?; + + Ok(()) } /// Key used to join `ColumnMask` with the `information_schema` results. 
From 693bfe4e5ebbbbabfa54137698cf3d797f52ab9d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 20:49:37 +1200 Subject: [PATCH 06/11] docs(readme): document RedactionSpec field and redaction status fields --- README.md | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/README.md b/README.md index cbd8bb2..f3746f1 100644 --- a/README.md +++ b/README.md @@ -103,6 +103,7 @@ Defines a continuously-refreshed replica of a PostgreSQL database restored from | `postgresExtraConfig` | `string` | No | — | Extra lines appended to `postgresql.conf` (e.g. `shared_preload_libraries`). | | `notifications` | `[]NotificationConfig` | No | `[]` | Notification targets called on restore events. | | `persistentSchemas` | `[]string` | No | — | List of schema names to migrate from the previous restore to the new restore on each switchover. | +| `redaction` | `RedactionSpec` | No | — | If set, apply a Tamanu/dbt-shaped masking manifest to the restored data via the `postgresql_anonymizer` extension before switchover. Requires PostgreSQL 18+. | The cron expression is parsed using the [cronexpr](https://docs.rs/cronexpr) crate. It has two interesting features: @@ -114,6 +115,34 @@ The jitter is a random duration between -time/2 and +time/2. For example, `10m` will result in a jitter between -5m and 5m. When using `H` in the cron expression, you might want to set the jitter to zero to properly take advantage of the spread-but-stable behaviour. +#### RedactionSpec + +Configures applying a column-masking manifest to the restored data using the [postgresql_anonymizer](https://gitlab.com/dalibo/postgresql_anonymizer) extension. +The manifest follows the [Tamanu masking spec](https://github.com/beyondessential/tamanu/tree/main/database#masking) — any dbt project that publishes the same `meta.masking` annotation shape can be pointed at. 
+ +| Field | Type | Required | Default | Description | +|-------|------|----------|---------|-------------| +| `manifestUrl` | `string` | Yes | — | HTTP(S) URL of the masking manifest. May contain a literal `{version}` placeholder. | +| `version` | `string` | No | — | Pinned version substituted into `{version}`. Mutually exclusive with `versionQuery`. | +| `versionQuery` | `string` | No | — | SQL query that returns one row, one text column with the version string. Run as the operator's superuser against the restore. Mutually exclusive with `version`. | +| `versionFallbackToBase` | `bool` | No | `false` | If the manifest URL with the discovered/pinned version 404s, retry with the `major.minor.0` base version. | +| `extensionImage` | `string` | No | `registry.gitlab.com/dalibo/postgresql_anon:latest` | OCI image mounted as an image volume on the restore Pod to source the `anon` extension files. | + +Example (Tamanu): + +```yaml +spec: + redaction: + manifestUrl: "https://docs.data.bes.au/tamanu/v{version}/manifest.json" + versionQuery: "SELECT value FROM local_system_facts WHERE key = 'currentVersion'" + versionFallbackToBase: true +``` + +Notes: + +- Requires PostgreSQL 18+ on the restore (uses the runtime-settable `extension_control_path` and `dynamic_library_path` GUCs to load the extension from the mounted image). +- During redaction the database is writable; once anonymisation completes, the operator sets `default_transaction_read_only = on` at the database level and demotes the analytics user back to non-superuser when `spec.readOnly` is true. + #### SnapshotFilter | Field | Type | Required | Description | @@ -165,6 +194,9 @@ Additional fields for `target: graphQL`: | `schemaMigrationJob` | `string` | Name of the active schema migration Job (set while migration is in progress). | | `schemaMigrationPhase` | `string` | Phase of the schema migration (`active`, `complete`, or `failed: `). 
| | `persistentSchemaDataSize` | `Quantity` | Measured size of persistent schema data from the last successful migration. Used to size the next restore PVC. | +| `redactionPhase` | `string` | Phase of the current restore's redaction (`active`, `complete`, `partial`, or `failed: `). `partial` means anonymisation ran but some per-column SECURITY LABEL statements were tolerated as errors (e.g. column missing on this DB version). `failed:` is sticky — it doesn't auto-retry; the next scheduled restore clears it. | +| `redactionVersion` | `string` | The manifest version resolved during the last redaction run (when `manifestUrl` is version-templated). | +| `redactionColumnsApplied` | `uint32` | Number of columns the last redaction run attempted to mask. | | `consecutiveRestoreFailures` | `uint32` | Number of consecutive restore failures. Reset to 0 on success. After 3 consecutive failures the operator stops scheduling new restores until the counter is reset (automatically on next successful restore, or manually via `kubectl patch --subresource=status`). 
| --- From aa82809f7c1f81408fdce4410958792ac5dcd963 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 21:13:55 +1200 Subject: [PATCH 07/11] test(redaction): end-to-end integration test for PG-18 redaction MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds an integration test that exercises the whole redaction pipeline against a kind cluster: - tests/fixtures/setup-kopia-repo-pg18.yaml: snapshots a PG 18 db with a Tamanu-shaped local_system_facts table plus users and sync_lookup test tables - tests/fixtures/manifest-server.yaml: nginx + ConfigMap that serves a static dbt-shaped manifest covering email/name/date/phone/ integer-range column masks and table-level truncate - tests/fixtures/Dockerfile.anon-pg18: minimal PG-18 anon extension image built from Dalibo's apt repo (their published :stable is built against PG 16, so we have to roll our own for now) - tests/redaction.rs: drives the workflow and asserts each mask kind took effect, sync_lookup was truncated, unmarked columns are unchanged, read-only is re-enabled, and analytics is demoted from SUPERUSER - .github/workflows/integration.yml: new 'redaction' matrix entry, PG-18 image pre-pull, anon-image build, and PG-18 kopia setup Also drive-by fixes from finding real defaults: the extension default image name was registry.gitlab.com/dalibo/postgresql_anon — the canonical name is .../postgresql_anonymizer:stable. The extension_control_path / dynamic_library_path GUCs now reference the Debian PG layout (.../usr/share/postgresql/N/extension and .../usr/lib/postgresql/N/lib) inside the image mount. 
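As a rough sketch of the path change described above (an illustration, not the code in the diff below), the two GUC lines for the Debian layout can be produced from the major version like this. `anon_path_gucs` is a hypothetical helper name; the `/extensions/anon` mount point is the one used earlier in this series.

```rust
/// Build the two postgresql.conf lines that point PG 18+ at extension
/// files living under the image-volume mount, using the Debian
/// per-major directory layout inside that mount.
fn anon_path_gucs(pg_major: i32) -> String {
    // The trailing backslash continues the literal and skips the leading
    // whitespace of the next source line, so the output is two lines.
    format!(
        "extension_control_path = '$system:/extensions/anon/usr/share/postgresql/{pg_major}/extension'\n\
         dynamic_library_path = '$libdir:/extensions/anon/usr/lib/postgresql/{pg_major}/lib'\n"
    )
}
```

Deriving both directories from the same major keeps the control-file path and the library path in step when the restore version changes.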
--- .github/workflows/integration.yml | 48 ++++ README.md | 3 +- src/controllers/restore/builders.rs | 15 +- src/types/replica.rs | 4 +- tests/fixtures/Dockerfile.anon-pg18 | 34 +++ tests/fixtures/manifest-server.yaml | 75 +++++ tests/fixtures/setup-kopia-repo-pg18.yaml | 138 +++++++++ tests/redaction.rs | 327 ++++++++++++++++++++++ 8 files changed, 635 insertions(+), 9 deletions(-) create mode 100644 tests/fixtures/Dockerfile.anon-pg18 create mode 100644 tests/fixtures/manifest-server.yaml create mode 100644 tests/fixtures/setup-kopia-repo-pg18.yaml create mode 100644 tests/redaction.rs diff --git a/.github/workflows/integration.yml b/.github/workflows/integration.yml index a237c85..fe34f88 100644 --- a/.github/workflows/integration.yml +++ b/.github/workflows/integration.yml @@ -63,6 +63,12 @@ jobs: test-ps-all-missing needs_non_pg_snapshot: false + - name: redaction + namespaces: >- + test-redaction + needs_non_pg_snapshot: false + needs_pg18_snapshot: true + steps: - uses: actions/checkout@v6 @@ -164,6 +170,20 @@ jobs: load_image postgres:16-alpine load_image alpine:latest + if [ "${{ matrix.needs_pg18_snapshot }}" = "true" ]; then + load_image postgres:18 + load_image nginx:alpine + fi + + - name: Build PG-18 anon extension image + if: matrix.needs_pg18_snapshot + run: | + docker build \ + -t test-anon-pg18:integ \ + -f tests/fixtures/Dockerfile.anon-pg18 \ + tests/fixtures/ + kind load docker-image test-anon-pg18:integ + - name: Deploy MinIO run: | kubectl apply -f tests/fixtures/minio.yaml @@ -201,6 +221,34 @@ jobs: kubectl logs job/setup-kopia-repo --all-containers --prefix echo "--- Kopia repository ready ---" + - name: Set up PG-18 kopia snapshot + if: matrix.needs_pg18_snapshot + run: | + kubectl apply -f tests/fixtures/setup-kopia-repo-pg18.yaml + for i in $(seq 1 60); do + STATUS=$(kubectl get job/setup-kopia-repo-pg18 -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}' 2>/dev/null) + FAILED=$(kubectl get job/setup-kopia-repo-pg18 -o 
jsonpath='{.status.conditions[?(@.type=="Failed")].status}' 2>/dev/null) + if [ "$STATUS" = "True" ]; then + echo "PG-18 setup job completed successfully" + break + fi + if [ "$FAILED" = "True" ]; then + echo "PG-18 setup job failed!" + kubectl describe job/setup-kopia-repo-pg18 + kubectl logs job/setup-kopia-repo-pg18 --all-containers --prefix + exit 1 + fi + if [ "$i" = "60" ]; then + echo "PG-18 setup job timed out after 300s" + kubectl describe job/setup-kopia-repo-pg18 + kubectl logs job/setup-kopia-repo-pg18 --all-containers --prefix 2>/dev/null || true + exit 1 + fi + sleep 5 + done + echo "--- PG-18 setup job logs ---" + kubectl logs job/setup-kopia-repo-pg18 --all-containers --prefix + - name: Set up non-postgres kopia snapshot if: matrix.needs_non_pg_snapshot run: | diff --git a/README.md b/README.md index f3746f1..da1524a 100644 --- a/README.md +++ b/README.md @@ -126,7 +126,7 @@ The manifest follows the [Tamanu masking spec](https://github.com/beyondessentia | `version` | `string` | No | — | Pinned version substituted into `{version}`. Mutually exclusive with `versionQuery`. | | `versionQuery` | `string` | No | — | SQL query that returns one row, one text column with the version string. Run as the operator's superuser against the restore. Mutually exclusive with `version`. | | `versionFallbackToBase` | `bool` | No | `false` | If the manifest URL with the discovered/pinned version 404s, retry with the `major.minor.0` base version. | -| `extensionImage` | `string` | No | `registry.gitlab.com/dalibo/postgresql_anon:latest` | OCI image mounted as an image volume on the restore Pod to source the `anon` extension files. | +| `extensionImage` | `string` | No | `registry.gitlab.com/dalibo/postgresql_anonymizer:stable` | OCI image mounted as an image volume on the restore Pod to source the `anon` extension files. 
| Example (Tamanu): @@ -141,6 +141,7 @@ spec: Notes: - Requires PostgreSQL 18+ on the restore (uses the runtime-settable `extension_control_path` and `dynamic_library_path` GUCs to load the extension from the mounted image). +- The default `extensionImage` tag (`:stable` on Dalibo's registry) is currently built against PG 16; until Dalibo publishes a PG-18 build, set `extensionImage` to an image with `anon.control` and `anon.so` at the standard Debian PG-18 paths (`/usr/share/postgresql/18/extension/` and `/usr/lib/postgresql/18/lib/`). See `tests/fixtures/Dockerfile.anon-pg18` for a minimal recipe. - During redaction the database is writable; once anonymisation completes, the operator sets `default_transaction_read_only = on` at the database level and demotes the analytics user back to non-superuser when `spec.readOnly` is true. #### SnapshotFilter diff --git a/src/controllers/restore/builders.rs b/src/controllers/restore/builders.rs index cdfef23..abef937 100644 --- a/src/controllers/restore/builders.rs +++ b/src/controllers/restore/builders.rs @@ -707,12 +707,15 @@ cp -a /usr/lib/locale/* /locale-data/ if replica.spec.redaction.is_some() { // PG 18+ uses these path GUCs to load extensions whose files live // outside the system extension directories. The dalibo - // postgresql_anonymizer image is mounted at /extensions/anon by - // the Pod builder below. - extra_config.push_str( - "extension_control_path = '$system:/extensions/anon/share/extension'\n\ - dynamic_library_path = '$libdir:/extensions/anon/lib/postgresql/18/lib'\n", - ); + // postgresql_anonymizer image is a full Debian-based Postgres + // image whose filesystem is mounted at /extensions/anon by the + // Pod builder below — so we point the GUCs at the Debian + // extension layout inside that mount. 
+ let pg_major: i32 = pg_version.parse().unwrap_or(18); + extra_config.push_str(&format!( + "extension_control_path = '$system:/extensions/anon/usr/share/postgresql/{pg_major}/extension'\n\ + dynamic_library_path = '$libdir:/extensions/anon/usr/lib/postgresql/{pg_major}/lib'\n", + )); } let extra_config_block = if extra_config.is_empty() { String::new() diff --git a/src/types/replica.rs b/src/types/replica.rs index eefc46d..0f2e644 100644 --- a/src/types/replica.rs +++ b/src/types/replica.rs @@ -140,12 +140,12 @@ pub struct RedactionSpec { /// Override the OCI image used as the source of the /// postgresql_anonymizer extension files (mounted as an image volume /// on the restore Pod). Defaults to - /// `registry.gitlab.com/dalibo/postgresql_anon:latest`. + /// `registry.gitlab.com/dalibo/postgresql_anonymizer:stable`. #[serde(default, skip_serializing_if = "Option::is_none")] pub extension_image: Option<String>, } -pub const DEFAULT_ANON_IMAGE: &str = "registry.gitlab.com/dalibo/postgresql_anon:latest"; +pub const DEFAULT_ANON_IMAGE: &str = "registry.gitlab.com/dalibo/postgresql_anonymizer:stable"; fn default_storage_size_maximum() -> Quantity { Quantity("2Ti".to_string()) diff --git a/tests/fixtures/Dockerfile.anon-pg18 b/tests/fixtures/Dockerfile.anon-pg18 new file mode 100644 index 0000000..1314c52 --- /dev/null +++ b/tests/fixtures/Dockerfile.anon-pg18 @@ -0,0 +1,34 @@ +# Minimal PG-18 anon extension image for the redaction integration test. +# +# Dalibo's published :stable image is built against PG 16, so for our PG 18 +# restore we install postgresql_anonymizer_18 from Dalibo's apt repo and +# strip the result down to the extension files. The result is a tiny image +# that's mounted at /extensions/anon on the restore Pod via the image- +# volume mechanism. 
+FROM debian:bookworm-slim AS builder + +RUN apt-get update \ + && apt-get install -y --no-install-recommends \ + curl ca-certificates gnupg lsb-release \ + && rm -rf /var/lib/apt/lists/* + +# PGDG repo for postgresql-server-dev-18 (the anon package's hard dep) +RUN curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc \ + | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg \ + && echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" \ + > /etc/apt/sources.list.d/pgdg.list + +# Dalibo Labs repo for postgresql_anonymizer_18 +RUN curl -fsSL https://apt.dalibo.org/labs/debian-dalibo.gpg \ + -o /etc/apt/trusted.gpg.d/dalibo-labs.gpg \ + && echo "deb http://apt.dalibo.org/labs $(lsb_release -cs)-dalibo main" \ + > /etc/apt/sources.list.d/dalibo-labs.list + +RUN apt-get update \ + && apt-get install -y --no-install-recommends postgresql_anonymizer_18 \ + && rm -rf /var/lib/apt/lists/* + +# scratch destination: only the extension files end up in the final image +FROM scratch +COPY --from=builder /usr/share/postgresql/18/extension/ /usr/share/postgresql/18/extension/ +COPY --from=builder /usr/lib/postgresql/18/lib/anon.so /usr/lib/postgresql/18/lib/anon.so diff --git a/tests/fixtures/manifest-server.yaml b/tests/fixtures/manifest-server.yaml new file mode 100644 index 0000000..3d22d6c --- /dev/null +++ b/tests/fixtures/manifest-server.yaml @@ -0,0 +1,75 @@ +--- +# Static-manifest HTTP server for the redaction integration test. 
+# +# The harness `kubectl apply`s this into the test namespace; the redacted +# replica's spec.redaction.manifestUrl points at +# http://manifest-server..svc/manifest.json +apiVersion: v1 +kind: ConfigMap +metadata: + name: manifest-server-content +data: + manifest.json: | + { + "sources": { + "source.tamanu.tamanu.users": { + "schema": "public", + "name": "users", + "columns": { + "email": { "config": { "meta": { "masking": "email" } } }, + "full_name": { "config": { "meta": { "masking": "name" } } }, + "single_name": { "config": { "meta": { "masking": "name" } } }, + "dob": { "config": { "meta": { "masking": "date" } } }, + "phone": { "config": { "meta": { "masking": "phone" } } }, + "heart_rate": { "config": { "meta": { "masking": { "kind": "integer", "range": "50-100" } } } } + } + }, + "source.tamanu.tamanu.sync_lookup": { + "schema": "public", + "name": "sync_lookup", + "config": { "meta": { "masking": "truncate" } }, + "columns": {} + } + } + } +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: manifest-server + labels: + app: manifest-server +spec: + replicas: 1 + selector: + matchLabels: + app: manifest-server + template: + metadata: + labels: + app: manifest-server + spec: + containers: + - name: nginx + image: nginx:alpine + imagePullPolicy: IfNotPresent + ports: + - containerPort: 80 + volumeMounts: + - name: content + mountPath: /usr/share/nginx/html + volumes: + - name: content + configMap: + name: manifest-server-content +--- +apiVersion: v1 +kind: Service +metadata: + name: manifest-server +spec: + selector: + app: manifest-server + ports: + - port: 80 + targetPort: 80 diff --git a/tests/fixtures/setup-kopia-repo-pg18.yaml b/tests/fixtures/setup-kopia-repo-pg18.yaml new file mode 100644 index 0000000..f0faa16 --- /dev/null +++ b/tests/fixtures/setup-kopia-repo-pg18.yaml @@ -0,0 +1,138 @@ +--- +apiVersion: batch/v1 +kind: Job +metadata: + name: setup-kopia-repo-pg18 + namespace: default +spec: + backoffLimit: 3 + template: + spec: + 
securityContext: + fsGroup: 999 + initContainers: + - name: create-bucket + image: minio/mc:latest + imagePullPolicy: IfNotPresent + command: ["/bin/sh", "-c"] + args: + - | + set -e + mc alias set minio http://minio.minio.svc:9000 minioadmin minioadmin + mc mb minio/test-bucket-pg18 --ignore-existing + echo "Bucket created" + - name: init-pgdata + image: postgres:18 + imagePullPolicy: IfNotPresent + securityContext: + runAsUser: 0 + command: ["/bin/sh", "-c"] + args: + - | + set -e + echo "Initializing PostgreSQL 18 data directory..." + mkdir -p /pgdata/18/main + chown postgres:postgres /pgdata/18/main + gosu postgres initdb -D /pgdata/18/main --no-locale --encoding=UTF8 --auth=trust + + echo "Starting temporary PostgreSQL to create application database..." + gosu postgres pg_ctl -D /pgdata/18/main -l /tmp/pg.log start -w + + gosu postgres psql -d postgres -c "CREATE DATABASE myapp" + + # Tamanu-shaped local_system_facts table so the operator's + # versionQuery has something to read. The value is the + # manifest version the test harness will serve. + gosu postgres psql -d myapp <<'SQL' + CREATE TABLE local_system_facts ( + key text PRIMARY KEY, + value text + ); + INSERT INTO local_system_facts (key, value) VALUES ('currentVersion', '1.0.0'); + + -- A users table mirroring the dbt-masking contract: + -- email, name, dob (date) and a non-redacted column + -- to make sure unmarked columns are preserved. 
+ CREATE TABLE users ( + id serial PRIMARY KEY, + email text NOT NULL, + full_name text NOT NULL, + single_name text NOT NULL, + dob date, + phone text, + heart_rate int, + unmasked text NOT NULL + ); + INSERT INTO users (email, full_name, single_name, dob, phone, heart_rate, unmasked) VALUES + ('a@example.com', 'Alice Apple', 'Alice', '1980-01-15', '+64211234567', 70, 'keep-1'), + ('b@example.com', 'Bob Banana', 'Bob', '1975-05-22', '+64211234568', 80, 'keep-2'), + ('c@example.com', 'Carol Cherry', 'Carol', '1990-11-30', '+64211234569', 90, 'keep-3'), + ('d@example.com', 'Dave Date', 'Dave', '1985-07-04', '+64211234570', 60, 'keep-4'), + ('e@example.com', 'Eve Elderberry','Eve', '2000-12-25', '+64211234571', 75, 'keep-5'); + + -- A throwaway table to verify the table-level + -- "truncate" mask works. + CREATE TABLE sync_lookup ( + id serial PRIMARY KEY, + data text NOT NULL + ); + INSERT INTO sync_lookup (data) + SELECT 'row-' || i FROM generate_series(1, 50) AS i; + SQL + echo "Application database 'myapp' created with test data" + + gosu postgres pg_ctl -D /pgdata/18/main stop -w + + echo "PostgreSQL data directory initialized" + ls -la /pgdata/18/main/ + cat /pgdata/18/main/PG_VERSION + volumeMounts: + - name: pgdata + mountPath: /pgdata + containers: + - name: create-snapshot + image: kopia/kopia:0.22.3 + imagePullPolicy: IfNotPresent + command: ["/bin/sh", "-c"] + args: + - | + set -e + + mkdir -p /tmp/kopia/config /tmp/kopia/logs /tmp/kopia/cache + + echo "Creating kopia repository..." + kopia repository create s3 \ + --bucket=test-bucket-pg18 \ + --endpoint=minio.minio.svc:9000 \ + --region=us-east-1 \ + --access-key=minioadmin \ + --secret-access-key=minioadmin \ + --password=test-repo-password \ + --disable-tls \ + --disable-tls-verification + + echo "Creating snapshot..." + kopia snapshot create /pgdata + + echo "Verifying snapshot..." 
+ kopia snapshot list --json --all + + echo "Done" + env: + - name: KOPIA_CONFIG_PATH + value: /tmp/kopia/config/repository.config + - name: KOPIA_LOG_DIR + value: /tmp/kopia/logs + - name: KOPIA_CACHE_DIRECTORY + value: /tmp/kopia/cache + - name: KOPIA_PASSWORD + value: test-repo-password + - name: USER + value: kopia + volumeMounts: + - name: pgdata + mountPath: /pgdata + volumes: + - name: pgdata + emptyDir: {} + restartPolicy: Never diff --git a/tests/redaction.rs b/tests/redaction.rs new file mode 100644 index 0000000..51283b6 --- /dev/null +++ b/tests/redaction.rs @@ -0,0 +1,327 @@ +//! End-to-end redaction integration test. +//! +//! Requires a PG 18 kopia snapshot (see `setup-kopia-repo-pg18.yaml`) +//! and the in-namespace static manifest server (see +//! `manifest-server.yaml`). Both are deployed by the workflow before this +//! test runs. + +use std::{collections::BTreeMap, time::Duration}; + +use k8s_openapi::{ + ByteString, + api::core::v1::{Secret, SecretReference}, + apimachinery::pkg::api::resource::Quantity, +}; +use kube::{ + Api, + api::{ObjectMeta, PostParams}, +}; +use postgres_restore_operator::{ + types::{ + PostgresPhysicalReplica, PostgresPhysicalReplicaSpec, PostgresPhysicalRestore, + RedactionSpec, ReplicaPhase, RestorePhase, + }, + util::TimeSpan, +}; + +use helpers::*; + +mod helpers; + +const NS: &str = "test-redaction"; +const REPLICA_NAME: &str = "redaction-replica"; + +#[tokio::test] +#[ignore = "requires a running Kubernetes cluster with MinIO, PG-18 kopia snapshot and the manifest server"] +async fn redaction_applies_masks_to_restored_data() { + let client = make_client().await; + + setup_namespace(&client, NS).await; + cleanup_namespace(&client, NS, &[REPLICA_NAME]).await; + + println!("--- deploying in-namespace static manifest server"); + deploy_manifest_server(NS).await; + + let secrets: Api = Api::namespaced(client.clone(), NS); + let replicas: Api = Api::namespaced(client.clone(), NS); + let restores: Api = 
Api::namespaced(client.clone(), NS); + + println!("--- creating kopia secret (PG-18 bucket)"); + secrets + .create( + &PostParams::default(), + &build_pg18_kopia_secret(NS, "redaction-kopia-creds"), + ) + .await + .expect("failed to create kopia secret"); + + println!("--- creating PostgresPhysicalReplica with redaction config"); + let replica = build_redaction_replica(REPLICA_NAME, "redaction-kopia-creds"); + replicas + .create(&PostParams::default(), &replica) + .await + .expect("failed to create replica"); + + println!("--- waiting for first restore to become Active"); + let restore_name = wait_for_restore_phase( + &restores, + REPLICA_NAME, + RestorePhase::Active, + LONG_PHASE_TIMEOUT, + ) + .await; + wait_for_replica_phase( + &replicas, + REPLICA_NAME, + ReplicaPhase::Ready, + LONG_PHASE_TIMEOUT, + ) + .await; + println!("--- restore {restore_name} active"); + + // At this point the operator has already applied redaction (because + // the restore wouldn't transition Switching -> Active otherwise). 
+ let final_replica = replicas.get(REPLICA_NAME).await.unwrap(); + let status = final_replica.status.as_ref().expect("status set"); + let phase = status.redaction_phase.as_deref(); + assert!( + matches!(phase, Some("complete") | Some("partial")), + "redactionPhase should be complete or partial, got {phase:?}" + ); + let version = status.redaction_version.as_deref(); + assert_eq!( + version, + Some("1.0.0"), + "manifest version should be read from local_system_facts" + ); + let cols = status.redaction_columns_applied.unwrap_or(0); + assert!( + cols >= 6, + "expected at least 6 columns redacted, got {cols}" + ); + + let deploy = format!("deployment/{restore_name}"); + + println!("--- verifying truncate mask emptied sync_lookup"); + let count = query_one_value(NS, &deploy, "SELECT count(*) FROM sync_lookup").await; + assert_eq!(count.trim(), "0", "sync_lookup should be truncated"); + + println!("--- verifying unmasked column kept original values"); + let unmasked = query_one_value( + NS, + &deploy, + "SELECT string_agg(unmasked, ',' ORDER BY id) FROM users", + ) + .await; + assert_eq!(unmasked.trim(), "keep-1,keep-2,keep-3,keep-4,keep-5"); + + println!("--- verifying email column was changed"); + let emails = query_one_value( + NS, + &deploy, + "SELECT string_agg(email, ',' ORDER BY id) FROM users", + ) + .await; + assert!( + !emails.contains("a@example.com"), + "original email should be masked, got: {emails}" + ); + assert!( + emails.contains('@'), + "masked email should still look like an email, got: {emails}" + ); + + println!("--- verifying name masks: full names (with space) and single names"); + // full_name has spaces in the original; mask preserves the space-pattern via + // CASE-WHEN: full names get fake_name(), single names get fake_first_name(). 
+ let full_names = query_one_value( + NS, + &deploy, + "SELECT string_agg(full_name, '|' ORDER BY id) FROM users", + ) + .await; + assert!( + !full_names.contains("Alice Apple"), + "full_name should be masked, got: {full_names}" + ); + let single_names = query_one_value( + NS, + &deploy, + "SELECT string_agg(single_name, '|' ORDER BY id) FROM users", + ) + .await; + assert!( + !single_names.contains("Alice"), + "single_name should be masked, got: {single_names}" + ); + + println!("--- verifying date mask changed dob"); + let dobs = query_one_value( + NS, + &deploy, + "SELECT string_agg(dob::text, ',' ORDER BY id) FROM users", + ) + .await; + assert!( + !dobs.contains("1980-01-15"), + "dob should be masked, got: {dobs}" + ); + + println!("--- verifying phone mask preserves prefix/suffix"); + let phones = query_one_value( + NS, + &deploy, + "SELECT string_agg(phone, ',' ORDER BY id) FROM users", + ) + .await; + // anon.partial(phone, 2, '****', 2) keeps first 2 and last 2 chars + assert!( + phones.contains("+6") && phones.contains("****"), + "phone should be partial-masked with ****, got: {phones}" + ); + + println!("--- verifying integer-range mask kept values in [50, 100]"); + let out_of_range = query_one_value( + NS, + &deploy, + "SELECT count(*) FROM users WHERE heart_rate < 50 OR heart_rate > 100", + ) + .await; + assert_eq!( + out_of_range.trim(), + "0", + "heart_rate should stay in 50..100" + ); + + println!("--- verifying read-only was re-enabled"); + let setting = query_one_value(NS, &deploy, "SHOW default_transaction_read_only").await; + assert_eq!( + setting.trim(), + "on", + "default_transaction_read_only should be on" + ); + + println!("--- verifying analytics role was demoted from SUPERUSER"); + let rolsuper = query_one_value( + NS, + &deploy, + "SELECT rolsuper::text FROM pg_roles WHERE rolname = 'analytics'", + ) + .await; + assert_eq!( + rolsuper.trim(), + "false", + "analytics user should no longer be SUPERUSER" + ); + + println!("--- all redaction 
assertions passed"); +} + +async fn deploy_manifest_server(ns: &str) { + let status = tokio::process::Command::new("kubectl") + .args([ + "apply", + "-n", + ns, + "-f", + "tests/fixtures/manifest-server.yaml", + ]) + .status() + .await + .expect("failed to run kubectl apply"); + assert!(status.success(), "kubectl apply for manifest server failed"); + + // Wait briefly for the Service endpoint to come up. Best-effort — + // the in-cluster DNS resolves the Service name even before the + // backing pod is Ready, and the operator's reqwest call retries + // on transient connection failures via the redaction failed:* path. + tokio::time::sleep(Duration::from_secs(5)).await; +} + +fn build_pg18_kopia_secret(ns: &str, name: &str) -> Secret { + Secret { + metadata: ObjectMeta { + name: Some(name.into()), + namespace: Some(ns.into()), + ..Default::default() + }, + data: Some(BTreeMap::from([ + ("bucket".into(), ByteString("test-bucket-pg18".into())), + ("region".into(), ByteString("us-east-1".into())), + ("accessKeyId".into(), ByteString("minioadmin".into())), + ("secretAccessKey".into(), ByteString("minioadmin".into())), + ( + "repositoryPassword".into(), + ByteString("test-repo-password".into()), + ), + ("endpoint".into(), ByteString("minio.minio.svc:9000".into())), + ("disableTls".into(), ByteString("true".into())), + ])), + ..Default::default() + } +} + +fn build_redaction_replica(name: &str, secret_ref: &str) -> PostgresPhysicalReplica { + let mut replica = PostgresPhysicalReplica::new( + name, + PostgresPhysicalReplicaSpec { + kopia_secret_ref: SecretReference { + name: Some(secret_ref.into()), + namespace: None, + }, + snapshot_filter: None, + schedule: "0 */6 * * *".into(), + schedule_jitter: Default::default(), + minimum_ttl: None, + switchover_grace_period: TimeSpan(jiff::Span::new().seconds(10)), + analytics_username: "analytics".into(), + storage_class: None, + storage_size_override: None, + resources: None, + service_annotations: None, + pod_annotations: None, 
+ affinity: None, + tolerations: vec![], + read_only: true, + postgres_extra_config: None, + notifications: vec![], + persistent_schemas: None, + redaction: Some(RedactionSpec { + manifest_url: format!("http://manifest-server.{NS}.svc/manifest.json"), + version: None, + version_query: Some( + "SELECT value FROM local_system_facts WHERE key = 'currentVersion'".into(), + ), + version_fallback_to_base: false, + // The default registry.gitlab.com/.../postgresql_anonymizer:stable + // is built against PG 16 today; the workflow builds a tiny + // PG-18 anon image from Dalibo's apt repo (see + // tests/fixtures/Dockerfile.anon-pg18) and `kind load`s it + // under this tag. + extension_image: Some("test-anon-pg18:integ".into()), + }), + storage_size_maximum: Quantity("2Ti".into()), + }, + ); + replica.metadata.namespace = Some(NS.into()); + replica +} + +async fn query_one_value(ns: &str, deploy: &str, sql: &str) -> String { + kubectl_exec( + ns, + deploy, + &[ + "psql", + "-U", + "analytics", + "-d", + "myapp", + "-t", + "-A", + "-c", + sql, + ], + ) + .await +} From ad78aebd88261c957bf3cc8e42d430e9346b5e25 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 21:14:27 +1200 Subject: [PATCH 08/11] unplan: replica redaction via dbt-masking manifests --- docs/plans/replica-redaction.md | 315 -------------------------------- 1 file changed, 315 deletions(-) delete mode 100644 docs/plans/replica-redaction.md diff --git a/docs/plans/replica-redaction.md b/docs/plans/replica-redaction.md deleted file mode 100644 index de83a37..0000000 --- a/docs/plans/replica-redaction.md +++ /dev/null @@ -1,315 +0,0 @@ -# Replica redaction via dbt-masking manifests - -## Context - -In `~/code/work/bestool`, the `tamanu psql` command loads a per-version "redaction method" that flags which columns in a Tamanu database hold sensitive data. 
Today it is **display-only**: bestool fetches a dbt manifest from `https://docs.data.bes.au/tamanu/v{version}/manifest.json`, parses out `source..columns..config.meta.masking`, and renders matching cells as `"***"` in psql output. The database itself is never modified. - -We want PGRO to be able to produce a **redacted replica** — i.e. a restore whose underlying data has actually been anonymised, so any consumer (analytics tools, sandboxes, dev environments) connecting to that replica's Service sees masked data regardless of how they query it. - -The Tamanu-on-bes.au setup is one consumer, but the operator should be **generic over the source of the manifest**: any user who publishes a dbt manifest with the same `meta.masking` annotation schema can point a replica at it. PGRO knows about *the manifest schema*, not about Tamanu, not about bes.au, not about `local_system_facts`. The Tamanu deployment just becomes a particular configuration of the generic feature. - -Approach (per user decisions in planning): - -- Use the **`postgresql_anonymizer`** Postgres extension to do the masking. The manifest provides the *list* of columns to mask, not *how* to mask them; `anon` provides the masking functions (`anon.fake_email()`, `anon.partial(...)`, `anon.random_*`, etc.). -- Drive the redaction step **from inside the operator**, not via a Job. We have `tokio-postgres` and `reqwest` already; the work is fetch-manifest + run-some-SQL, and AGENTS.md prefers operator-driven over scripted-in-Job. -- After redaction completes, **re-enable read-only** so the analytics user can't write to a redacted replica. - -The redaction step plugs into the existing replica lifecycle alongside `persistent_schemas`: restore reaches Ready → redact → schema-migrate → switchover. - -## Manifest schema (the contract) - -PGRO consumes a dbt-shaped JSON document at an HTTP URL. 
The minimum shape it relies on: - -```json -{ - "sources": { - "": { - "schema": "", - "name": "", - "columns": { - "": { - "config": { "meta": { "masking": } } - } - } - } - } -} -``` - -Each source must carry explicit `schema` and `name` (the Tamanu dbt manifest always emits them — 163/163 sources in the v2.54.3 manifest). Sources missing either field are skipped with a warning. Source keys are otherwise opaque to PGRO. - -### `meta.masking` — canonical contract - -The Tamanu masking spec is documented at https://github.com/beyondessential/tamanu/tree/main/database#masking. It is deliberately implementation-agnostic — descriptions are vague about exact behaviour so implementations (bestool's display-only one, this DB-side one, future others) can vary. The bits PGRO has to honour: - -- **Short and extended form are equivalent.** `masking: name` ≡ `masking: { kind: name }`. The extended form *must* carry `kind`; it may carry additional parameters (currently only `range`). -- **Nulls are preserved.** When a column value is `NULL` it must stay `NULL` after masking. -- **Two locations.** Column-level masks live under `sources..columns..config.meta.masking`. Table-level masks (currently only `truncate`) live under `sources..config.meta.masking` (and `sources..meta.masking` mirrors it). PGRO reads both paths. - -#### Canonical kinds (full list per the docs) - -| kind | scope | behaviour | proposed implementation | -|---|---|---|---| -| `truncate` | **table** | Empty the entire table. | `TRUNCATE TABLE schema.table` as superuser, before the column-mask pass. Not a `SECURITY LABEL`. | -| `date` | column | Anonymise across different dates. Works on `date`/`timestamp(tz)` and on text representations like `character(10)`. | `MASKED WITH FUNCTION anon.random_date()` (or `random_date_between` if we want bounded). Wrap in `CASE WHEN IS NULL THEN NULL ELSE … END` to preserve nulls. | -| `datetime` | column | Anonymise the time-of-day while preserving the date component. 
Works on `timestamp(tz)` and on text representations like `character(19)`. | Compose: keep `date_trunc('day', )` and add a random interval of seconds. Specifically `date_trunc('day', ) + (floor(random() * 86400) || ' seconds')::interval` (cast as needed for text columns). Null-preserved via CASE. | -| `text` | column | Random words/sentences, approximately the same length as the original. | `anon.lorem_ipsum(characters := length())` (or `words` derived from `length()/6`). | -| `string` | column | Random printable ASCII, no spaces, approximately the same length as the original. | `anon.random_string(length())` if the function accepts a dynamic length, else fall back to a fixed length and accept the deviation. | -| `email`, `name`, `phone`, `place`, `url` | column | Fake data of the indicated shape. | `anon.fake_email()` / `anon.fake_first_name()` / a `partial(, 2, '****', 2)`-style call / `anon.fake_city()` / a constructed URL respectively. For `name`, the docs ask us to inspect whether the original contains a space and use full vs single name — implement with `CASE WHEN LIKE '% %' THEN anon.fake_name() ELSE anon.fake_first_name() END`. | -| `zero` | column | Keep the data length identical but replace with zeroes. Primary use: `bytea`. | Type-dispatched (see "Type-aware planning" below). For `bytea`: `repeat(E'\\x00'::bytea, length())`. For text types: `repeat('0', length())`. For numeric types: `0`. | -| `empty` | column | Delete the value without nulling: `0` for numbers, `''` for strings, `{}` for json(b), `[]` for arrays, etc. | Type-dispatched. The redaction module looks up each masked column's `data_type` in `information_schema.columns` and emits the appropriate `MASKED WITH VALUE …`. | -| `nil` | column | Null the field. The docs note it only applies to nullable columns. | `MASKED WITH VALUE NULL`. Operator skips columns where `is_nullable = 'NO'` and records the skip as a tolerated error. | -| `default` | column | Set the column to its `DEFAULT` value. 
The docs note it only applies to columns that have a default. | At planning time, look up `pg_get_expr(adbin, adrelid)` from `pg_attrdef` for the column. If present, emit `MASKED WITH VALUE `. If absent, tolerated error. | -| `integer` | column | Random integer. Optional `range: "L-H"` constrains the output. | `floor(random() * (H - L + 1) + L)::int` (or `anon.random_int_between(L, H)` if available). Default range `int4` if unspecified. | -| `float` | column | Random float. Optional `range: "L-H"` constrains. | `(random() * (H - L) + L)::numeric` (or `anon.random_in_numrange('[L,H]'::numrange)`). Default unbounded if unspecified. | -| `money` | column | Like `float`/`integer`, but the value is generated as a float with two decimals for `numeric` columns. | Same as `float`, then `round(, 2)`. | - -#### Range parameter parsing - -`range: "L-H"` is two numbers joined by a hyphen. Parse by splitting on the **last** `-` so floats like `1.001-1.03` decompose correctly. Both halves must parse as `f64`; parse failures are tolerated errors → fall back to the unbounded variant. Per the docs example (`kind: integer, range: 0-10.5`), the operator accepts a decimal range for `kind: integer` and rounds. - -#### Type-aware planning - -For `zero`, `empty`, and `default`, the operator can't decide the right `MASKED WITH VALUE` (or function) without knowing the column type / default. So before issuing any `SECURITY LABEL`, the redaction module runs a single batch query against `information_schema.columns` (and `pg_attrdef` for `default`) to resolve `data_type` and `column_default` for every (schema, table, column) tuple it's about to mask. Columns not present in the DB are dropped (tolerated error). The mapping decisions for those three kinds use this metadata. - -#### Unknown kinds - -If a future manifest version introduces a kind PGRO doesn't recognise (the spec is open-ended), the affected columns are dropped with a tolerated error and the run is reported as `partial`. 
Adding a new kind is a code change. - -## High-level design - -A new optional `redaction` field on `PostgresPhysicalReplicaSpec`. When set, after a new restore reaches the `Ready` phase the operator runs the redaction step before the restore is eligible for switchover. The step is tracked in status as `redactionPhase` (`pending` → `active` → `complete` / `partial` / `failed`), mirroring how `schemaMigrationPhase` works today. - -Order during a switchover cycle (new restore N replacing active A): -1. N restored from snapshot → `Ready`. -2. **Redaction** runs against N. While running, `default_transaction_read_only` is off on N (we need writes). On success the operator sets it back on via `ALTER DATABASE ... SET default_transaction_read_only = on`. -3. `persistent_schemas` migration A→N (existing behaviour). The schema migration job already runs against N as superuser, so read-only at the DB-default level doesn't block it (superuser is exempt by SET ROLE; `default_transaction_read_only` is a session default, not a hard lock). -4. Switchover Service → N, grace period on A, sweep. - -Redaction runs *before* schema migration so that dbt-style views in persistent schemas can be regenerated against already-redacted source tables on the next dbt run. - -## CRD changes - -`src/types/replica.rs` — add to `PostgresPhysicalReplicaSpec`: - -```rust -/// If set, apply a redaction manifest to the restored data before the -/// replica becomes eligible for switchover. Requires Postgres 18+ and -/// the postgresql_anonymizer extension (loaded via image-volume mount, -/// see plan section "Postgres version gate and extension loading"). -#[serde(default, skip_serializing_if = "Option::is_none")] -pub redaction: Option, -``` - -```rust -#[derive(Debug, Clone, Deserialize, Serialize, JsonSchema)] -#[serde(rename_all = "camelCase")] -pub struct RedactionSpec { - /// HTTP(S) URL of the dbt-style masking manifest. 
May contain a - /// `{version}` placeholder, in which case `version` or - /// `versionQuery` must be set. - pub manifest_url: String, - - /// Pinned version to substitute into `{version}`. Mutually exclusive - /// with `versionQuery`. - #[serde(default, skip_serializing_if = "Option::is_none")] - pub version: Option, - - /// SQL query that returns a single text column with the version - /// string. Run against the restore's main database as the operator's - /// superuser. Mutually exclusive with `version`. - /// - /// Example (Tamanu): `SELECT value FROM local_system_facts WHERE key = 'currentVersion'` - #[serde(default, skip_serializing_if = "Option::is_none")] - pub version_query: Option, - - /// If the manifest URL with the discovered/pinned version 404s, - /// retry with the major.minor.0 base version. Useful when manifests - /// are only published for minor releases. Defaults to false. - #[serde(default)] - pub version_fallback_to_base: bool, - - /// Override the OCI image used as the source of the - /// postgresql_anonymizer extension files (mounted as an image - /// volume on the restore Pod). Defaults to - /// `registry.gitlab.com/dalibo/postgresql_anon:latest`. - #[serde(default, skip_serializing_if = "Option::is_none")] - pub extension_image: Option, -} -``` - -Validation: if `manifestUrl` contains `{version}`, exactly one of `version` or `versionQuery` must be set; otherwise both must be unset. The operator rejects malformed spec at reconcile time (no admission webhook today). - -Status additions to `PostgresPhysicalReplicaStatus`: - -```rust -/// Phase of redaction: pending, active, complete, partial, failed. -#[serde(default, skip_serializing_if = "Option::is_none")] -pub redaction_phase: Option, - -/// Resolved manifest version used in the last redaction run. -#[serde(default, skip_serializing_if = "Option::is_none")] -pub redaction_version: Option, - -/// Number of columns redacted in the last run. 
-#[serde(default, skip_serializing_if = "Option::is_none")] -pub redaction_columns_applied: Option, -``` - -Update the readme CRD tables (AGENTS.md exception explicitly allows this). - -### Example (Tamanu) - -```yaml -spec: - redaction: - manifestUrl: "https://docs.data.bes.au/tamanu/v{version}/manifest.json" - versionQuery: "SELECT value FROM local_system_facts WHERE key = 'currentVersion'" - versionFallbackToBase: true -``` - -### Example (pinned) - -```yaml -spec: - redaction: - manifestUrl: "https://example.com/redactions/manifest.json" -``` - -## New module: `src/controllers/replica/redaction.rs` - -Single module exposing: - -```rust -pub async fn reconcile_redaction( - ctx: &Context, - replica: &PostgresPhysicalReplica, - restore_name: &str, -) -> Result; -``` - -Internals: - -1. **Resolve version**. - - If `spec.redaction.version` is set, use it. - - Else if `spec.redaction.versionQuery` is set, connect to the restore as superuser and run the query. Expect a single row with a single text column; error clearly on shape mismatch. - - Else if `manifestUrl` contains `{version}`, fail at validation (covered in CRD validation above). - - Else (no `{version}` in URL), no version is needed. - -2. **Fetch the manifest** via `ctx.http_client` (existing `reqwest::Client`). No caching: redaction runs at most once per restore (i.e. once per scheduled cycle, on the order of hours), so the bandwidth/time saved by caching is negligible against the risk of invalidation bugs. If `versionFallbackToBase` is true and the first fetch is a 404, retry with `{major}.{minor}.0` derived from a `MAJOR.MINOR.PATCH`-shaped version (mirroring bestool's `get_base_version`); silently no-op the fallback for versions that don't match that shape. - -3. **Parse the manifest** with `serde_json::Value`. Read each source's `schema` and `name`. 
Then collect masks: - - **Table-level**: if `sources..config.meta.masking` (or `meta.masking` as fallback) is `truncate`, add a `TableMask::Truncate` for that source. - - **Column-level**: iterate `sources..columns.*` and collect those with a non-null `config.meta.masking`. Normalise short-form (`"name"`) to extended (`{ "kind": "name" }`). Return: - - ```rust - struct ColumnMask { - schema: String, - table: String, - column: String, - kind: String, // "email", "date", "integer", ... - range: Option<(f64, f64)>, // parsed from "L-H", last-dash split - } - - enum TableMask { Truncate { schema: String, table: String } } - ``` - - Unknown kinds and malformed ranges are kept (carrying their kind string) and rejected later, not at parse time, so the operator can count them as tolerated errors with useful context. - -4. **Type-aware planning**. Open the superuser connection (used in step 5). For every `ColumnMask`, look up `data_type`, `is_nullable`, and `column_default` via a single batch query against `information_schema.columns` LEFT JOIN `pg_attrdef`. Drop entries where the column doesn't exist (tolerated error). The kinds `zero`, `empty`, and `default` use this metadata to choose their `MASKED WITH VALUE …` / function expression; `nil` uses `is_nullable` to skip non-nullable columns; everything else ignores the type. - -5. **Resolve each `ColumnMask` → SECURITY LABEL fragment** per the canonical table. The function returns either: - - `MASKED WITH VALUE ` for kinds that resolve to a constant or pg-default expression (`nil`, `empty`, `default`, `zero`-on-numeric), or - - `MASKED WITH FUNCTION ` for kinds that need a function call (most fakes, `text`, `string`, `datetime`-arithmetic, `integer`/`float`/`money`). - For column kinds where "nulls preserved" matters (most of them; not `nil`/`empty`/`default` which intentionally overwrite nulls per the docs), wrap the expression in `CASE WHEN IS NULL THEN NULL ELSE END`. - -6. 
**Apply masking via the extension**: - - Open a tokio-postgres connection to the restore as superuser, against the main user database (use `postgres::discover_restore_database`). - - `CREATE EXTENSION IF NOT EXISTS anon CASCADE;` (pulls in pgcrypto). - - `SELECT anon.init();` (loads the fake-data tables; idempotent). - - For each `TableMask::Truncate`: `TRUNCATE TABLE {quote_ident(schema)}.{quote_ident(table)};` Tolerated errors counted. - - For each `ColumnMask` (after type-aware planning): emit the `SECURITY LABEL FOR anon ON COLUMN …` statement. Tolerated errors counted. - - `SELECT anon.anonymize_database();` — destructive in-place rewrite of all labelled columns. - - Leave the extension installed (don't `DROP EXTENSION`). The SECURITY LABELs and the fake-data tables stay around (~7 MB) so analytics consumers can call anon functions themselves if useful. Dynamic masking (`MASKED ROLE`) is **not** enabled — that would need `shared_preload_libraries = 'anon'` and a role-grant step, and is out of scope. - -7. **Re-enable read-only**: - - `ALTER DATABASE {quote_ident(dbname)} SET default_transaction_read_only = on;` — applies to all new sessions on that DB. - - If the wider `spec.read_only` is true and `persistent_schemas` is also set, the existing override (line 686 of `restore/builders.rs`) forces postgresql.conf to `off` cluster-wide; the ALTER DATABASE-level setting still takes effect for non-superuser sessions because it's applied last in the GUC resolution order. The schema-migration Job runs as superuser, so it isn't blocked. Verify this assumption during implementation; if it doesn't hold, fall back to issuing `SELECT pg_reload_conf();` after rewriting postgresql.conf inline. - -Returns `RedactionOutcome { version: Option, columns_attempted, columns_failed }`. - -Errors propagate; the caller writes the status patch. 
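The pure string handling in steps 2 and 3 (base-version fallback and `range` parsing) can be sketched without any I/O. This is a minimal sketch under stated assumptions, not the module's actual code: `base_version` and `parse_range` are hypothetical names, the fallback is assumed to apply only to versions made of exactly three numeric dot-separated parts, and an unparseable range is assumed to map to `None` (unbounded).

```rust
/// Hypothetical helper: derive the `MAJOR.MINOR.0` base version used by the
/// 404 fallback. Returns `None` for anything that is not three numeric,
/// dot-separated components, so the caller can skip the retry.
fn base_version(version: &str) -> Option<String> {
    let parts: Vec<&str> = version.split('.').collect();
    if parts.len() != 3 {
        return None;
    }
    let all_numeric = parts
        .iter()
        .all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()));
    if !all_numeric {
        return None;
    }
    Some(format!("{}.{}.0", parts[0], parts[1]))
}

/// Hypothetical helper: parse an `"L-H"` range by splitting on the last dash,
/// so that decimal bounds such as `1.001-1.03` survive intact.
fn parse_range(raw: &str) -> Option<(f64, f64)> {
    let (low, high) = raw.rsplit_once('-')?;
    let low: f64 = low.trim().parse().ok()?;
    let high: f64 = high.trim().parse().ok()?;
    Some((low, high))
}
```

Splitting on the last dash rather than the first keeps a leading minus sign on the lower bound attached to its number; a negative upper bound would still fail to parse and fall back to unbounded under this sketch.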
- -## Reconciler wiring - -`src/controllers/replica.rs`: - -- After a new restore reaches `Ready`, before the switchover branch: - - Check the restore's PG version (already populated in `status.postgresVersion`). If `spec.redaction.is_some()` and the version is < 18, set phase `"failed: redaction requires PostgreSQL 18+"` and skip switchover. - - If `spec.redaction.is_some()` and `status.redaction_phase != Some("complete")` and `!= Some("partial")`: - - Set phase `"active"` in status. - - Call `redaction::reconcile_redaction(ctx, replica, &new_restore_name)`. - - On Ok: set phase `"complete"` or `"partial"` (depending on per-statement error count) and store `redaction_version` + `redaction_columns_applied`. - - On Err: set phase `"failed: {msg}"` and return early so the reconciler retries. - - The new restore is not eligible to become the switchover target until phase is `complete` or `partial`. -- Schema migration (`reconcile_schema_migration`) gates on redaction completing first — extend its early-return check so it doesn't kick off until `redaction_phase` is settled (when redaction is configured). -- On every new restore created by the schedule, `redaction_phase` resets to `None` (so the next restore re-runs redaction). - -## Postgres version gate and extension loading - -Redaction is **PG 18+ only**. PG 18 introduces `extension_control_path` and `dynamic_library_path` as runtime-settable GUCs, which lets us mount the extension files via a Kubernetes image volume instead of having to ship a custom Postgres image with anon pre-baked. - -Two enforcement points: - -1. **CRD-level rejection at reconcile time**: when `spec.redaction` is set and the restore's `status.postgresVersion` resolves to anything < 18 (or the cluster's discovered PG version from the snapshot is < 18), set `status.redactionPhase = "failed: redaction requires PostgreSQL 18+"` and refuse the switchover. Don't try to silently bump versions or fall back. - -2. 
**Extension availability**: when redaction is configured, the operator mounts the postgresql_anonymizer extension files into the restore Pod via a Kubernetes [image volume](https://kubernetes.io/docs/concepts/storage/volumes/#image) (Kubernetes 1.34 is in use, so the feature is GA-available). The restore Pod builder: - - - Adds a volume `image: ` mounted at `/extensions/anon`. Default image: `registry.gitlab.com/dalibo/postgresql_anon:latest`. - - Appends to the generated postgresql.conf: - ``` - extension_control_path = '$system:/extensions/anon/share' - dynamic_library_path = '$libdir:/extensions/anon/lib' - ``` - These GUCs were introduced in PG 18, which is why redaction is gated to PG 18+. - -The redaction reconciler then runs `CREATE EXTENSION anon CASCADE` once Postgres is up; the extension files are already on disk thanks to the volume mount. - -## Tests - -- Unit tests in `redaction.rs`: - - `parse_manifest` round-trip: short-form (`"name"`) and extended-form (`{"kind":"integer","range":"20-50"}`) both normalised to the same `ColumnMask`; table-level `truncate` recognised; missing-schema/missing-name source skipped; unknown kind preserved verbatim. - - `base_version` fallback derivation (mirror bestool's `get_base_version` cases). - - `range` parsing: last-dash split handles `1.001-1.03`, integer rounding for `0-10.5`, parse failures fall back to unbounded. - - Fragment building for each canonical kind, including the type-dispatched ones (`zero` / `empty` / `default` against fixture `data_type` and `column_default` lookups), the `name` space-detection CASE, and the null-preserving CASE wrappers. -- Integration test under `tests/` — **deferred to a follow-up**. The existing kopia-repo fixture snapshots PG 16; redaction requires PG 18+. 
Landing the test means adding a `setup-kopia-repo-pg18.yaml` fixture, a sample-manifest HTTP server fixture, pre-pulling `postgres:18` and the anon extension image onto the kind node, and a new matrix entry in `.github/workflows/integration.yml`. Each step is straightforward but the combined surface is large enough to be its own change. - -## Files to touch - -| File | Change | -|---|---| -| `src/types/replica.rs` | Add `redaction` to spec (`RedactionSpec` struct); `redaction_phase`/`redaction_version`/`redaction_columns_applied` to status. | -| `src/controllers/replica/redaction.rs` (new) | Whole module: manifest fetch/parse, mask-instruction parsing + registry, SQL application, PG-18 version check. | -| `src/controllers/replica.rs` | Wire `reconcile_redaction` into the post-Ready, pre-switchover branch; gate switchover on redaction phase; reset phase on new restore. PG-version gate. | -| `src/controllers/replica/schema_migration.rs` | Make `reconcile_schema_migration` wait for redaction-complete when redaction is configured. | -| `src/controllers/restore/builders.rs` | When `spec.redaction.is_some()`, inject the postgresql_anonymizer image volume into the restore Pod and append `extension_control_path` / `dynamic_library_path` to postgresql.conf. Refuse to build the restore Pod with PG < 18 when redaction is set. | -| `src/controllers/postgres.rs` | No change — `connect_to_restore`, `discover_restore_database`, `quote_ident` already cover what we need. | -| `src/context.rs` | No change. | -| `Cargo.toml` | No new deps (uses existing `tokio-postgres`, `reqwest`, `serde_json`). | -| `README.md` | Update CRD tables only (AGENTS.md explicit exception). | -| `.github/workflows/integration.yml` | Add matrix entry for the new integration test file. | - -## Verification - -- `cargo clippy` and `cargo fmt` clean per AGENTS.md. -- Unit tests pass. 
-- End-to-end against a real cluster: create a `PostgresPhysicalReplica` with the Tamanu `redaction:` example above, observe the new restore reaches `Ready`, then `redactionPhase` transitions `active` → `complete`, then schema migration runs, then switchover. Connect as the analytics user and confirm: - - The flagged columns return masked values. - - `SELECT setting FROM pg_settings WHERE name = 'default_transaction_read_only'` returns `on` for a fresh analytics session. - - An attempted `INSERT` as the analytics user is rejected. - -## Open items / follow-ups - -- **Default extension image tag** — the plan defaults `extensionImage` to `registry.gitlab.com/dalibo/postgresql_anon:latest`. `:latest` is brittle; users who want stability should pin a tag in their spec. Worth revisiting the default to a pinned digest once we know which dalibo build works against the Tamanu manifest in practice. -- **New canonical kinds** — the Tamanu masking spec is intentionally open-ended. If a future manifest introduces a new kind, redaction will report it as a tolerated error and complete as `partial`; adding support is a code change. -- **anon function names** — the SQL fragments above use plausible-but-not-verified names from postgresql_anonymizer. During implementation, validate each against the dalibo docs (or `\df anon.*` in an installed instance) and adjust. Particular ones to double-check: `random_in_int4range` vs `random_int_between`, whether `random_string` accepts a dynamic length, whether `lorem_ipsum` has a `characters :=` parameter, whether `random_date_between` exists. 
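The `complete` versus `partial` outcome that the plan relies on (for example when a manifest carries an unknown kind) could be derived from the error count alone. The following is a rough sketch, assuming integer counters on `RedactionOutcome`; `settled_phase` and `redaction_settled` are hypothetical helper names, not functions from the operator.

```rust
/// Mirrors the plan's `RedactionOutcome`; the field types are assumed.
struct RedactionOutcome {
    version: Option<String>,
    columns_attempted: u32,
    columns_failed: u32,
}

/// Hypothetical helper: any tolerated per-statement error downgrades the
/// phase from "complete" to "partial".
fn settled_phase(outcome: &RedactionOutcome) -> &'static str {
    if outcome.columns_failed == 0 {
        "complete"
    } else {
        "partial"
    }
}

/// Hypothetical helper: switchover and schema migration would wait until
/// this returns true. A missing, "active", or "failed: ..." phase is not
/// settled, so the reconciler keeps retrying.
fn redaction_settled(phase: Option<&str>) -> bool {
    matches!(phase, Some("complete") | Some("partial"))
}
```

Keeping the gate as a plain predicate on the phase string means a `failed: ...` message never unblocks switchover by accident, whatever text follows the prefix.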
From d6c00d7166ab3c36fcd6ec93381501198a146d34 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Wed, 13 May 2026 23:12:29 +1200 Subject: [PATCH 09/11] refactor(redaction): stage anon onto restore PVC instead of an image volume Replaces the image-volume mechanism with an install-anon init container that apt-installs postgresql_anonymizer_$N from Dalibo Labs and stages the extension files under /pgdata/extensions/anon on the restore PVC. Why: the published dalibo image is built against PG 16, and shipping a pre-built PG-18 image (or one per PG major) is operationally heavy. apt install completes in ~30s and runs once per restore, which is negligible against a 20-minute restore. The install is idempotent on pod restarts. Drops the extensionImage spec field, DEFAULT_ANON_IMAGE constant, and the Dockerfile.anon-pg18 / Docker build step from CI. Restore PG version gate stays at 18+ because extension_control_path is a PG 18 GUC; older PG would need files overlaid into the system extension dirs at pod start, which is a separate change. 
--- .github/workflows/integration.yml | 9 - README.md | 5 +- src/controllers/replica/redaction.rs | 1 - src/controllers/restore/builders.rs | 274 ++++++++++++++++----------- src/controllers/restore/tests.rs | 42 ++-- src/types/replica.rs | 9 - tests/fixtures/Dockerfile.anon-pg18 | 34 ---- tests/redaction.rs | 6 - 8 files changed, 188 insertions(+), 192 deletions(-) delete mode 100644 tests/fixtures/Dockerfile.anon-pg18 diff --git a/.github/workflows/integration.yml b/.github/workflows/integration.yml index fe34f88..c515331 100644 --- a/.github/workflows/integration.yml +++ b/.github/workflows/integration.yml @@ -175,15 +175,6 @@ jobs: load_image nginx:alpine fi - - name: Build PG-18 anon extension image - if: matrix.needs_pg18_snapshot - run: | - docker build \ - -t test-anon-pg18:integ \ - -f tests/fixtures/Dockerfile.anon-pg18 \ - tests/fixtures/ - kind load docker-image test-anon-pg18:integ - - name: Deploy MinIO run: | kubectl apply -f tests/fixtures/minio.yaml diff --git a/README.md b/README.md index da1524a..0e2c96a 100644 --- a/README.md +++ b/README.md @@ -126,7 +126,6 @@ The manifest follows the [Tamanu masking spec](https://github.com/beyondessentia | `version` | `string` | No | — | Pinned version substituted into `{version}`. Mutually exclusive with `versionQuery`. | | `versionQuery` | `string` | No | — | SQL query that returns one row, one text column with the version string. Run as the operator's superuser against the restore. Mutually exclusive with `version`. | | `versionFallbackToBase` | `bool` | No | `false` | If the manifest URL with the discovered/pinned version 404s, retry with the `major.minor.0` base version. | -| `extensionImage` | `string` | No | `registry.gitlab.com/dalibo/postgresql_anonymizer:stable` | OCI image mounted as an image volume on the restore Pod to source the `anon` extension files. 
| Example (Tamanu): @@ -140,8 +139,8 @@ spec: Notes: -- Requires PostgreSQL 18+ on the restore (uses the runtime-settable `extension_control_path` and `dynamic_library_path` GUCs to load the extension from the mounted image). -- The default `extensionImage` tag (`:stable` on Dalibo's registry) is currently built against PG 16; until Dalibo publishes a PG-18 build, set `extensionImage` to an image with `anon.control` and `anon.so` at the standard Debian PG-18 paths (`/usr/share/postgresql/18/extension/` and `/usr/lib/postgresql/18/lib/`). See `tests/fixtures/Dockerfile.anon-pg18` for a minimal recipe. +- Requires PostgreSQL 18+ on the restore (uses the runtime-settable `extension_control_path` and `dynamic_library_path` GUCs). +- An `install-anon` init container apt-installs `postgresql_anonymizer_$N` from [Dalibo Labs](https://apt.dalibo.org/labs/) and stages the extension files onto the restore PVC under `/pgdata/extensions/anon/`. This adds roughly 30 seconds to each restore, and avoids needing a pre-built per-PG-version extension image. The install is idempotent across pod restarts (skipped if the files are already staged). - During redaction the database is writable; once anonymisation completes, the operator sets `default_transaction_read_only = on` at the database level and demotes the analytics user back to non-superuser when `spec.readOnly` is true. 
#### SnapshotFilter diff --git a/src/controllers/replica/redaction.rs b/src/controllers/replica/redaction.rs index 1f626c6..952ab16 100644 --- a/src/controllers/replica/redaction.rs +++ b/src/controllers/replica/redaction.rs @@ -335,7 +335,6 @@ mod tests { version: ver.map(str::to_string), version_query: vq.map(str::to_string), version_fallback_to_base: false, - extension_image: None, } } diff --git a/src/controllers/restore/builders.rs b/src/controllers/restore/builders.rs index abef937..ec09f2c 100644 --- a/src/controllers/restore/builders.rs +++ b/src/controllers/restore/builders.rs @@ -705,18 +705,61 @@ cp -a /usr/lib/locale/* /locale-data/ extra_config.push('\n'); } if replica.spec.redaction.is_some() { - // PG 18+ uses these path GUCs to load extensions whose files live - // outside the system extension directories. The dalibo - // postgresql_anonymizer image is a full Debian-based Postgres - // image whose filesystem is mounted at /extensions/anon by the - // Pod builder below — so we point the GUCs at the Debian - // extension layout inside that mount. - let pg_major: i32 = pg_version.parse().unwrap_or(18); - extra_config.push_str(&format!( - "extension_control_path = '$system:/extensions/anon/usr/share/postgresql/{pg_major}/extension'\n\ - dynamic_library_path = '$libdir:/extensions/anon/usr/lib/postgresql/{pg_major}/lib'\n", - )); + // The install-anon init container apt-installs + // postgresql_anonymizer_$N from Dalibo's Labs repo and stages the + // extension files under /pgdata/extensions/anon on the restore + // PVC. PG 18+'s runtime-settable path GUCs then point postgres at + // them. PG <18 isn't supported because those GUCs were + // introduced in 18 — older versions would need the files + // overlaid onto /usr/share/postgresql/$N/extension and + // /usr/lib/postgresql/$N/lib at pod-start time, which is a + // separate, larger change. 
+ extra_config.push_str( + "extension_control_path = '$system:/pgdata/extensions/anon/share/extension'\n\ + dynamic_library_path = '$libdir:/pgdata/extensions/anon/lib'\n", + ); } + + let install_anon_script = format!( + r#"set -ex +PG_MAJOR={pg_version} +DEST=/pgdata/extensions/anon + +if [ -f "$DEST/lib/anon.so" ] && [ -f "$DEST/share/extension/anon.control" ]; then + echo "anon already staged at $DEST, skipping install" + exit 0 +fi + +echo "Installing postgresql_anonymizer_${{PG_MAJOR}} from Dalibo Labs..." +export DEBIAN_FRONTEND=noninteractive + +# Bring in Dalibo Labs repo. PGDG (which the postgres:N image is already +# configured against) supplies the postgresql-server-dev-$N runtime +# dependency. +apt-get update +apt-get install -y --no-install-recommends curl ca-certificates gnupg lsb-release + +curl -fsSL https://apt.dalibo.org/labs/debian-dalibo.gpg \ + -o /etc/apt/trusted.gpg.d/dalibo-labs.gpg +echo "deb http://apt.dalibo.org/labs $(lsb_release -cs)-dalibo main" \ + > /etc/apt/sources.list.d/dalibo-labs.list + +apt-get update +apt-get install -y --no-install-recommends "postgresql_anonymizer_${{PG_MAJOR}}" + +echo "Staging extension files to $DEST..." +mkdir -p "$DEST/share/extension" "$DEST/lib" +cp -a "/usr/share/postgresql/${{PG_MAJOR}}/extension/anon"* "$DEST/share/extension/" +cp -a "/usr/lib/postgresql/${{PG_MAJOR}}/lib/anon.so" "$DEST/lib/" + +# Postgres runs as UID 999 and only needs read access. 
+chown -R 999:999 "$DEST" +chmod -R a+rX "$DEST" + +ls -la "$DEST/share/extension" "$DEST/lib" +echo "anon staged" +"# + ); let extra_config_block = if extra_config.is_empty() { String::new() } else { @@ -1013,33 +1056,34 @@ echo "Auth setup complete" fs_group: Some(999), ..Default::default() }), - init_containers: Some(vec![ - Container { - name: "fix-locale".to_string(), - image: Some(pg_image.clone()), - command: Some(vec!["/bin/sh".to_string(), "-c".to_string()]), - args: Some(vec![locale_script]), - security_context: Some( - k8s_openapi::api::core::v1::SecurityContext { - run_as_user: Some(0), - run_as_group: Some(0), + init_containers: Some({ + let mut inits = vec![ + Container { + name: "fix-locale".to_string(), + image: Some(pg_image.clone()), + command: Some(vec!["/bin/sh".to_string(), "-c".to_string()]), + args: Some(vec![locale_script]), + security_context: Some( + k8s_openapi::api::core::v1::SecurityContext { + run_as_user: Some(0), + run_as_group: Some(0), + ..Default::default() + }, + ), + volume_mounts: Some(vec![VolumeMount { + name: "locale-data".to_string(), + mount_path: "/locale-data".to_string(), ..Default::default() - }, - ), - volume_mounts: Some(vec![VolumeMount { - name: "locale-data".to_string(), - mount_path: "/locale-data".to_string(), - ..Default::default() - }]), - resources: Some(ResourceRequirements { - requests: Some(BTreeMap::from([ - ("cpu".to_string(), Quantity("50m".to_string())), - ("memory".to_string(), Quantity("64Mi".to_string())), - ])), + }]), + resources: Some(ResourceRequirements { + requests: Some(BTreeMap::from([ + ("cpu".to_string(), Quantity("50m".to_string())), + ("memory".to_string(), Quantity("64Mi".to_string())), + ])), + ..Default::default() + }), ..Default::default() - }), - ..Default::default() - }, + }, Container { name: "setup-auth".to_string(), image: Some(pg_image.clone()), @@ -1071,7 +1115,41 @@ echo "Auth setup complete" }), ..Default::default() }, - ]), + ]; + if replica.spec.redaction.is_some() { + 
inits.push(Container { + name: "install-anon".to_string(), + image: Some(pg_image.clone()), + command: Some(vec!["/bin/sh".to_string(), "-c".to_string()]), + args: Some(vec![install_anon_script]), + security_context: Some( + k8s_openapi::api::core::v1::SecurityContext { + run_as_user: Some(0), + run_as_group: Some(0), + ..Default::default() + }, + ), + volume_mounts: Some(vec![VolumeMount { + name: "pgdata".to_string(), + mount_path: "/pgdata".to_string(), + ..Default::default() + }]), + resources: Some(ResourceRequirements { + requests: Some(BTreeMap::from([ + ("cpu".to_string(), Quantity("100m".to_string())), + ("memory".to_string(), Quantity("128Mi".to_string())), + ])), + limits: Some(BTreeMap::from([ + ("cpu".to_string(), Quantity("1".to_string())), + ("memory".to_string(), Quantity("512Mi".to_string())), + ])), + ..Default::default() + }), + ..Default::default() + }); + } + inits + }), containers: vec![Container { name: "postgres".to_string(), image: Some(pg_image), @@ -1128,34 +1206,23 @@ exec postgres -D /pgdata/pgdata ${PGRO_LOG_LEVEL:+-c log_min_messages=$PGRO_LOG_ protocol: Some("TCP".to_string()), ..Default::default() }]), - volume_mounts: Some({ - let mut mounts = vec![ - VolumeMount { - name: "pgdata".to_string(), - mount_path: "/pgdata".to_string(), - ..Default::default() - }, - VolumeMount { - name: "locale-data".to_string(), - mount_path: "/usr/lib/locale".to_string(), - ..Default::default() - }, - VolumeMount { - name: "dshm".to_string(), - mount_path: "/dev/shm".to_string(), - ..Default::default() - }, - ]; - if replica.spec.redaction.is_some() { - mounts.push(VolumeMount { - name: "anon-extension".to_string(), - mount_path: "/extensions/anon".to_string(), - read_only: Some(true), - ..Default::default() - }); - } - mounts - }), + volume_mounts: Some(vec![ + VolumeMount { + name: "pgdata".to_string(), + mount_path: "/pgdata".to_string(), + ..Default::default() + }, + VolumeMount { + name: "locale-data".to_string(), + mount_path: 
"/usr/lib/locale".to_string(), + ..Default::default() + }, + VolumeMount { + name: "dshm".to_string(), + mount_path: "/dev/shm".to_string(), + ..Default::default() + }, + ]), readiness_probe: Some(Probe { exec: Some(ExecAction { command: Some(vec![ @@ -1189,54 +1256,35 @@ exec postgres -D /pgdata/pgdata ${PGRO_LOG_LEVEL:+-c log_min_messages=$PGRO_LOG_ resources: replica.spec.resources.clone(), ..Default::default() }], - volumes: Some({ - let mut volumes = vec![ - Volume { - name: "pgdata".to_string(), - persistent_volume_claim: Some( - k8s_openapi::api::core::v1::PersistentVolumeClaimVolumeSource { - claim_name: pvc_name, - read_only: Some(false), - }, - ), - ..Default::default() - }, - Volume { - name: "locale-data".to_string(), - empty_dir: Some( - k8s_openapi::api::core::v1::EmptyDirVolumeSource::default(), - ), - ..Default::default() - }, - Volume { - name: "dshm".to_string(), - empty_dir: Some( - k8s_openapi::api::core::v1::EmptyDirVolumeSource { - medium: Some("Memory".to_string()), - size_limit: Some(shm_size), - }, - ), - ..Default::default() - }, - ]; - if let Some(ref redaction) = replica.spec.redaction { - let image = redaction - .extension_image - .clone() - .unwrap_or_else(|| crate::types::DEFAULT_ANON_IMAGE.to_string()); - volumes.push(Volume { - name: "anon-extension".to_string(), - image: Some( - k8s_openapi::api::core::v1::ImageVolumeSource { - reference: Some(image), - pull_policy: None, - }, - ), - ..Default::default() - }); - } - volumes - }), + volumes: Some(vec![ + Volume { + name: "pgdata".to_string(), + persistent_volume_claim: Some( + k8s_openapi::api::core::v1::PersistentVolumeClaimVolumeSource { + claim_name: pvc_name, + read_only: Some(false), + }, + ), + ..Default::default() + }, + Volume { + name: "locale-data".to_string(), + empty_dir: Some( + k8s_openapi::api::core::v1::EmptyDirVolumeSource::default(), + ), + ..Default::default() + }, + Volume { + name: "dshm".to_string(), + empty_dir: Some( + 
k8s_openapi::api::core::v1::EmptyDirVolumeSource { + medium: Some("Memory".to_string()), + size_limit: Some(shm_size), + }, + ), + ..Default::default() + }, + ]), affinity: replica.spec.affinity.clone(), tolerations: Some(replica.spec.tolerations.clone()), ..Default::default() diff --git a/src/controllers/restore/tests.rs b/src/controllers/restore/tests.rs index c5e99be..0d6f10e 100644 --- a/src/controllers/restore/tests.rs +++ b/src/controllers/restore/tests.rs @@ -674,7 +674,7 @@ fn deployment_without_redaction_has_no_anon_volume() { } #[test] -fn deployment_with_redaction_mounts_anon_image_volume() { +fn deployment_with_redaction_adds_install_anon_init_container() { let (mut restore, mut replica) = test_restore_and_replica(); restore.status = Some(PostgresPhysicalRestoreStatus { postgres_version: Some("18".to_string()), @@ -685,7 +685,6 @@ fn deployment_with_redaction_mounts_anon_image_volume() { version: None, version_query: None, version_fallback_to_base: false, - extension_image: None, }); let deploy = build_deployment(&restore, "test-restore", "default", &replica).unwrap(); @@ -698,25 +697,32 @@ fn deployment_with_redaction_mounts_anon_image_volume() { .as_ref() .unwrap(); - let anon_volume = pod - .volumes + let install_anon = pod + .init_containers .as_ref() .unwrap() .iter() - .find(|v| v.name == "anon-extension") - .expect("anon-extension volume must be present"); - let image = anon_volume.image.as_ref().expect("must be an image volume"); - assert_eq!(image.reference.as_deref(), Some(DEFAULT_ANON_IMAGE)); + .find(|c| c.name == "install-anon") + .expect("install-anon init container must be present"); + let script = &install_anon.args.as_ref().unwrap()[0]; + assert!( + script.contains("PG_MAJOR=18"), + "install script must pin PG_MAJOR to the restore's PG version, got: {script}" + ); + assert!( + script.contains("postgresql_anonymizer_${PG_MAJOR}"), + "install script must apt-install the PG-major-specific anon package, got: {script}" + ); + assert!( + 
script.contains("/pgdata/extensions/anon"), + "install script must stage files under the redaction destination, got: {script}" + ); let postgres = &pod.containers[0]; + let postgres_mounts = postgres.volume_mounts.as_ref().unwrap(); assert!( - postgres - .volume_mounts - .as_ref() - .unwrap() - .iter() - .any(|m| m.name == "anon-extension" && m.mount_path == "/extensions/anon"), - "postgres container must mount anon-extension at /extensions/anon" + postgres_mounts.iter().any(|m| m.name == "pgdata"), + "postgres container must mount pgdata" ); let setup_auth = deploy_init_setup_auth_script(&deploy); @@ -724,6 +730,10 @@ fn deployment_with_redaction_mounts_anon_image_volume() { setup_auth.contains("extension_control_path"), "init script must append extension_control_path GUC" ); + assert!( + setup_auth.contains("/pgdata/extensions/anon/share/extension"), + "extension_control_path must point at the PVC staging directory" + ); } #[test] @@ -738,7 +748,6 @@ fn deployment_with_redaction_rejects_pg17() { version: None, version_query: None, version_fallback_to_base: false, - extension_image: None, }); let err = build_deployment(&restore, "test-restore", "default", &replica).unwrap_err(); @@ -762,7 +771,6 @@ fn deployment_with_redaction_forces_writable() { version: None, version_query: None, version_fallback_to_base: false, - extension_image: None, }); let deploy = build_deployment(&restore, "test-restore", "default", &replica).unwrap(); diff --git a/src/types/replica.rs b/src/types/replica.rs index 0f2e644..6fc3869 100644 --- a/src/types/replica.rs +++ b/src/types/replica.rs @@ -136,17 +136,8 @@ pub struct RedactionSpec { /// with the major.minor.0 base version. #[serde(default)] pub version_fallback_to_base: bool, - - /// Override the OCI image used as the source of the - /// postgresql_anonymizer extension files (mounted as an image volume - /// on the restore Pod). Defaults to - /// `registry.gitlab.com/dalibo/postgresql_anonymizer:stable`. 
- #[serde(default, skip_serializing_if = "Option::is_none")] - pub extension_image: Option, } -pub const DEFAULT_ANON_IMAGE: &str = "registry.gitlab.com/dalibo/postgresql_anonymizer:stable"; - fn default_storage_size_maximum() -> Quantity { Quantity("2Ti".to_string()) } diff --git a/tests/fixtures/Dockerfile.anon-pg18 b/tests/fixtures/Dockerfile.anon-pg18 deleted file mode 100644 index 1314c52..0000000 --- a/tests/fixtures/Dockerfile.anon-pg18 +++ /dev/null @@ -1,34 +0,0 @@ -# Minimal PG-18 anon extension image for the redaction integration test. -# -# Dalibo's published :stable image is built against PG 16, so for our PG 18 -# restore we install postgresql_anonymizer_18 from Dalibo's apt repo and -# strip the result down to the extension files. The result is a tiny image -# that's mounted at /extensions/anon on the restore Pod via the image- -# volume mechanism. -FROM debian:bookworm-slim AS builder - -RUN apt-get update \ - && apt-get install -y --no-install-recommends \ - curl ca-certificates gnupg lsb-release \ - && rm -rf /var/lib/apt/lists/* - -# PGDG repo for postgresql-server-dev-18 (the anon package's hard dep) -RUN curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc \ - | gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg \ - && echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" \ - > /etc/apt/sources.list.d/pgdg.list - -# Dalibo Labs repo for postgresql_anonymizer_18 -RUN curl -fsSL https://apt.dalibo.org/labs/debian-dalibo.gpg \ - -o /etc/apt/trusted.gpg.d/dalibo-labs.gpg \ - && echo "deb http://apt.dalibo.org/labs $(lsb_release -cs)-dalibo main" \ - > /etc/apt/sources.list.d/dalibo-labs.list - -RUN apt-get update \ - && apt-get install -y --no-install-recommends postgresql_anonymizer_18 \ - && rm -rf /var/lib/apt/lists/* - -# scratch destination: only the extension files end up in the final image -FROM scratch -COPY --from=builder /usr/share/postgresql/18/extension/ /usr/share/postgresql/18/extension/ -COPY 
--from=builder /usr/lib/postgresql/18/lib/anon.so /usr/lib/postgresql/18/lib/anon.so diff --git a/tests/redaction.rs b/tests/redaction.rs index 51283b6..07d118a 100644 --- a/tests/redaction.rs +++ b/tests/redaction.rs @@ -293,12 +293,6 @@ fn build_redaction_replica(name: &str, secret_ref: &str) -> PostgresPhysicalRepl "SELECT value FROM local_system_facts WHERE key = 'currentVersion'".into(), ), version_fallback_to_base: false, - // The default registry.gitlab.com/.../postgresql_anonymizer:stable - // is built against PG 16 today; the workflow builds a tiny - // PG-18 anon image from Dalibo's apt repo (see - // tests/fixtures/Dockerfile.anon-pg18) and `kind load`s it - // under this tag. - extension_image: Some("test-anon-pg18:integ".into()), }), storage_size_maximum: Quantity("2Ti".into()), }, From f520616bc155a632acc2d2e2c6423ed7e5fe4296 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Thu, 14 May 2026 01:03:54 +1200 Subject: [PATCH 10/11] refactor(redaction): drop PG-18 gate; install anon in the postgres container prelude Moves the anon-extension install from a separate init container to the postgres container's own command wrapper. The container runs as root for the prelude (which apt-installs postgresql_anonymizer_$N from Dalibo Labs and copies anon.{control,sql,so} into the standard system extension dirs of its own writable layer), then drops to UID 999 via `gosu postgres` before exec'ing postgres. This unlocks all PG majors the operator otherwise supports: postgres finds anon at /usr/share/postgresql/$N/extension and /usr/lib/ postgresql/$N/lib (no extension_control_path GUC needed) so the PG-18 restriction goes away. 
Drops: - the install-anon init container - extension_control_path / dynamic_library_path GUCs in postgresql.conf - the PG-18 build-time gate in restore/builders.rs - the PG-18 reconcile-time gate in replica/redaction.rs Adds: - a per-container securityContext (runAsUser=0) on the postgres container when redaction is set, overriding the pod-level UID 999 - REDACTION_ENABLED=1 env var to drive the prelude - a /pgdata/.anon-cache PVC cache so the apt-install runs once per restore (not per pod restart) --- README.md | 5 +- src/controllers/replica/redaction.rs | 15 -- src/controllers/restore/builders.rs | 264 ++++++++++++--------------- src/controllers/restore/tests.rs | 131 +++++++------ 4 files changed, 192 insertions(+), 223 deletions(-) diff --git a/README.md b/README.md index 0e2c96a..d8d8dbb 100644 --- a/README.md +++ b/README.md @@ -139,8 +139,9 @@ spec: Notes: -- Requires PostgreSQL 18+ on the restore (uses the runtime-settable `extension_control_path` and `dynamic_library_path` GUCs). -- An `install-anon` init container apt-installs `postgresql_anonymizer_$N` from [Dalibo Labs](https://apt.dalibo.org/labs/) and stages the extension files onto the restore PVC under `/pgdata/extensions/anon/`. This adds roughly 30 seconds to each restore, and avoids needing a pre-built per-PG-version extension image. The install is idempotent across pod restarts (skipped if the files are already staged). +- Works on any PostgreSQL major the operator otherwise supports. There's no PG-version gate because the prelude apt-installs `postgresql_anonymizer_$N` from [Dalibo Labs](https://apt.dalibo.org/labs/) per the running restore's PG version and copies the files into the standard system extension dirs (`/usr/share/postgresql/$N/extension`, `/usr/lib/postgresql/$N/lib`). +- The download is cached on the restore PVC at `/pgdata/.anon-cache/`, so a pod restart doesn't re-fetch the package — it just re-copies the cached files into the (fresh) container writable layer. 
+- The postgres container runs as root for the prelude (to apt-install and write to system paths) then drops back to UID 999 via `gosu` before exec'ing `postgres`. `gosu` is preinstalled in the official `postgres` image. - During redaction the database is writable; once anonymisation completes, the operator sets `default_transaction_read_only = on` at the database level and demotes the analytics user back to non-superuser when `spec.readOnly` is true. #### SnapshotFilter diff --git a/src/controllers/replica/redaction.rs b/src/controllers/replica/redaction.rs index 952ab16..b77d922 100644 --- a/src/controllers/replica/redaction.rs +++ b/src/controllers/replica/redaction.rs @@ -58,21 +58,6 @@ pub async fn reconcile_redaction_step( _ => {} } - let pg_version = switching - .status - .as_ref() - .and_then(|s| s.postgres_version.as_deref()); - let major: i32 = pg_version.and_then(|v| v.parse().ok()).unwrap_or(0); - if major < 18 { - let msg = format!( - "failed: redaction requires PostgreSQL 18+, restore is PG {}", - pg_version.unwrap_or("unknown") - ); - warn!(replica = %replica_name, version = pg_version, %msg); - patch_phase_only(ctx, &replica_name, &namespace, &msg).await?; - return Ok(false); - } - if phase != Some("active") { patch_phase_only(ctx, &replica_name, &namespace, "active").await?; } diff --git a/src/controllers/restore/builders.rs b/src/controllers/restore/builders.rs index ec09f2c..43882e7 100644 --- a/src/controllers/restore/builders.rs +++ b/src/controllers/restore/builders.rs @@ -658,15 +658,6 @@ pub fn build_deployment( .cloned() .ok_or_else(|| Error::MissingField("status.postgresVersion".to_string()))?; - if replica.spec.redaction.is_some() { - let major: i32 = pg_version.parse().unwrap_or(0); - if major < 18 { - return Err(Error::Redaction(format!( - "redaction requires PostgreSQL 18+, restore is PG {pg_version}", - ))); - } - } - let pg_image = format!("postgres:{pg_version}"); let locale_script = r#"set -ex @@ -699,67 +690,13 @@ cp -a 
/usr/lib/locale/* /locale-data/ && replica.spec.redaction.is_none(); let read_only = effective_read_only.to_string(); - let mut extra_config = String::new(); - if let Some(ref extra) = replica.spec.postgres_extra_config { - extra_config.push_str(extra); - extra_config.push('\n'); - } - if replica.spec.redaction.is_some() { - // The install-anon init container apt-installs - // postgresql_anonymizer_$N from Dalibo's Labs repo and stages the - // extension files under /pgdata/extensions/anon on the restore - // PVC. PG 18+'s runtime-settable path GUCs then point postgres at - // them. PG <18 isn't supported because those GUCs were - // introduced in 18 — older versions would need the files - // overlaid onto /usr/share/postgresql/$N/extension and - // /usr/lib/postgresql/$N/lib at pod-start time, which is a - // separate, larger change. - extra_config.push_str( - "extension_control_path = '$system:/pgdata/extensions/anon/share/extension'\n\ - dynamic_library_path = '$libdir:/pgdata/extensions/anon/lib'\n", - ); - } + let extra_config = replica + .spec + .postgres_extra_config + .clone() + .map(|s| format!("{s}\n")) + .unwrap_or_default(); - let install_anon_script = format!( - r#"set -ex -PG_MAJOR={pg_version} -DEST=/pgdata/extensions/anon - -if [ -f "$DEST/lib/anon.so" ] && [ -f "$DEST/share/extension/anon.control" ]; then - echo "anon already staged at $DEST, skipping install" - exit 0 -fi - -echo "Installing postgresql_anonymizer_${{PG_MAJOR}} from Dalibo Labs..." -export DEBIAN_FRONTEND=noninteractive - -# Bring in Dalibo Labs repo. PGDG (which the postgres:N image is already -# configured against) supplies the postgresql-server-dev-$N runtime -# dependency. 
-apt-get update -apt-get install -y --no-install-recommends curl ca-certificates gnupg lsb-release - -curl -fsSL https://apt.dalibo.org/labs/debian-dalibo.gpg \ - -o /etc/apt/trusted.gpg.d/dalibo-labs.gpg -echo "deb http://apt.dalibo.org/labs $(lsb_release -cs)-dalibo main" \ - > /etc/apt/sources.list.d/dalibo-labs.list - -apt-get update -apt-get install -y --no-install-recommends "postgresql_anonymizer_${{PG_MAJOR}}" - -echo "Staging extension files to $DEST..." -mkdir -p "$DEST/share/extension" "$DEST/lib" -cp -a "/usr/share/postgresql/${{PG_MAJOR}}/extension/anon"* "$DEST/share/extension/" -cp -a "/usr/lib/postgresql/${{PG_MAJOR}}/lib/anon.so" "$DEST/lib/" - -# Postgres runs as UID 999 and only needs read access. -chown -R 999:999 "$DEST" -chmod -R a+rX "$DEST" - -ls -la "$DEST/share/extension" "$DEST/lib" -echo "anon staged" -"# - ); let extra_config_block = if extra_config.is_empty() { String::new() } else { @@ -1056,34 +993,31 @@ echo "Auth setup complete" fs_group: Some(999), ..Default::default() }), - init_containers: Some({ - let mut inits = vec![ - Container { - name: "fix-locale".to_string(), - image: Some(pg_image.clone()), - command: Some(vec!["/bin/sh".to_string(), "-c".to_string()]), - args: Some(vec![locale_script]), - security_context: Some( - k8s_openapi::api::core::v1::SecurityContext { - run_as_user: Some(0), - run_as_group: Some(0), - ..Default::default() - }, - ), - volume_mounts: Some(vec![VolumeMount { - name: "locale-data".to_string(), - mount_path: "/locale-data".to_string(), - ..Default::default() - }]), - resources: Some(ResourceRequirements { - requests: Some(BTreeMap::from([ - ("cpu".to_string(), Quantity("50m".to_string())), - ("memory".to_string(), Quantity("64Mi".to_string())), - ])), - ..Default::default() - }), + init_containers: Some(vec![ + Container { + name: "fix-locale".to_string(), + image: Some(pg_image.clone()), + command: Some(vec!["/bin/sh".to_string(), "-c".to_string()]), + args: Some(vec![locale_script]), + 
security_context: Some(k8s_openapi::api::core::v1::SecurityContext { + run_as_user: Some(0), + run_as_group: Some(0), ..Default::default() - }, + }), + volume_mounts: Some(vec![VolumeMount { + name: "locale-data".to_string(), + mount_path: "/locale-data".to_string(), + ..Default::default() + }]), + resources: Some(ResourceRequirements { + requests: Some(BTreeMap::from([ + ("cpu".to_string(), Quantity("50m".to_string())), + ("memory".to_string(), Quantity("64Mi".to_string())), + ])), + ..Default::default() + }), + ..Default::default() + }, Container { name: "setup-auth".to_string(), image: Some(pg_image.clone()), @@ -1115,46 +1049,47 @@ echo "Auth setup complete" }), ..Default::default() }, - ]; - if replica.spec.redaction.is_some() { - inits.push(Container { - name: "install-anon".to_string(), - image: Some(pg_image.clone()), - command: Some(vec!["/bin/sh".to_string(), "-c".to_string()]), - args: Some(vec![install_anon_script]), - security_context: Some( - k8s_openapi::api::core::v1::SecurityContext { - run_as_user: Some(0), - run_as_group: Some(0), - ..Default::default() - }, - ), - volume_mounts: Some(vec![VolumeMount { - name: "pgdata".to_string(), - mount_path: "/pgdata".to_string(), - ..Default::default() - }]), - resources: Some(ResourceRequirements { - requests: Some(BTreeMap::from([ - ("cpu".to_string(), Quantity("100m".to_string())), - ("memory".to_string(), Quantity("128Mi".to_string())), - ])), - limits: Some(BTreeMap::from([ - ("cpu".to_string(), Quantity("1".to_string())), - ("memory".to_string(), Quantity("512Mi".to_string())), - ])), - ..Default::default() - }), - ..Default::default() - }); - } - inits - }), + ]), containers: vec![Container { name: "postgres".to_string(), image: Some(pg_image), command: Some(vec!["/bin/sh".to_string(), "-c".to_string()]), - args: Some(vec![r#" + args: Some(vec![format!( + r#" +PG_MAJOR={pg_version} + +# When spec.redaction is configured, the operator sets REDACTION_ENABLED=1 +# so that we apt-install 
postgresql_anonymizer_$PG_MAJOR and drop the +# files into the standard system extension dirs of this container's +# (fresh) writable filesystem layer. The PVC-backed cache at +# /pgdata/.anon-cache avoids re-downloading on every pod restart. +if [ "${{REDACTION_ENABLED:-0}}" = "1" ]; then + if [ ! -f /pgdata/.anon-cache/anon.so ] || [ ! -f /pgdata/.anon-cache/anon.control ]; then + echo "Installing postgresql_anonymizer_${{PG_MAJOR}} from Dalibo Labs..." + export DEBIAN_FRONTEND=noninteractive + apt-get update + apt-get install -y --no-install-recommends curl ca-certificates gnupg lsb-release + curl -fsSL https://apt.dalibo.org/labs/debian-dalibo.gpg \ + -o /etc/apt/trusted.gpg.d/dalibo-labs.gpg + echo "deb http://apt.dalibo.org/labs $(lsb_release -cs)-dalibo main" \ + > /etc/apt/sources.list.d/dalibo-labs.list + apt-get update + apt-get install -y --no-install-recommends "postgresql_anonymizer_${{PG_MAJOR}}" + mkdir -p /pgdata/.anon-cache + cp -a "/usr/share/postgresql/${{PG_MAJOR}}/extension/anon"* /pgdata/.anon-cache/ + cp -a "/usr/lib/postgresql/${{PG_MAJOR}}/lib/anon.so" /pgdata/.anon-cache/ + chown -R 999:999 /pgdata/.anon-cache + else + echo "anon already cached on PVC, skipping install" + fi + + # Drop the files into this container's writable layer at the standard + # system paths. Cheap (<1s) and has to happen every pod start because + # the writable layer doesn't persist across restarts. 
+ cp -a /pgdata/.anon-cache/anon* "/usr/share/postgresql/${{PG_MAJOR}}/extension/" + cp -a /pgdata/.anon-cache/anon.so "/usr/lib/postgresql/${{PG_MAJOR}}/lib/" +fi + if [ -f /pgdata/needs-reindex ]; then PG_MAJOR=$(cat /pgdata/pgdata/PG_VERSION) ( @@ -1186,20 +1121,49 @@ if [ -f /pgdata/needs-reindex ]; then echo "Background reindex complete" ) & fi -exec postgres -D /pgdata/pgdata ${PGRO_LOG_LEVEL:+-c log_min_messages=$PGRO_LOG_LEVEL} -"#.to_string()]), - env: Some(vec![ - EnvVar { - name: "PGDATA".to_string(), - value: Some("/pgdata/pgdata".to_string()), - ..Default::default() - }, - EnvVar { - name: "POSTGRES_HOST_AUTH_METHOD".to_string(), - value: Some("scram-sha-256".to_string()), + +# Drop privileges to UID 999 (postgres) before launching the server. +# The pod's PodSecurityContext requests UID 999 for the entire pod, but +# this container overrides to root via runAsUser=0 above so we can do +# the apt install + cp prelude. gosu hands off cleanly without a +# trampoline shell. +exec gosu postgres postgres -D /pgdata/pgdata ${{PGRO_LOG_LEVEL:+-c log_min_messages=$PGRO_LOG_LEVEL}} +"# + )]), + env: Some({ + let mut env = vec![ + EnvVar { + name: "PGDATA".to_string(), + value: Some("/pgdata/pgdata".to_string()), + ..Default::default() + }, + EnvVar { + name: "POSTGRES_HOST_AUTH_METHOD".to_string(), + value: Some("scram-sha-256".to_string()), + ..Default::default() + }, + ]; + if replica.spec.redaction.is_some() { + env.push(EnvVar { + name: "REDACTION_ENABLED".to_string(), + value: Some("1".to_string()), + ..Default::default() + }); + } + env + }), + security_context: replica.spec.redaction.is_some().then(|| { + // When redaction is set, the postgres container's + // prelude apt-installs anon and copies files into + // /usr/{share,lib}/postgresql/$N/... — both root- + // only operations. gosu drops back to UID 999 + // before exec'ing postgres itself. 
+ k8s_openapi::api::core::v1::SecurityContext { + run_as_user: Some(0), + run_as_group: Some(0), ..Default::default() - }, - ]), + } + }), ports: Some(vec![ContainerPort { name: Some("postgres".to_string()), container_port: 5432, @@ -1276,12 +1240,10 @@ exec postgres -D /pgdata/pgdata ${PGRO_LOG_LEVEL:+-c log_min_messages=$PGRO_LOG_ }, Volume { name: "dshm".to_string(), - empty_dir: Some( - k8s_openapi::api::core::v1::EmptyDirVolumeSource { - medium: Some("Memory".to_string()), - size_limit: Some(shm_size), - }, - ), + empty_dir: Some(k8s_openapi::api::core::v1::EmptyDirVolumeSource { + medium: Some("Memory".to_string()), + size_limit: Some(shm_size), + }), ..Default::default() }, ]), diff --git a/src/controllers/restore/tests.rs b/src/controllers/restore/tests.rs index 0d6f10e..17bf443 100644 --- a/src/controllers/restore/tests.rs +++ b/src/controllers/restore/tests.rs @@ -648,33 +648,7 @@ fn deployment_shared_buffers_with_custom_resources() { } #[test] -fn deployment_without_redaction_has_no_anon_volume() { - let (mut restore, replica) = test_restore_and_replica(); - restore.status = Some(PostgresPhysicalRestoreStatus { - postgres_version: Some("18".to_string()), - ..Default::default() - }); - let deploy = build_deployment(&restore, "test-restore", "default", &replica).unwrap(); - let pod = deploy - .spec - .as_ref() - .unwrap() - .template - .spec - .as_ref() - .unwrap(); - let volume_names: Vec<&str> = pod - .volumes - .as_ref() - .unwrap() - .iter() - .map(|v| v.name.as_str()) - .collect(); - assert!(!volume_names.contains(&"anon-extension")); -} - -#[test] -fn deployment_with_redaction_adds_install_anon_init_container() { +fn deployment_with_redaction_runs_postgres_as_root_and_sets_redaction_env() { let (mut restore, mut replica) = test_restore_and_replica(); restore.status = Some(PostgresPhysicalRestoreStatus { postgres_version: Some("18".to_string()), @@ -697,50 +671,83 @@ fn deployment_with_redaction_adds_install_anon_init_container() { .as_ref() 
.unwrap(); - let install_anon = pod - .init_containers - .as_ref() - .unwrap() + let postgres = pod + .containers .iter() - .find(|c| c.name == "install-anon") - .expect("install-anon init container must be present"); - let script = &install_anon.args.as_ref().unwrap()[0]; + .find(|c| c.name == "postgres") + .expect("postgres container must be present"); + + let sec = postgres + .security_context + .as_ref() + .expect("postgres container must override securityContext when redaction is set"); + assert_eq!(sec.run_as_user, Some(0), "postgres must run as root"); + + let env = postgres.env.as_ref().unwrap(); + assert!( + env.iter() + .any(|e| e.name == "REDACTION_ENABLED" && e.value.as_deref() == Some("1")), + "REDACTION_ENABLED=1 must be set so the prelude installs anon" + ); + + let script = &postgres.args.as_ref().unwrap()[0]; assert!( script.contains("PG_MAJOR=18"), - "install script must pin PG_MAJOR to the restore's PG version, got: {script}" + "prelude must pin PG_MAJOR to the restore's PG version, got: {script}" ); assert!( script.contains("postgresql_anonymizer_${PG_MAJOR}"), - "install script must apt-install the PG-major-specific anon package, got: {script}" + "prelude must apt-install the PG-major-specific anon package" ); assert!( - script.contains("/pgdata/extensions/anon"), - "install script must stage files under the redaction destination, got: {script}" + script.contains("exec gosu postgres postgres"), + "prelude must drop privileges via gosu before exec'ing postgres" ); +} - let postgres = &pod.containers[0]; - let postgres_mounts = postgres.volume_mounts.as_ref().unwrap(); - assert!( - postgres_mounts.iter().any(|m| m.name == "pgdata"), - "postgres container must mount pgdata" - ); +#[test] +fn deployment_without_redaction_keeps_default_securitycontext() { + let (mut restore, replica) = test_restore_and_replica(); + restore.status = Some(PostgresPhysicalRestoreStatus { + postgres_version: Some("18".to_string()), + ..Default::default() + }); - let 
setup_auth = deploy_init_setup_auth_script(&deploy); + let deploy = build_deployment(&restore, "test-restore", "default", &replica).unwrap(); + let pod = deploy + .spec + .as_ref() + .unwrap() + .template + .spec + .as_ref() + .unwrap(); + let postgres = pod + .containers + .iter() + .find(|c| c.name == "postgres") + .unwrap(); assert!( - setup_auth.contains("extension_control_path"), - "init script must append extension_control_path GUC" + postgres.security_context.is_none(), + "postgres container must inherit the pod-level UID 999 when redaction is off" ); + let env = postgres.env.as_ref().unwrap(); assert!( - setup_auth.contains("/pgdata/extensions/anon/share/extension"), - "extension_control_path must point at the PVC staging directory" + !env.iter().any(|e| e.name == "REDACTION_ENABLED"), + "REDACTION_ENABLED must not be set when redaction is off" ); } #[test] -fn deployment_with_redaction_rejects_pg17() { +fn deployment_with_redaction_builds_for_pg16() { + // Redaction used to be gated to PG 18+ when we relied on the + // extension_control_path GUC. Now the postgres container's prelude + // drops the files into /usr/share/postgresql/$N/extension and + // /usr/lib/postgresql/$N/lib of its own writable layer, so any PG + // major works. 
let (mut restore, mut replica) = test_restore_and_replica(); restore.status = Some(PostgresPhysicalRestoreStatus { - postgres_version: Some("17".to_string()), + postgres_version: Some("16".to_string()), ..Default::default() }); replica.spec.redaction = Some(RedactionSpec { @@ -750,11 +757,25 @@ fn deployment_with_redaction_rejects_pg17() { version_fallback_to_base: false, }); - let err = build_deployment(&restore, "test-restore", "default", &replica).unwrap_err(); - let msg = format!("{err}"); + let deploy = build_deployment(&restore, "test-restore", "default", &replica) + .expect("redaction should build on PG 16"); + let pod = deploy + .spec + .as_ref() + .unwrap() + .template + .spec + .as_ref() + .unwrap(); + let postgres = pod + .containers + .iter() + .find(|c| c.name == "postgres") + .unwrap(); + let script = &postgres.args.as_ref().unwrap()[0]; assert!( - msg.contains("PostgreSQL 18"), - "error should mention PG 18+ requirement, got: {msg}" + script.contains("PG_MAJOR=16"), + "prelude must use the restore's PG major (16), got: {script}" ); } From cf9c5024b7155803e6744d120ff33f83adf1416c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Saparelli?= Date: Thu, 14 May 2026 03:18:24 +1200 Subject: [PATCH 11/11] fix(redaction): compose full names from fake_first_name + fake_last_name MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit anon doesn't ship a fake_name() function — only fake_first_name() and fake_last_name() per the dalibo docs at https://postgresql-anonymizer.readthedocs.io/en/stable/masking_functions/ The name-mask CASE composed the with-space branch as fake_name(), which would have failed at SECURITY LABEL time with a tolerated error. Compose first || ' ' || last for the with-space case; single names still use fake_first_name(). Verified the rest of the registry against the docs at the same time; no other functions needed adjustment. 
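For illustration only, a minimal sketch of the SQL the name mask is expected to render after this fix. `null_preserving` is a hypothetical stand-in for the crate's `null_pres` helper (its real body isn't part of this patch), `name_mask_sql` and the `display_name` column are made up for the example, and the column name is assumed to be already safely quoted by the caller:

```rust
/// Hypothetical stand-in for the crate's null-preserving wrapper:
/// NULL inputs stay NULL, everything else goes through `expr`.
fn null_preserving(col: &str, expr: &str) -> String {
    format!("CASE WHEN {col} IS NULL THEN NULL ELSE {expr} END")
}

/// Builds the name-mask expression: values containing a space get a
/// composed "first last" fake name, single-word values get a first name.
/// Assumes `col` is an already-quoted, trusted identifier.
fn name_mask_sql(col: &str) -> String {
    let expr = format!(
        "CASE WHEN {col} LIKE '% %' \
         THEN anon.fake_first_name() || ' ' || anon.fake_last_name() \
         ELSE anon.fake_first_name() END"
    );
    null_preserving(col, &expr)
}

fn main() {
    // Print the rendered fragment for a made-up column name.
    println!("{}", name_mask_sql("display_name"));
}
```

Only `anon.fake_first_name()` and `anon.fake_last_name()` are used here, matching the functions named above; the outer NULL guard is an assumption about how the wrapper behaves, not a copy of it.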
--- src/controllers/replica/redaction/mask.rs | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/src/controllers/replica/redaction/mask.rs b/src/controllers/replica/redaction/mask.rs index 13cc536..8644c99 100644 --- a/src/controllers/replica/redaction/mask.rs +++ b/src/controllers/replica/redaction/mask.rs @@ -95,7 +95,9 @@ pub fn fragment_for(mask: &ColumnMask, info: Option<&ColumnInfo>) -> Result Ok(Fragment::Function(null_pres( &col, format!( - "CASE WHEN {col} LIKE '% %' THEN anon.fake_name() ELSE anon.fake_first_name() END" + "CASE WHEN {col} LIKE '% %' \ + THEN anon.fake_first_name() || ' ' || anon.fake_last_name() \ + ELSE anon.fake_first_name() END" ), ))), @@ -271,8 +273,10 @@ mod tests { let f = fragment_for(&cm("name", None), None).unwrap(); let rendered = f.render(); assert!(rendered.contains("LIKE '% %'")); - assert!(rendered.contains("fake_name()")); - assert!(rendered.contains("fake_first_name()")); + // anon doesn't ship fake_name(); compose first + last for the + // with-space branch. + assert!(rendered.contains("fake_first_name() || ' ' || anon.fake_last_name()")); + assert!(rendered.contains("ELSE anon.fake_first_name()")); } #[test]