Skip to content

feat: emit user pg_hba/pg_ident entries#400

Open
moizpgedge wants to merge 1 commit into
mainfrom
feat/PLAT-628/Generator-wiring-both-orchestrators-password_encryption-IPv6-reject
Open

feat: emit user pg_hba/pg_ident entries#400
moizpgedge wants to merge 1 commit into
mainfrom
feat/PLAT-628/Generator-wiring-both-orchestrators-password_encryption-IPv6-reject

Conversation

@moizpgedge
Copy link
Copy Markdown
Contributor

@moizpgedge moizpgedge commented Jun 4, 2026

PR: feat: emit user pg_hba/pg_ident entries

Wire pg_hba_conf and pg_ident_conf from the instance spec into both the Swarm and common/systemd Patroni config generators, so the entries now reach the generated pg_hba.conf / pg_ident.conf. The previous change only accepted, validated, and stored them.

  • User entries form a zone after the CP's system-user and bridge-isolation rules and before the catch-all, so they cannot affect control-plane-internal connectivity. Node-level entries are already prepended ahead of database-level entries by NodeInstances().
  • The Block 3 catch-all auth method now follows password_encryption (defaults to md5 when unset), so user passwords and the fallback stay in the same auth landscape.
  • Swarm: add the IPv6 (::/0) system-user reject the common path already had, so a permissive user rule can't reach a system user over IPv6 now that the user zone sits below the reject.
  • Populate PgIdent (previously always nil); it is purely user-supplied.
  • Reload is unchanged (SIGHUP); the generator just emits the entries.

Adds golden tests for both generators to guard against divergence.

PLAT-628


Summary

Makes the pg_hba_conf / pg_ident_conf database-spec fields take effect: both Patroni config generators (Swarm and common/systemd) now emit the user entries into the generated pg_hba.conf / pg_ident.conf, the catch-all auth method follows password_encryption, and the Swarm path gains the IPv6 system-user reject. The prior PR added the spec contract, parser, and validation only.

Changes

  • Swarm generator (server/internal/orchestrator/swarm/patroni_config.go): insert the user zone between the bridge-subnet reject and the catch-all; add the ::/0 system-user reject; the catch-all auth method follows password_encryption (default md5); populate PgIdent from the spec. Final order: system reject (v4+v6) → gateway md5 → bridge-subnet reject → user zone → catch-all.
  • Common/systemd generator (server/internal/orchestrator/common/patroni_config_generator.go): add PgHbaConf/PgIdentConf to the generator (wired from opts.Instance, like SpecParameters); emit the user zone at the existing extraEntries slot; both v4/v6 catch-alls follow password_encryption; populate PgIdent via a pgIdent() helper.
  • Tests: golden tests for both generators (swarm: new patroni_config_golden_test.go + main_test.go for the -update flag; common: new user pg_hba pg_ident and scram case), covering user-zone position, IPv6 reject, scram-vs-md5 catch-all, and pg_ident.

No behavior change when the fields are empty/unset — the new struct fields use omitempty and the user-zone append is a no-op, so generated output is byte-identical to today.

Testing

# build / static checks
go build ./...
go vet ./server/internal/...
make lint                       # golangci-lint: 0 issues

# golden + unit tests (both generators; full internal suite)
go test ./server/internal/orchestrator/swarm/...   -run TestGeneratePatroniConfig
go test ./server/internal/orchestrator/common/...  -run TestPatroniConfigGenerator
go test ./server/internal/...
# regenerate goldens after an intended change: add -update

Manual verification against a local dev cluster (restish + docker exec), with password_encryption: scram-sha-256 and a cert rule using map=:

  • Create a DB with database- and node-level pg_hba_conf, a pg_ident_conf mapping, and password_encryption. Confirmed in the running container's pg_hba.conf:
    • ::/0 system-user reject present (alongside 0.0.0.0/0);
    • user entries verbatim in the zone, node-level entry first, between the bridge-subnet reject and the catch-all;
    • catch-all is scram-sha-256;
    • pg_ident.conf contains the mapping, referenced by the map=ssl_users cert rule.
  • Update an entry → the change appears in the live file, and the container is not restarted (same container id / StartedAt — SIGHUP reload).

Checklist

  • Tests added or updated (unit and/or e2e, as needed) — golden tests for both generators
  • Documentation updated (if needed) — deferred to the docs/e2e ticket
  • Issue is linked (branch name feat/PLAT-628/...)
  • Changelog entry — deferred to the final (docs/e2e) ticket when the feature is announced; happy to add here if preferred

Notes for Reviewers

  • Gateway stays md5 (design §1). The Swarm host all all <gateway>/32 md5 rule is left as md5 to match the design ("bridge gateway MD5"); only the catch-all follows password_encryption. Heads-up: under password_encryption=scram-sha-256, host-local clients hitting the gateway rule would fail md5 auth. Conformant with the merged design — flagging as a possible follow-up rather than changing it here.
  • Common plumbing. I wired PgHbaConf/PgIdentConf through the generator's opts.Instance (consistent with how SpecParameters/PostgreSQLConf already flow) rather than the []hba.Entry ExtraHbaEntries param. This emits the user lines verbatim (no lossy re-serialization through hba.Entry.String()) and lands them at the same position. ExtraHbaEntries stays available for orchestrator-internal structured entries. Easy to switch to the literal param route if preferred.
  • State. The common PatroniConfigGenerator is serialized into resource state (json:"generator"); the new fields are omitempty, so existing databases serialize identically (no migration, no spurious diff), and databases with entries carry them in state — which correctly drives the Update → SIGHUP reload.
  • Scope. Generators + golden tests only. Docs and the e2e test land in the follow-up ticket.

Wire pg_hba_conf and pg_ident_conf from the instance spec into both the
Swarm and common/systemd Patroni config generators, so the entries now
reach the generated `pg_hba.conf` / `pg_ident.conf`. The previous change
only accepted, validated, and stored them.

- User entries form a zone after the CP's system-user and
  bridge-isolation rules and before the catch-all, so they cannot affect
  control-plane-internal connectivity. Node-level entries are already
  prepended ahead of database-level entries by `NodeInstances()`.
- The Block 3 catch-all auth method now follows `password_encryption`
  (defaults to md5 when unset), so user passwords and the fallback stay
  in the same auth landscape.
- Swarm: add the IPv6 (`::/0`) system-user reject the common path
  already had, so a permissive user rule can't reach a system user over
  IPv6 now that the user zone sits below the reject.
- Populate `PgIdent` (previously always nil); it is purely user-supplied.
- Reload is unchanged (SIGHUP); the generator just emits the entries.

Adds golden tests for both generators to guard against divergence.

PLAT-628
@moizpgedge moizpgedge requested a review from jason-lynch June 4, 2026 17:39
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Both Patroni config generators now support user-provided pg_hba and pg_ident configuration. Catch-all authentication method is computed from the password_encryption parameter, defaulting to MD5. User-supplied rules are inserted before the catch-all. Test infrastructure adds golden test support with update flag, validating SCRAM and MD5 scenarios.

Changes

User-provided pg_hba and pg_ident configuration with password-aware authentication

Layer / File(s) Summary
Common package contract and core implementation
server/internal/orchestrator/common/patroni_config_generator.go
PatroniConfigGenerator struct gains exported PgHbaConf and PgIdentConf fields with JSON tags; password auth method is computed from password_encryption parameter (defaulting to MD5); pgHba() function signature updated to accept passwordAuthMethod and refactored to insert user entries before catch-all; new pgIdent() helper returns pg_ident mappings when non-empty.
Swarm package password-aware pg_hba generation
server/internal/orchestrator/swarm/patroni_config.go
generatePatroniConfig() derives passwordAuthMethod from configured password_encryption; pg_hba assembly adds IPv6 reject rule (::/0) for system users; user PgHbaConf entries are appended after system-user rules and before catch-all whose AuthMethod uses the computed value; pg_ident conditionally included.
Golden test infrastructure and TestMain
server/internal/orchestrator/swarm/main_test.go, server/internal/orchestrator/swarm/patroni_config_golden_test.go
Swarm package TestMain registers -update flag for golden test control and runs suite; golden test framework YAML-round-trips configs for normalization before equality assertion to handle number type mismatches.
Common package test case with SCRAM scenario
server/internal/orchestrator/common/patroni_config_generator_test.go, server/internal/orchestrator/common/golden_test/TestPatroniConfigGenerator/user_pg_hba_pg_ident_and_scram.yaml
Test case validates generator with password_encryption: scram-sha-256, custom PgHbaConf and PgIdentConf; golden YAML documents resulting Patroni config with SCRAM authentication rules and SSL user mapping.
Swarm golden tests with MD5 and SCRAM outputs
server/internal/orchestrator/swarm/patroni_config_golden_test.go, server/internal/orchestrator/swarm/golden_test/TestGeneratePatroniConfig/no_user_entries_defaults_to_md5.yaml, server/internal/orchestrator/swarm/golden_test/TestGeneratePatroniConfig/user_pg_hba_pg_ident_and_scram.yaml
Table-driven test validates multiple InstanceSpec scenarios; golden outputs document MD5-default behavior (no user entries) and SCRAM scenario with user pg_hba/pg_ident, including complete DCS, PostgreSQL parameters, authentication rules, REST API, and watchdog configuration.

Poem

🐰 hops through configs with glee
Rules for users, now set free!
SCRAM and MD5, both play their part,
Authentication logic, from the start!
Golden tests guard the rabbit's heart! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: emit user pg_hba/pg_ident entries' directly reflects the main change—wiring user-supplied pg_hba and pg_ident configuration entries into the generated Patroni config files.
Description check ✅ Passed The description comprehensively covers the summary, detailed changes to both generators, testing approach, and notes for reviewers, aligning well with the provided template structure.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/PLAT-628/Generator-wiring-both-orchestrators-password_encryption-IPv6-reject

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/internal/orchestrator/swarm/patroni_config.go`:
- Around line 522-528: The IPv6 catch-all is missing: when appending the IPv4
catch-all to cfg.Postgresql.PgHba using hba.Entry{Type: hba.EntryTypeHost,
Database: "all", User: "all", Address: "0.0.0.0/0", AuthMethod:
passwordAuthMethod}.String(), also append a corresponding IPv6 entry with
Address set to "::/0" (same Type, Database, User and AuthMethod) so
cfg.Postgresql.PgHba contains both IPv4 and IPv6 catch-all rules.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2c48cac0-3d74-4ada-b13a-389f29a36d6f

📥 Commits

Reviewing files that changed from the base of the PR and between 5d283c0 and 7ede628.

📒 Files selected for processing (8)
  • server/internal/orchestrator/common/golden_test/TestPatroniConfigGenerator/user_pg_hba_pg_ident_and_scram.yaml
  • server/internal/orchestrator/common/patroni_config_generator.go
  • server/internal/orchestrator/common/patroni_config_generator_test.go
  • server/internal/orchestrator/swarm/golden_test/TestGeneratePatroniConfig/no_user_entries_defaults_to_md5.yaml
  • server/internal/orchestrator/swarm/golden_test/TestGeneratePatroniConfig/user_pg_hba_pg_ident_and_scram.yaml
  • server/internal/orchestrator/swarm/main_test.go
  • server/internal/orchestrator/swarm/patroni_config.go
  • server/internal/orchestrator/swarm/patroni_config_golden_test.go

Comment on lines +522 to +528
*cfg.Postgresql.PgHba = append(*cfg.Postgresql.PgHba, hba.Entry{
Type: hba.EntryTypeHost,
Database: "all",
User: "all",
Address: "0.0.0.0/0",
AuthMethod: passwordAuthMethod,
}.String())
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing IPv6 catch-all rule for non-system users.

The common package emits both IPv4 (0.0.0.0/0) and IPv6 (::/0) catch-all rules for non-system users (see server/internal/orchestrator/common/patroni_config_generator.go lines 455-470), but this swarm implementation only appends the IPv4 catch-all. Without the IPv6 catch-all, connections from IPv6 addresses that don't match any user-supplied rule will fall through to PostgreSQL's default deny, which may be intentional but differs from the common path.

🔧 Proposed fix to add IPv6 catch-all
 	// Catch-all for non-system users; the auth method follows password_encryption.
 	*cfg.Postgresql.PgHba = append(*cfg.Postgresql.PgHba, hba.Entry{
 		Type:       hba.EntryTypeHost,
 		Database:   "all",
 		User:       "all",
 		Address:    "0.0.0.0/0",
 		AuthMethod: passwordAuthMethod,
-	}.String())
+	}.String(), hba.Entry{
+		Type:       hba.EntryTypeHost,
+		Database:   "all",
+		User:       "all",
+		Address:    "::/0",
+		AuthMethod: passwordAuthMethod,
+	}.String())
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/internal/orchestrator/swarm/patroni_config.go` around lines 522 - 528,
The IPv6 catch-all is missing: when appending the IPv4 catch-all to
cfg.Postgresql.PgHba using hba.Entry{Type: hba.EntryTypeHost, Database: "all",
User: "all", Address: "0.0.0.0/0", AuthMethod: passwordAuthMethod}.String(),
also append a corresponding IPv6 entry with Address set to "::/0" (same Type,
Database, User and AuthMethod) so cfg.Postgresql.PgHba contains both IPv4 and
IPv6 catch-all rules.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this as is for now. The swarm catch-all has always been IPv4 only since the whole swarm pg_hba is IPv4 centric (the gateway and bridge subnets are all IPv4), and the IPv6 scope for this PR was just the ::/0 system user reject

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@moizpgedge, understood — since the swarm gateway and bridge subnets are all IPv4, the catch-all only needing to cover 0.0.0.0/0 makes sense. The ::/0 system-user reject was added specifically to prevent system users from slipping through a user-supplied IPv6 rule, not to open up general IPv6 access. Thanks for the context!


✏️ Learnings added
Learnt from: moizpgedge
Repo: pgEdge/control-plane PR: 400
File: server/internal/orchestrator/swarm/patroni_config.go:522-528
Timestamp: 2026-06-07T18:06:05.879Z
Learning: In pgEdge/control-plane, the swarm Patroni config pg_hba (server/internal/orchestrator/swarm/patroni_config.go) is intentionally IPv4-centric: the gateway and bridge subnets used in the swarm stack are all IPv4, so the non-system-user catch-all rule only covers 0.0.0.0/0 (not ::/0). The ::/0 system-user reject rule was added specifically to prevent system users (pgedge, patroni_replicator) from being matched by a later permissive user-supplied IPv6 rule, not to enable general IPv6 connectivity. Do not flag the missing IPv6 catch-all for non-system users in the swarm pg_hba as a bug.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: jason-lynch
Repo: pgEdge/control-plane PR: 295
File: server/internal/orchestrator/systemd/orchestrator.go:482-489
Timestamp: 2026-04-02T15:53:45.848Z
Learning: In `server/internal/api/apiv1/validate.go`, the `validatePorts` function (around line 469) already rejects specs where the postgres port and patroni port are identical and non-zero, returning `"postgres and patroni ports must not conflict"`. It is invoked at both the database-spec level and the per-node level (after inheriting database-level port defaults into the effective node port). Therefore, `ValidateInstanceSpecs` in `server/internal/orchestrator/systemd/orchestrator.go` does not need an additional same-port check — the API validation layer enforces this upstream before the spec reaches the orchestrator.

Learnt from: rshoemaker
Repo: pgEdge/control-plane PR: 266
File: server/internal/workflows/update_database.go:122-153
Timestamp: 2026-02-09T21:47:46.904Z
Learning: In server/internal/workflows/update_database.go and related Go workflow files (e.g., provision_services.go), implement idempotent provisioning semantics: re-running ProvisionServices on already-provisioned service instances should be safe and non-destructive. Upsert operations (e.g., StoreServiceInstance) may temporarily set state to 'creating' but should recover back to 'running' via monitoring. Use SQL-level idempotence for user/service roles (e.g., CREATE ROLE IF NOT EXISTS / ALTER ROLE) and ensure deployment diffs compare desired vs. current state. These patterns support recovery when a user retries with corrected specs after a failure, and should be considered in similar Go workflow files across the repository.

Learnt from: rshoemaker
Repo: pgEdge/control-plane PR: 280
File: server/internal/orchestrator/swarm/mcp_config_resource.go:90-94
Timestamp: 2026-03-02T20:15:14.804Z
Learning: In MCP config resources (e.g., server/internal/orchestrator/swarm/mcp_config_resource.go and similar tokens.yaml/users.yaml), Refresh should only verify the existence of config.yaml. Tokens.yaml and users.yaml are owned and managed by the MCP server at runtime and should not be rewritten on Update. Do not trigger recreation that would overwrite runtime state. This ownership model is consistent with the comment indicating Do NOT touch tokens.yaml or users.yaml. Apply this guidance to all MCP config resource files in the swarm directory.

Learnt from: rshoemaker
Repo: pgEdge/control-plane PR: 280
File: server/internal/orchestrator/swarm/mcp_config.go:95-97
Timestamp: 2026-03-02T20:17:36.231Z
Learning: In the pgEdge/control-plane repository, internal (unexported) functions that are called through controlled paths by the resource framework should assume validated input from their invocation boundaries. Do not add defensive nil guards inside such internal helpers; rely on validation at the resource construction/invocation boundaries (e.g., MCPConfigResource.Create/Update) to guarantee data integrity. This guideline applies to all Go files under server/internal where internal helpers serve as implementation details behind controlled APIs.

Learnt from: rshoemaker
Repo: pgEdge/control-plane PR: 287
File: server/internal/orchestrator/swarm/service_user_role.go:59-60
Timestamp: 2026-03-11T16:05:27.239Z
Learning: In server/internal/orchestrator/swarm/service_user_role.go, treat a ResourceVersion bump (e.g., 2 -> 3) as a documentation/safety marker only. When the resource Identifier key changes (e.g., ServiceInstanceID to ServiceID), the system will automatically recreate resources due to the Type/ID key mismatch. Do not flag a ResourceVersion bump as a missing recreate trigger if an Identifier change is already causing a full recreate for that resource. This guidance is file-specific and should be applied when reviewing changes to service_user_role.go.

Learnt from: jason-lynch
Repo: pgEdge/control-plane PR: 293
File: server/internal/orchestrator/common/paths.go:79-157
Timestamp: 2026-03-24T13:24:39.229Z
Learning: In pgEdge/control-plane, when reviewing Go packages under server/ that use the samber/do injector, apply the `Provide()` requirement only to packages that expose singleton components which are consumed as dependencies by other singleton components via the injector. Do NOT require `Provide()` for pure utility/value-type packages (e.g., path helpers, data structures) that are not injected as singleton dependencies.

Learnt from: jason-lynch
Repo: pgEdge/control-plane PR: 312
File: server/internal/workflows/backend/etcd/store.go:52-58
Timestamp: 2026-03-24T19:00:21.971Z
Learning: In pgEdge/control-plane, treat errors returned from application startup/initialization functions as fatal: a startup failure should trigger the normal application shutdown path, which then calls each component’s `Shutdown`/`Stop` for cleanup. Therefore, when reviewing code in startup/initialization paths (e.g., cache/etcd initialization like `StartCaches`), do not require explicit rollback/cleanup of partially-initialized resources inside the startup function based on its error return. Avoid review suggestions that push intermediate rollback patterns for these fatal startup errors, since the existing shutdown sequence is responsible for teardown.

Learnt from: jason-lynch
Repo: pgEdge/control-plane PR: 353
File: server/internal/orchestrator/common/pgbackrest_stanza.go:26-29
Timestamp: 2026-04-20T13:57:55.700Z
Learning: In the pgEdge control-plane codebase, treat `ResourceVersion()` as a cache-busting/diff-forcing mechanism for resources. Increment `ResourceVersion()` only when making a behavioral change that requires consumers to re-fetch/re-evaluate the resource, but where no serialized fields are changed. Do not bump `ResourceVersion()` solely because a field was removed—removing serialized fields should already produce a diff on its own.

Learnt from: jason-lynch
Repo: pgEdge/control-plane PR: 353
File: server/internal/orchestrator/common/pgbackrest_stanza.go:26-29
Timestamp: 2026-04-20T13:57:55.700Z
Learning: In the pgEdge control-plane codebase, avoid using `DiffIgnore()` with JSON paths for fields that have been removed. `DiffIgnore()` suppresses diffs for the specified JSON field, which prevents the resource from being re-evaluated after a field is removed. For removed fields, do not add them to `DiffIgnore()`; instead let the diff occur naturally, relying on the idempotency of Update methods to handle the change safely.

Learnt from: jason-lynch
Repo: pgEdge/control-plane PR: 377
File: server/internal/monitor/provide.go:60-61
Timestamp: 2026-05-05T12:30:06.386Z
Learning: In pgEdge/control-plane, the storage layer (e.g., WatchOp) uses deferred execution: etcd connections are not created in constructors/initializers. When reviewing Go code, do not report “etcd resource leak” for code paths that merely construct candidate/watch/election objects (e.g., via electionSvc.NewCandidate) as long as the relevant execution method (Exec/Start or equivalent) is not called afterward. Only consider it a potential leak if the code actually starts/executes the watcher/candidate and fails to shut it down (if the project has a corresponding stop/close path).

@codacy-production
Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 5 complexity · 2 duplication

Metric Results
Complexity 5
Duplication 2

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant