Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (12)
📒 Files selected for processing (48)
📝 WalkthroughWalkthroughReplaces singular hostname/IPv4 fields with array-based address fields across APIs, config, storage, certificates, etcd, orchestrator, monitors, workflows, tests, docs, and changelog. Adds multi-server join URLs and env-var parsing for string arrays. Updates go.mod and appends third‑party license attributions in NOTICE.txt. Changes
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/internal/api/apiv1/convert.go (1)
408-415:⚠️ Potential issue | 🟠 MajorDon't emit
connection_infowithout an address.With the new schema, a non-zero port alone no longer makes the object usable. Returning
connection_infowithaddresses: null/[]will advertise a broken endpoint as if it were ready.Proposed fix
func instanceConnectionInfoToAPI(status *database.InstanceStatus) *api.InstanceConnectionInfo { - if status == nil || status.Port == nil || *status.Port == 0 { + if status == nil || status.Port == nil || *status.Port == 0 || len(status.Addresses) == 0 { return nil } return &api.InstanceConnectionInfo{ Addresses: status.Addresses, Port: status.Port,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/api/apiv1/convert.go` around lines 408 - 415, The current instanceConnectionInfoToAPI function returns an api.InstanceConnectionInfo when Port is set even if status.Addresses is nil/empty; update instanceConnectionInfoToAPI (and keep using database.InstanceStatus and api.InstanceConnectionInfo) to also validate that status.Addresses is non-nil and contains at least one address (e.g., len(status.Addresses) > 0) and return nil if there are no addresses, so we never emit connection_info without a usable address.
🧹 Nitpick comments (5)
e2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmpl (1)
8-9: Let this fixture exercise the new multi-address path.Line 8 and Line 9 still pull from singular inventory keys, so this e2e template only covers the one-address case even though these settings are now array-backed. Prefer accepting list-valued hostvars and serializing them explicitly here so the fixture also validates the new parsing/return shape.
Suggested template shape
- - PGEDGE_PEER_ADDRESSES={{ hostvars[host].peer_ip }} - - PGEDGE_CLIENT_ADDRESSES={{ hostvars[host].external_ip }} + - PGEDGE_PEER_ADDRESSES={{ (hostvars[host].peer_addresses | default([hostvars[host].peer_ip])) | join(',') }} + - PGEDGE_CLIENT_ADDRESSES={{ (hostvars[host].client_addresses | default([hostvars[host].external_ip])) | join(',') }}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@e2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmpl` around lines 8 - 9, Update the template lines that set PGEDGE_PEER_ADDRESSES and PGEDGE_CLIENT_ADDRESSES so they accept list-valued hostvars and serialize them into the expected array/string form instead of using singular keys; replace uses of hostvars[host].peer_ip and hostvars[host].external_ip with expressions that handle both a single value and a list (e.g., ensuring the value is treated as a list and joined/JSON-encoded as required by the runtime), so the fixture exercises the multi-address path for PGEDGE_PEER_ADDRESSES and PGEDGE_CLIENT_ADDRESSES.e2e/service_provisioning_test.go (1)
116-117: Check reachability, not just that the slice is non-empty.
assert.NotEmptyonly proves the API returned at least one string. The PR objective for this feature is that the advertised addresses are actually reachable from the host, so a regression to unusable/private-only entries would still pass this test. Consider dialing one returned address with the published HTTP port or checking the returnedconnection_infoendpoint directly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@e2e/service_provisioning_test.go` around lines 116 - 117, The test currently only asserts serviceInstance.Status.Addresses is non-empty; change it to verify reachability by selecting at least one returned address from serviceInstance.Status.Addresses (or the serviceInstance.Status.ConnectionInfo/connection_info endpoint if present) and perform a real network check: attempt a TCP dial to the advertised host:port or an HTTP GET to the published HTTP port/path and assert the dial/HTTP request succeeds within a short timeout. If multiple addresses are present, iterate until one reachable address is found and fail the test if none respond; keep the existing t.Logf of addresses for debugging.server/internal/orchestrator/swarm/postgres_certs.go (1)
144-146: Include::1in the implicit SAN set.The new address model supports IPv6, but this fallback only guarantees
127.0.0.1andlocalhost. Local TLS clients that use the IPv6 loopback literal will still fail verification.Proposed tweak
- combined.Add("127.0.0.1", "localhost") + combined.Add("127.0.0.1", "::1", "localhost")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/orchestrator/swarm/postgres_certs.go` around lines 144 - 146, The implicit SAN set created with ds.NewSet(c.InstanceAddresses...) and stored in combined only adds "127.0.0.1" and "localhost", so IPv6 loopback clients will fail; update the combined.Add call on the combined variable to also include the IPv6 loopback literal "::1" (i.e., call combined.Add("127.0.0.1", "localhost", "::1")) so the generated certs include the IPv6 localhost SAN.server/internal/etcd/embedded_test.go (1)
88-93: These lifecycle tests still don't hit the split-network path.Every fixture keeps
PeerAddressesandClientAddressesidentical, and the join flow only passes the peer list into host credential generation. That means the suite can stay green even if client-facing SANs/endpoints are wrong. Please add at least one case where the two lists differ, e.g. peer=127.0.0.1, client=localhost.Also applies to: 117-123, 137-140, 216-220, 232-237, 245-250, 262-265, 278-281, 372-376
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/etcd/embedded_test.go` around lines 88 - 93, Test fixtures create config.Config instances where PeerAddresses and ClientAddresses are identical, so the tests never exercise split-network behavior; update each affected fixture (examples include variables like cfgA, cfgB, or other test configs used in the join flow) to include at least one case where PeerAddresses and ClientAddresses differ (e.g. PeerAddresses: []string{"127.0.0.1"}, ClientAddresses: []string{"localhost"}) so host credential generation and client-facing SANs/endpoints are validated; search for test configs that set PeerAddresses == ClientAddresses and change at least one of those instances to use differing values across the suite.server/internal/etcd/remote_test.go (1)
27-28: Differentiate the peer/client fixtures in this test.Both configs use the same values for
PeerAddressesandClientAddresses, so a regression that swaps those code paths will still pass. Using different loopback-equivalent values here would actually exercise the split this PR introduces.Also applies to: 95-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/etcd/remote_test.go` around lines 27 - 28, The test fixtures currently set identical values for PeerAddresses and ClientAddresses which hides regressions that swap those code paths; update the two occurrences where PeerAddresses and ClientAddresses are set (fields PeerAddresses and ClientAddresses in remote_test.go) to use distinct loopback-equivalent values (for example make PeerAddresses use "127.0.0.1" and ClientAddresses use "::1" or "localhost" and "127.0.0.1") so the test exercises the split; apply the same change to both places referenced in the file (the block around the first fixture and the block around the second fixture).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/apiv1/design/cluster.go`:
- Around line 71-86: Update the "addresses" attribute schema
(g.Attribute("addresses", ...)) to require at least one entry, allow FQDNs up to
253 chars, and validate element format: add g.MinItems(1) on the array, increase
the per-element g.MaxLength to 253 (instead of 128) and add a format/pattern
check on each g.Elem to accept hostname/FQDN or IP addresses (use the DSL's
g.Format or g.Pattern inside g.Elem). Keep the attribute name and examples
unchanged so certService.PostgresServer() / etcd endpoint usage still match.
- Around line 50-56: Add element- and array-level validation to the
ClusterJoinToken attribute server_urls so malformed or empty URL lists are
rejected by Goa: inside the server_urls block use g.Elem(...) to apply per-item
constraints (e.g., g.Format("uri") or a URL regex) and add array constraints
such as MinLength(1) (and optional MaxLength) to the array definition to mirror
the validation used for addresses in ClusterJoinRequest; update the server_urls
attribute to include these g.Elem and MinLength/MaxLength calls so Goa validates
both items and list length.
In `@api/apiv1/design/host.go`:
- Around line 115-124: The peer_addresses and client_addresses array attributes
currently only use g.Required which permits empty arrays; add g.MinLength(1) to
each g.Attribute definition (the ones named "peer_addresses" and
"client_addresses" in this file, including the duplicate at lines noted in the
comment) so Goa v3 emits minItems: 1 and prevents empty address lists; update
both attribute blocks to include g.MinLength(1) inside their func bodies.
In `@docs/development/supported-services.md`:
- Around line 698-701: Appendix A.6 still shows deprecated fields
serviceInstance.Status.Hostname and serviceInstance.Status.Ipv4Address; update
that sample to use the new serviceInstance.Status.Addresses (or status.addresses
in JSON examples) instead. Replace accesses to Hostname/Ipv4Address with code
that reads the Addresses array (e.g., take the first entry or filter by type if
your model includes types) and adjust examples and text to show the addresses
array format shown earlier ("addresses":
["10.0.1.5","mcp-server-host-1.internal"]). Ensure all references in Appendix
A.6 and nearby text consistently use Status.Addresses/status.addresses and
remove mentions of the old fields.
In `@docs/installation/configuration.md`:
- Around line 12-17: Update the environment-variable examples and any table
entries that use the PGEDGE_* prefix to use the repo convention CONTROL_PLANE_
(e.g., replace PGEDGE_CLIENT_ADDRESSES with CONTROL_PLANE_CLIENT_ADDRESSES) and
ensure the docs explicitly state the configuration precedence: CLI flags >
environment variables > config file; also update references to peer_addresses
and client_addresses to show the CONTROL_PLANE_ prefixed env-var names
consistently throughout the section (lines 26-57) so users can set these from
the environment correctly.
In `@e2e/database_test.go`:
- Around line 235-239: The test currently uses only
instance.ConnectionInfo.Addresses[0] when building the DSN (creating dsn via
fmt.Sprintf), which breaks multi-address testing; update the logic that
builds/returns the connection string to iterate over
instance.ConnectionInfo.Addresses and attempt each advertised address (using
*instance.ConnectionInfo.Port and d.Spec.DatabaseName and opts.Username) until a
successful connection/health check is established, and only then return that
working DSN (or aggregate the attempts into a meaningful error if none succeed).
In `@server/internal/api/apiv1/post_init_handlers.go`:
- Around line 75-89: GetServerURLs builds host:port strings using fmt.Sprintf
which breaks for IPv6 literals; update it to use net.JoinHostPort for proper
IPv6/IPv4/hostname handling: replace fmt.Sprintf("%s:%d", a, cfg.HTTP.Port) with
net.JoinHostPort(a, strconv.Itoa(cfg.HTTP.Port)) (import net and strconv) when
constructing the url.URL Host field so cfg.PeerAddresses and cfg.HTTP.Port are
formatted correctly; keep the rest of the function (scheme selection, url.URL
creation, and appending to urls) unchanged.
In `@server/internal/config/config.go`:
- Around line 252-257: The Validate() function currently only checks slice
lengths; update it to iterate over c.PeerAddresses and c.ClientAddresses and
validate each entry is non-empty and a syntactically valid host/IP (reject blank
strings and obviously malformed hosts like "https://..."); for each invalid
entry append a descriptive error to errs (e.g., referencing the offending value
and whether it failed host/IP parsing) so that Validate() returns precise
failures for bad peer/client addresses rather than letting later
URL/endpoint/cert construction fail (reference c.PeerAddresses,
c.ClientAddresses, and Validate()).
In `@server/internal/config/sources.go`:
- Around line 41-43: The env provider was changed to use "PGEDGE_" and should
continue to use the documented "CONTROL_PLANE_" prefix; update the call to
env.ProviderWithValue so it uses "CONTROL_PLANE_" (and keep the existing key
normalization logic that trims the prefix and lowercases the key), ensuring the
environment-source logic in sources.go still exposes variables like
CONTROL_PLANE_* (refer to the env.ProviderWithValue invocation and the key
normalization lines where key is trimmed and lowercased).
- Around line 46-48: The current slice parsing in the switch case for []string
uses strings.Split(value, ",") which preserves empty entries and surrounding
whitespace; update the logic around k.Get(key) and the []string branch so you
split on ',', then for each part call strings.TrimSpace and only append
non-empty results (dropping "" and whitespace-only items) before returning the
key and the cleaned []string; ensure the updated code still returns the same
types expected by the caller.
In `@server/internal/database/instance.go`:
- Around line 79-83: When constructing InstanceStatus ensure Addresses is
initialized to an empty slice so JSON serializes as [] instead of null; update
the code that creates the InstanceStatus (the constructor/instantiation used
before early returns) to set Addresses to make([]string,0) or []string{} on the
InstanceStatus struct, or alternatively add `omitempty` to the
`json:"addresses"` tag on the Addresses field in the InstanceStatus type—use the
InstanceStatus and Addresses identifiers to locate the change.
In `@server/internal/etcd/embedded.go`:
- Around line 333-352: ClientEndpoints() and PeerEndpoints() build host:port
incorrectly for IPv6 and the config also hardcodes
ListenClientUrls/ListenPeerUrls to 0.0.0.0; update these to use net.JoinHostPort
(or set url.URL.Host using JoinHostPort) when formatting addresses so IPv6
literals are bracketed, and modify AdvertiseClientUrls, AdvertisePeerUrls,
InitialCluster, ListenClientUrls and ListenPeerUrls construction to include an
IPv6 listen address (e.g., :: or [::]) whenever any entry in
cfg.Config().PeerAddresses is an IPv6 literal; locate and change the functions
ClientEndpoints, PeerEndpoints and the code that sets AdvertiseClientUrls,
AdvertisePeerUrls, InitialCluster, ListenClientUrls, ListenPeerUrls to use
net.JoinHostPort and to add IPv6 listeners conditionally.
In `@server/internal/etcd/rbac.go`:
- Around line 55-56: The code currently passes only a single addresses slice
(opts.Addresses) into addEtcdServerCredentials and into EtcdServer, but the
server cert must cover both PeerAddresses and ClientAddresses; change both call
sites (the one around addEtcdServerCredentials and the similar block at the
later occurrence) to compute the union (and deduplicate) of opts.PeerAddresses
and opts.ClientAddresses and pass that union slice instead of opts.Addresses so
the created creds/etcd-server.crt includes all advertised listener addresses
(update references to addEtcdServerCredentials, EtcdServer, creds,
PeerAddresses, ClientAddresses accordingly).
In `@server/internal/etcd/remote.go`:
- Line 285: The server cert creation call to addEtcdServerCredentials only
passes cfg.PeerAddresses, which omits any client-only addresses from the SANs;
update the call site and/or addEtcdServerCredentials to include both peer and
client addresses (e.g., combine cfg.PeerAddresses and cfg.ClientAddresses into
the SAN list passed to addEtcdServerCredentials for HostID cfg.HostID), ensuring
duplicates are deduplicated and all client-facing URLs are included so the
promoted etcd instance's certificate covers advertised client URLs.
In `@server/internal/host/host_store.go`:
- Around line 22-23: StoredHost serialization changed field names and will break
reads from etcd; implement a custom JSON unmarshal for StoredHost (func (s
*StoredHost) UnmarshalJSON(data []byte) error) that accepts both the new keys
("peer_addresses", "client_addresses") and the legacy keys ("hostname",
"ipv4_address") and populates PeerAddresses and ClientAddresses accordingly,
preferring the new fields when present; keep existing struct field names/tags
and ensure the Unmarshal handles single string legacy values by converting them
to single-element slices so old records remain readable without a migration.
---
Outside diff comments:
In `@server/internal/api/apiv1/convert.go`:
- Around line 408-415: The current instanceConnectionInfoToAPI function returns
an api.InstanceConnectionInfo when Port is set even if status.Addresses is
nil/empty; update instanceConnectionInfoToAPI (and keep using
database.InstanceStatus and api.InstanceConnectionInfo) to also validate that
status.Addresses is non-nil and contains at least one address (e.g.,
len(status.Addresses) > 0) and return nil if there are no addresses, so we never
emit connection_info without a usable address.
---
Nitpick comments:
In `@e2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmpl`:
- Around line 8-9: Update the template lines that set PGEDGE_PEER_ADDRESSES and
PGEDGE_CLIENT_ADDRESSES so they accept list-valued hostvars and serialize them
into the expected array/string form instead of using singular keys; replace uses
of hostvars[host].peer_ip and hostvars[host].external_ip with expressions that
handle both a single value and a list (e.g., ensuring the value is treated as a
list and joined/JSON-encoded as required by the runtime), so the fixture
exercises the multi-address path for PGEDGE_PEER_ADDRESSES and
PGEDGE_CLIENT_ADDRESSES.
In `@e2e/service_provisioning_test.go`:
- Around line 116-117: The test currently only asserts
serviceInstance.Status.Addresses is non-empty; change it to verify reachability
by selecting at least one returned address from serviceInstance.Status.Addresses
(or the serviceInstance.Status.ConnectionInfo/connection_info endpoint if
present) and perform a real network check: attempt a TCP dial to the advertised
host:port or an HTTP GET to the published HTTP port/path and assert the
dial/HTTP request succeeds within a short timeout. If multiple addresses are
present, iterate until one reachable address is found and fail the test if none
respond; keep the existing t.Logf of addresses for debugging.
In `@server/internal/etcd/embedded_test.go`:
- Around line 88-93: Test fixtures create config.Config instances where
PeerAddresses and ClientAddresses are identical, so the tests never exercise
split-network behavior; update each affected fixture (examples include variables
like cfgA, cfgB, or other test configs used in the join flow) to include at
least one case where PeerAddresses and ClientAddresses differ (e.g.
PeerAddresses: []string{"127.0.0.1"}, ClientAddresses: []string{"localhost"}) so
host credential generation and client-facing SANs/endpoints are validated;
search for test configs that set PeerAddresses == ClientAddresses and change at
least one of those instances to use differing values across the suite.
In `@server/internal/etcd/remote_test.go`:
- Around line 27-28: The test fixtures currently set identical values for
PeerAddresses and ClientAddresses which hides regressions that swap those code
paths; update the two occurrences where PeerAddresses and ClientAddresses are
set (fields PeerAddresses and ClientAddresses in remote_test.go) to use distinct
loopback-equivalent values (for example make PeerAddresses use "127.0.0.1" and
ClientAddresses use "::1" or "localhost" and "127.0.0.1") so the test exercises
the split; apply the same change to both places referenced in the file (the
block around the first fixture and the block around the second fixture).
In `@server/internal/orchestrator/swarm/postgres_certs.go`:
- Around line 144-146: The implicit SAN set created with
ds.NewSet(c.InstanceAddresses...) and stored in combined only adds "127.0.0.1"
and "localhost", so IPv6 loopback clients will fail; update the combined.Add
call on the combined variable to also include the IPv6 loopback literal "::1"
(i.e., call combined.Add("127.0.0.1", "localhost", "::1")) so the generated
certs include the IPv6 localhost SAN.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bbfa9985-e08d-4b0d-bd7b-8c777e4dddf7
⛔ Files ignored due to path filters (12)
api/apiv1/gen/control_plane/service.gois excluded by!**/gen/**api/apiv1/gen/http/cli/control_plane/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/types.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/types.gois excluded by!**/gen/**api/apiv1/gen/http/openapi.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi.yamlis excluded by!**/gen/**api/apiv1/gen/http/openapi3.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi3.yamlis excluded by!**/gen/**go.sumis excluded by!**/*.sum
📒 Files selected for processing (47)
NOTICE.txtapi/apiv1/design/cluster.goapi/apiv1/design/database.goapi/apiv1/design/host.goapi/apiv1/design/instance.gochanges/unreleased/Added-20260306-160121.yamlchanges/unreleased/Changed-20260306-155827.yamlchanges/unreleased/Changed-20260306-160015.yamldocker/control-plane-dev/config.jsondocs/development/supported-services.mddocs/installation/configuration.mde2e/database_test.goe2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmple2e/service_provisioning_test.gogo.modhack/dev-env.zshserver/cmd/root.goserver/internal/api/apiv1/convert.goserver/internal/api/apiv1/post_init_handlers.goserver/internal/api/apiv1/pre_init_handlers.goserver/internal/certificates/service.goserver/internal/certificates/templates.goserver/internal/config/config.goserver/internal/config/sources.goserver/internal/database/instance.goserver/internal/database/orchestrator.goserver/internal/database/service.goserver/internal/database/service_instance.goserver/internal/etcd/embedded.goserver/internal/etcd/embedded_test.goserver/internal/etcd/interface.goserver/internal/etcd/rbac.goserver/internal/etcd/remote.goserver/internal/etcd/remote_test.goserver/internal/host/host.goserver/internal/host/host_store.goserver/internal/host/service.goserver/internal/monitor/instance_monitor.goserver/internal/monitor/service.goserver/internal/orchestrator/swarm/orchestrator.goserver/internal/orchestrator/swarm/postgres_certs.goserver/internal/orchestrator/swarm/service_user_role.goserver/internal/workflows/activities/check_cluster_health.goserver/internal/workflows/activities/failover.goserver/internal/workflows/activities/get_primary_instance.goserver/internal/workflows/activities/restart_instance.goserver/internal/workflows/activities/switchover.go
💤 Files with no reviewable changes (2)
- hack/dev-env.zsh
- server/internal/monitor/service.go
2ba6cc0 to
1aa1b8b
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (5)
docs/installation/configuration.md (1)
12-17:⚠️ Potential issue | 🟡 MinorState config-source precedence in this section.
This new env-var guidance still leaves it unclear which source wins when the same setting is provided in multiple places. Please spell out the repo’s precedence here: CLI flags > environment variables > config file.
Suggested wording
Use a comma-separated string to specify string array properties, such as `client_addresses`, via an environment variable, e.g.: ```sh PGEDGE_CLIENT_ADDRESSES='192.168.1.2,my-host.internal'
+When the same setting is provided in multiple places, precedence is:
+CLI flags > environment variables > config file.</details> Based on learnings: Multi-source configuration follows precedence: CLI flags > environment variables > config file. <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@docs/installation/configuration.mdaround lines 12 - 17, The docs section
that shows using PGEDGE_CLIENT_ADDRESSES as a comma-separated env var is missing
the repo's config-source precedence; update the same section to explicitly state
that when a setting is provided in multiple places the precedence is: CLI flagsenvironment variables > config file, and add this sentence near the
PGEDGE_CLIENT_ADDRESSES example so readers see the precedence immediately after
the env-var guidance.</details> </blockquote></details> <details> <summary>server/internal/api/apiv1/post_init_handlers.go (1)</summary><blockquote> `75-90`: _⚠️ Potential issue_ | _🟠 Major_ **Use `net.JoinHostPort` for IPv6-safe URL construction.** The current `fmt.Sprintf("%s:%d", a, cfg.HTTP.Port)` produces invalid URLs for IPv6 addresses (e.g., `2001:db8::1:8080` instead of `[2001:db8::1]:8080`). <details> <summary>Suggested fix</summary> ```diff +import ( + "net" + "strconv" +) func GetServerURLs(cfg config.Config) []string { scheme := "http" if cfg.HTTP.ServerCert != "" && cfg.HTTP.ServerKey != "" { scheme = "https" } var urls []string for _, a := range cfg.PeerAddresses { u := url.URL{ Scheme: scheme, - Host: fmt.Sprintf("%s:%d", a, cfg.HTTP.Port), + Host: net.JoinHostPort(a, strconv.Itoa(cfg.HTTP.Port)), } urls = append(urls, u.String()) } return urls }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/api/apiv1/post_init_handlers.go` around lines 75 - 90, GetServerURLs builds host:port incorrectly for IPv6; update the Host field construction in GetServerURLs to use net.JoinHostPort(a, strconv.Itoa(cfg.HTTP.Port)) (import net and strconv) instead of fmt.Sprintf("%s:%d", a, cfg.HTTP.Port) so IPv6 addresses are wrapped in brackets and URLs are valid; keep the existing scheme selection logic and append u.String() as before.server/internal/config/sources.go (1)
46-48:⚠️ Potential issue | 🟡 MinorTrim whitespace and filter empty entries when parsing comma-separated env values.
Empty env vars become
[]string{""}and values like"a, b"preserve leading spaces on subsequent items. For address fields, this produces invalid entries.Suggested fix
switch k.Get(key).(type) { case []string: - return key, strings.Split(value, ",") + if strings.TrimSpace(value) == "" { + return key, []string{} + } + parts := strings.Split(value, ",") + result := make([]string, 0, len(parts)) + for _, p := range parts { + if trimmed := strings.TrimSpace(p); trimmed != "" { + result = append(result, trimmed) + } + } + return key, result }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/config/sources.go` around lines 46 - 48, The CSV parsing branch (case []string) currently returns strings.Split(value, ",") which leaves empty elements and untrimmed whitespace; update that branch (where k.Get(key) is checked for []string) to split on commas, trim spaces from each item, and filter out empty strings before returning so env values like "a, b" yield ["a","b"] and "" yields [] (you can implement a small helper used here to map/trim and drop empty entries).server/internal/etcd/embedded.go (1)
338-349:⚠️ Potential issue | 🟠 MajorIPv6 peer addresses still generate invalid etcd URLs.
These paths still assemble
host:portwith rawfmt.Sprintf, so an IPv6 literal ends up as an invalid endpoint / initial-cluster entry. This config path also needs an IPv6-capable listen strategy when any peer address is IPv6. Usenet.JoinHostPort()for every advertised endpoint.Quick verification: this prints the current formatting sites and shows the raw IPv6 form failing while the bracketed form parses cleanly.
#!/bin/bash set -euo pipefail printf 'Relevant code:\n' sed -n '333,352p;667,687p' server/internal/etcd/embedded.go python - <<'PY' from urllib.parse import urlsplit samples = [ "https://2001:db8::1:2379", "https://[2001:db8::1]:2379", ] for sample in samples: try: parts = urlsplit(sample) print(f"{sample} -> hostname={parts.hostname!r}, port={parts.port!r}") except Exception as exc: print(f"{sample} -> ERROR: {exc}") PYAlso applies to: 673-687
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/etcd/embedded.go` around lines 338 - 349, The PeerEndpoints() method constructs endpoints with fmt.Sprintf("https://%s:%d", address, peerPort) which breaks for IPv6 literals; change the assembly to use net.JoinHostPort(address, strconv.Itoa(peerPort)) (e.g. "https://"+net.JoinHostPort(...)) so IPv6 addresses are bracketed correctly, and do the same for the other endpoint construction sites (the other function using the same fmt.Sprintf pattern). Add imports for net and strconv and update the endpoints assignment in PeerEndpoints (and the duplicated site referenced in the comment) to use JoinHostPort instead of formatting host:port directly.server/internal/config/config.go (1)
252-257:⚠️ Potential issue | 🟠 MajorValidate each peer/client address entry, not just slice length.
Validate()currently accepts values like[]string{""},[]string{"https://host"}, or[]string{"host:1234"}, and later code assumes every item is a bare host/FQDN/IP value when it builds URLs and SANs. Reject malformed entries here so config errors fail early.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/config/config.go` around lines 252 - 257, In Validate(), after checking the slice lengths, iterate each entry in c.PeerAddresses and c.ClientAddresses and reject malformed items instead of only checking length: ensure each entry is non-empty, does not start with "http://" or "https://", does not contain a path (no '/'), and is either a valid IP (net.ParseIP succeeds) or matches a hostname pattern (use a hostname regex: labels of [A-Za-z0-9-] separated by dots, no leading/trailing hyphen) and does not include a port (if strings.Contains(entry, ":") and net.ParseIP(entry) is false, treat as invalid). Return an error (append to errs) referencing the offending entry so invalid host/scheme/port combinations fail fast in Validate().
🧹 Nitpick comments (2)
server/internal/orchestrator/swarm/postgres_certs.go (1)
144-150: Extract server-address normalization into one helper.This exact
ds.NewSet(...); Add("127.0.0.1", "localhost"); ToSortedSlice(strings.Compare)flow now exists here and inserver/internal/etcd/rbac.go. Pulling it into one helper would keep SAN generation rules from drifting between Postgres and etcd certificate paths.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/orchestrator/swarm/postgres_certs.go` around lines 144 - 150, Extract the repeated address-normalization flow (ds.NewSet(c.InstanceAddresses...); Add("127.0.0.1", "localhost"); ToSortedSlice(strings.Compare)) into a single helper function (e.g., normalizeServerAddresses or buildSANs) that accepts the slice of addresses and returns the sorted slice used for SANs; replace the inline usage in PostgresServer call in postgres_certs.go (where combined is constructed before certService.PostgresServer) and the similar occurrence in server/internal/etcd/rbac.go with calls to this new helper so both PostgresServer and the etcd RBAC certificate generation use the exact same SAN normalization logic.server/internal/etcd/interface.go (1)
23-26: Prefer a peer-specific name for this field.
HostCredentialOptions.Addressesis generic, but these values feed etcd server certificate SANs and must stay peer-only. Renaming this toPeerAddresses(or documenting that invariant here) would make the contract harder to misuse.Based on learnings: the embedded etcd server is strictly intra-cluster and its TLS certificate only needs to cover PeerAddresses.Possible cleanup
type HostCredentialOptions struct { HostID string - Addresses []string + PeerAddresses []string EmbeddedEtcdEnabled bool }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/etcd/interface.go` around lines 23 - 26, Rename HostCredentialOptions.Addresses to HostCredentialOptions.PeerAddresses to make the peer-only invariant explicit; update the struct field name and any associated tags (JSON/YAML) and then refactor all usages (constructors, callers, marshaling/unmarshaling, tests, and documentation) to reference PeerAddresses instead of Addresses, ensuring any code that populates certificate SANs reads from PeerAddresses for embedded etcd TLS generation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docker/control-plane-ci/docker-compose.yaml`:
- Around line 7-9: The docker-compose environment for host-1 is missing
PGEDGE_PEER_ADDRESSES which causes server/internal/config/config.go validation
(peer_addresses) to fail; add a PGEDGE_PEER_ADDRESSES environment variable
alongside PGEDGE_CLIENT_ADDRESSES in the docker-compose.yaml service for host-1
(use the same address format as PGEDGE_CLIENT_ADDRESSES, e.g., 127.0.0.1 or
appropriate peer list) so that the peer_addresses config is populated at
startup.
---
Duplicate comments:
In `@docs/installation/configuration.md`:
- Around line 12-17: The docs section that shows using PGEDGE_CLIENT_ADDRESSES
as a comma-separated env var is missing the repo's config-source precedence;
update the same section to explicitly state that when a setting is provided in
multiple places the precedence is: CLI flags > environment variables > config
file, and add this sentence near the PGEDGE_CLIENT_ADDRESSES example so readers
see the precedence immediately after the env-var guidance.
In `@server/internal/api/apiv1/post_init_handlers.go`:
- Around line 75-90: GetServerURLs builds host:port incorrectly for IPv6; update
the Host field construction in GetServerURLs to use net.JoinHostPort(a,
strconv.Itoa(cfg.HTTP.Port)) (import net and strconv) instead of
fmt.Sprintf("%s:%d", a, cfg.HTTP.Port) so IPv6 addresses are wrapped in brackets
and URLs are valid; keep the existing scheme selection logic and append
u.String() as before.
In `@server/internal/config/config.go`:
- Around line 252-257: In Validate(), after checking the slice lengths, iterate
each entry in c.PeerAddresses and c.ClientAddresses and reject malformed items
instead of only checking length: ensure each entry is non-empty, does not start
with "http://" or "https://", does not contain a path (no '/'), and is either a
valid IP (net.ParseIP succeeds) or matches a hostname pattern (use a hostname
regex: labels of [A-Za-z0-9-] separated by dots, no leading/trailing hyphen) and
does not include a port (if strings.Contains(entry, ":") and net.ParseIP(entry)
is false, treat as invalid). Return an error (append to errs) referencing the
offending entry so invalid host/scheme/port combinations fail fast in
Validate().
In `@server/internal/config/sources.go`:
- Around line 46-48: The CSV parsing branch (case []string) currently returns
strings.Split(value, ",") which leaves empty elements and untrimmed whitespace;
update that branch (where k.Get(key) is checked for []string) to split on
commas, trim spaces from each item, and filter out empty strings before
returning so env values like "a, b" yield ["a","b"] and "" yields [] (you can
implement a small helper used here to map/trim and drop empty entries).
In `@server/internal/etcd/embedded.go`:
- Around line 338-349: The PeerEndpoints() method constructs endpoints with
fmt.Sprintf("https://%s:%d", address, peerPort) which breaks for IPv6 literals;
change the assembly to use net.JoinHostPort(address, strconv.Itoa(peerPort))
(e.g. "https://"+net.JoinHostPort(...)) so IPv6 addresses are bracketed
correctly, and do the same for the other endpoint construction sites (the other
function using the same fmt.Sprintf pattern). Add imports for net and strconv
and update the endpoints assignment in PeerEndpoints (and the duplicated site
referenced in the comment) to use JoinHostPort instead of formatting host:port
directly.
---
Nitpick comments:
In `@server/internal/etcd/interface.go`:
- Around line 23-26: Rename HostCredentialOptions.Addresses to
HostCredentialOptions.PeerAddresses to make the peer-only invariant explicit;
update the struct field name and any associated tags (JSON/YAML) and then
refactor all usages (constructors, callers, marshaling/unmarshaling, tests, and
documentation) to reference PeerAddresses instead of Addresses, ensuring any
code that populates certificate SANs reads from PeerAddresses for embedded etcd
TLS generation.
In `@server/internal/orchestrator/swarm/postgres_certs.go`:
- Around line 144-150: Extract the repeated address-normalization flow
(ds.NewSet(c.InstanceAddresses...); Add("127.0.0.1", "localhost");
ToSortedSlice(strings.Compare)) into a single helper function (e.g.,
normalizeServerAddresses or buildSANs) that accepts the slice of addresses and
returns the sorted slice used for SANs; replace the inline usage in
PostgresServer call in postgres_certs.go (where combined is constructed before
certService.PostgresServer) and the similar occurrence in
server/internal/etcd/rbac.go with calls to this new helper so both
PostgresServer and the etcd RBAC certificate generation use the exact same SAN
normalization logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 622e52cf-82fa-4ca7-9c5c-bf06f2d901b1
⛔ Files ignored due to path filters (12)
api/apiv1/gen/control_plane/service.gois excluded by!**/gen/**api/apiv1/gen/http/cli/control_plane/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/types.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/types.gois excluded by!**/gen/**api/apiv1/gen/http/openapi.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi.yamlis excluded by!**/gen/**api/apiv1/gen/http/openapi3.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi3.yamlis excluded by!**/gen/**go.sumis excluded by!**/*.sum
📒 Files selected for processing (48)
NOTICE.txtapi/apiv1/design/cluster.goapi/apiv1/design/database.goapi/apiv1/design/host.goapi/apiv1/design/instance.gochanges/unreleased/Added-20260306-160121.yamlchanges/unreleased/Changed-20260306-155827.yamlchanges/unreleased/Changed-20260306-160015.yamldocker/control-plane-ci/docker-compose.yamldocker/control-plane-dev/config.jsondocs/development/supported-services.mddocs/installation/configuration.mde2e/database_test.goe2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmple2e/service_provisioning_test.gogo.modhack/dev-env.zshserver/cmd/root.goserver/internal/api/apiv1/convert.goserver/internal/api/apiv1/post_init_handlers.goserver/internal/api/apiv1/pre_init_handlers.goserver/internal/certificates/service.goserver/internal/certificates/templates.goserver/internal/config/config.goserver/internal/config/sources.goserver/internal/database/instance.goserver/internal/database/orchestrator.goserver/internal/database/service.goserver/internal/database/service_instance.goserver/internal/etcd/embedded.goserver/internal/etcd/embedded_test.goserver/internal/etcd/interface.goserver/internal/etcd/rbac.goserver/internal/etcd/remote.goserver/internal/etcd/remote_test.goserver/internal/host/host.goserver/internal/host/host_store.goserver/internal/host/service.goserver/internal/monitor/instance_monitor.goserver/internal/monitor/service.goserver/internal/orchestrator/swarm/orchestrator.goserver/internal/orchestrator/swarm/postgres_certs.goserver/internal/orchestrator/swarm/service_user_role.goserver/internal/workflows/activities/check_cluster_health.goserver/internal/workflows/activities/failover.goserver/internal/workflows/activities/get_primary_instance.goserver/internal/workflows/activities/restart_instance.goserver/internal/workflows/activities/switchover.go
💤 Files with no reviewable changes (2)
- hack/dev-env.zsh
- server/internal/monitor/service.go
✅ Files skipped from review due to trivial changes (1)
- changes/unreleased/Changed-20260306-155827.yaml
🚧 Files skipped from review as they are similar to previous changes (19)
- server/internal/database/service.go
- server/internal/workflows/activities/failover.go
- server/internal/workflows/activities/get_primary_instance.go
- server/internal/database/service_instance.go
- server/internal/host/host.go
- server/internal/certificates/templates.go
- server/cmd/root.go
- server/internal/workflows/activities/switchover.go
- e2e/service_provisioning_test.go
- server/internal/workflows/activities/check_cluster_health.go
- server/internal/workflows/activities/restart_instance.go
- server/internal/api/apiv1/pre_init_handlers.go
- docs/development/supported-services.md
- api/apiv1/design/host.go
- server/internal/database/orchestrator.go
- docker/control-plane-dev/config.json
- server/internal/host/service.go
- e2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmpl
- changes/unreleased/Changed-20260306-160015.yaml
1aa1b8b to
06352a5
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (3)
server/internal/config/config.go (1)
364-370:⚠️ Potential issue | 🟠 MajorDefault discovery still excludes IPv6-only hosts.
defaultAddresses()only succeeds whengetFirstIP()finds an IPv4 address. On an IPv6-only host, this now returns an error even though the new address model accepts IPv6 endpoints. Prefer the first usable IPv6 address when no IPv4 address is available.Also applies to: 373-415
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/config/config.go` around lines 364 - 370, defaultAddresses() currently returns an error when getFirstIP() fails (because it only finds IPv4), which breaks IPv6-only hosts; change defaultAddresses() to try getFirstIP() for an IPv4 first, and if that returns a not-found/no-IPv4 error, scan for the first usable IPv6 address (using the same network/interface address enumeration logic used elsewhere in the file) and return that IPv6 string instead of an error; update any related helper logic referenced in the file (the getFirstIP() call sites and the address-selection block around lines 373-415) to ensure IPv6-only endpoints are accepted and returned by defaultAddresses().server/internal/etcd/embedded.go (1)
671-684:⚠️ Potential issue | 🟠 MajorAdvertised IPv6 peer URLs still have no listener.
AdvertiseClientUrls/AdvertisePeerUrlsnow publish everycfg.PeerAddressesentry, butListenClientUrlsandListenPeerUrlsstill bind only0.0.0.0. Ifpeer_addressescontains an IPv6 literal, this node will advertisehttps://[addr]:portand never accept traffic there.Suggested direction
- c.ListenClientUrls = []url.URL{ - {Scheme: "https", Host: fmt.Sprintf("0.0.0.0:%d", clientPort)}, - } - c.ListenPeerUrls = []url.URL{ - {Scheme: "https", Host: fmt.Sprintf("0.0.0.0:%d", peerPort)}, - } + listenHosts := []string{"0.0.0.0"} + for _, address := range cfg.PeerAddresses { + if ip := net.ParseIP(address); ip != nil && ip.To4() == nil { + listenHosts = append(listenHosts, "::") + break + } + } + c.ListenClientUrls = make([]url.URL, 0, len(listenHosts)) + c.ListenPeerUrls = make([]url.URL, 0, len(listenHosts)) + for _, host := range listenHosts { + c.ListenClientUrls = append(c.ListenClientUrls, url.URL{ + Scheme: "https", + Host: net.JoinHostPort(host, strconv.Itoa(clientPort)), + }) + c.ListenPeerUrls = append(c.ListenPeerUrls, url.URL{ + Scheme: "https", + Host: net.JoinHostPort(host, strconv.Itoa(peerPort)), + }) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/etcd/embedded.go` around lines 671 - 684, The code advertises every cfg.PeerAddresses entry via c.AdvertiseClientUrls/c.AdvertisePeerUrls but still only binds 0.0.0.0 in c.ListenClientUrls/c.ListenPeerUrls, so IPv6 literals will be advertised but not listened on; update the listener construction to mirror the advertise logic: build ListenClientUrls and ListenPeerUrls from cfg.PeerAddresses (use url.URL{Scheme:"https", Host: net.JoinHostPort(address, strconv.Itoa(clientPort))} and similar for peerPort) instead of hardcoding 0.0.0.0, ensuring IPv6 addresses are included. Apply changes around the code that sets c.ListenClientUrls, c.ListenPeerUrls, clientURLs, peerURLs, and assignments to c.AdvertiseClientUrls/c.AdvertisePeerUrls.docs/installation/configuration.md (1)
12-17:⚠️ Potential issue | 🟡 MinorAdd the configuration precedence to this env-var section.
This explains how to encode array values, but it still doesn't say which source wins when the same setting is set multiple ways. Please spell out the order here: CLI flags > environment variables > config file.
Based on learnings: Multi-source configuration follows precedence: CLI flags > environment variables > config file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/installation/configuration.md` around lines 12 - 17, The env-var section demonstrating array encoding (e.g. PGEDGE_CLIENT_ADDRESSES / client_addresses) is missing the configuration-source precedence; update that paragraph to explicitly state the precedence order used by the application: CLI flags > environment variables > config file, and add a short sentence clarifying that when the same setting appears in multiple sources the higher-precedence source (flags first, then env vars, then the config file) wins so readers know which value will be applied.
🧹 Nitpick comments (1)
e2e/service_provisioning_test.go (1)
112-117: Require address publication instead of treating it as optional.This still passes when
serviceInstance.Statusnever gets populated, so an E2E regression in the newAddressescontract would be missed. I'd tighten this to wait forStatus/Addresses, then assert each entry is non-blank.Proposed tightening
- serviceInstance = db.ServiceInstances[0] - if serviceInstance.Status != nil { - t.Log("Verifying service instance connection info") - - // Verify basic connection info exists - assert.NotEmpty(t, serviceInstance.Status.Addresses, "Addresses should be set") - t.Logf("Service addresses: %s", strings.Join(serviceInstance.Status.Addresses, ", ")) + require.Eventually(t, func() bool { + if err := db.Refresh(ctx); err != nil { + return false + } + return len(db.ServiceInstances) > 0 && + db.ServiceInstances[0].Status != nil && + len(db.ServiceInstances[0].Status.Addresses) > 0 + }, 2*time.Minute, 5*time.Second, "Service status should publish at least one address") + + serviceInstance = db.ServiceInstances[0] + t.Log("Verifying service instance connection info") + for _, address := range serviceInstance.Status.Addresses { + assert.NotEmpty(t, strings.TrimSpace(address), "Addresses should not contain blank entries") + } + t.Logf("Service addresses: %s", strings.Join(serviceInstance.Status.Addresses, ", ")) // Verify ports are configured if len(serviceInstance.Status.Ports) > 0 { t.Logf("Service has %d ports configured", len(serviceInstance.Status.Ports)) @@ - } else { - t.Log("Service instance status not yet populated (this may be expected if container is still starting)") - } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@e2e/service_provisioning_test.go` around lines 112 - 117, Tighten the test so it explicitly waits for serviceInstance.Status to be populated and for serviceInstance.Status.Addresses to be non-empty, instead of treating Status as optional; poll (with timeout) for Status != nil and len(Status.Addresses) > 0, then assert each entry in serviceInstance.Status.Addresses is not blank (non-empty string) and log them, using the existing symbols serviceInstance, Status, Addresses and the test's t/assert helpers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/internal/api/apiv1/pre_init_handlers.go`:
- Around line 312-332: The loop over serverURLs currently returns on the first
url.Parse error; change the logic in the loop that builds apiClient (the block
that calls url.Parse(u), creates client.NewClient(...) and
apiClient.GetVersion(ctx)) to skip invalid/malformed URLs instead of returning
immediately—log or collect the parse errors and continue to the next u, attempt
GetVersion on each successfully parsed URL, and only return a consolidated error
after all serverURLs are exhausted (including a message about no reachable
servers and optionally the list of invalid entries); keep references to
url.Parse, client.NewClient, api.Client/GetVersion, and the final error returned
when none succeed.
In `@server/internal/config/config.go`:
- Around line 275-287: validateAddresses currently accepts nil/empty slices and
only checks format (IP or hostname) which lets bad client-facing values slip
through and allows PeerAddress() to return empty strings; update validation so
validateAddresses(name, addresses) returns an error when addresses is nil or has
length 0, and ensure callers validate both client_addresses and peer_addresses
(e.g., the code paths that compare client_addresses against c.PeerAddresses)
invoke validateAddresses so client-facing values are checked; also tighten
validateAddresses to reject hostnames/IPs that parse to empty and include clear
error messages referencing the offending name and index (use the existing
validateAddresses, PeerAddress(), client_addresses and PeerAddresses symbols to
locate and change the logic).
In `@server/internal/orchestrator/swarm/orchestrator.go`:
- Around line 239-248: The PostgresCerts InstanceAddresses currently only
includes instanceHostname and host-level addresses, but must also include the
overlay-network peer DNS name used by GetInstanceConnectionInfo() (PeerHost), so
add the FQDN "<instanceHostname>.<spec.DatabaseID>-database" to
PostgresCerts.InstanceAddresses; update the slices.Concat call that builds
InstanceAddresses to include this computed FQDN (use spec.DatabaseID and
instanceHostname), ensuring PostgresCerts.InstanceAddresses contains
instanceHostname, the overlay FQDN, and o.cfg.Addresses() so peer TLS hostname
verification succeeds.
---
Duplicate comments:
In `@docs/installation/configuration.md`:
- Around line 12-17: The env-var section demonstrating array encoding (e.g.
PGEDGE_CLIENT_ADDRESSES / client_addresses) is missing the configuration-source
precedence; update that paragraph to explicitly state the precedence order used
by the application: CLI flags > environment variables > config file, and add a
short sentence clarifying that when the same setting appears in multiple sources
the higher-precedence source (flags first, then env vars, then the config file)
wins so readers know which value will be applied.
In `@server/internal/config/config.go`:
- Around line 364-370: defaultAddresses() currently returns an error when
getFirstIP() fails (because it only finds IPv4), which breaks IPv6-only hosts;
change defaultAddresses() to try getFirstIP() for an IPv4 first, and if that
returns a not-found/no-IPv4 error, scan for the first usable IPv6 address (using
the same network/interface address enumeration logic used elsewhere in the file)
and return that IPv6 string instead of an error; update any related helper logic
referenced in the file (the getFirstIP() call sites and the address-selection
block around lines 373-415) to ensure IPv6-only endpoints are accepted and
returned by defaultAddresses().
In `@server/internal/etcd/embedded.go`:
- Around line 671-684: The code advertises every cfg.PeerAddresses entry via
c.AdvertiseClientUrls/c.AdvertisePeerUrls but still only binds 0.0.0.0 in
c.ListenClientUrls/c.ListenPeerUrls, so IPv6 literals will be advertised but not
listened on; update the listener construction to mirror the advertise logic:
build ListenClientUrls and ListenPeerUrls from cfg.PeerAddresses (use
url.URL{Scheme:"https", Host: net.JoinHostPort(address,
strconv.Itoa(clientPort))} and similar for peerPort) instead of hardcoding
0.0.0.0, ensuring IPv6 addresses are included. Apply changes around the code
that sets c.ListenClientUrls, c.ListenPeerUrls, clientURLs, peerURLs, and
assignments to c.AdvertiseClientUrls/c.AdvertisePeerUrls.
---
Nitpick comments:
In `@e2e/service_provisioning_test.go`:
- Around line 112-117: Tighten the test so it explicitly waits for
serviceInstance.Status to be populated and for serviceInstance.Status.Addresses
to be non-empty, instead of treating Status as optional; poll (with timeout) for
Status != nil and len(Status.Addresses) > 0, then assert each entry in
serviceInstance.Status.Addresses is not blank (non-empty string) and log them,
using the existing symbols serviceInstance, Status, Addresses and the test's
t/assert helpers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 72bad633-35de-4a37-ae9d-df4814c2f0e7
⛔ Files ignored due to path filters (12)
api/apiv1/gen/control_plane/service.gois excluded by!**/gen/**api/apiv1/gen/http/cli/control_plane/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/types.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/types.gois excluded by!**/gen/**api/apiv1/gen/http/openapi.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi.yamlis excluded by!**/gen/**api/apiv1/gen/http/openapi3.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi3.yamlis excluded by!**/gen/**go.sumis excluded by!**/*.sum
📒 Files selected for processing (48)
NOTICE.txtapi/apiv1/design/cluster.goapi/apiv1/design/database.goapi/apiv1/design/host.goapi/apiv1/design/instance.gochanges/unreleased/Added-20260306-160121.yamlchanges/unreleased/Changed-20260306-155827.yamlchanges/unreleased/Changed-20260306-160015.yamldocker/control-plane-ci/docker-compose.yamldocker/control-plane-dev/config.jsondocs/development/supported-services.mddocs/installation/configuration.mde2e/database_test.goe2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmple2e/service_provisioning_test.gogo.modhack/dev-env.zshserver/cmd/root.goserver/internal/api/apiv1/convert.goserver/internal/api/apiv1/post_init_handlers.goserver/internal/api/apiv1/pre_init_handlers.goserver/internal/certificates/service.goserver/internal/certificates/templates.goserver/internal/config/config.goserver/internal/config/sources.goserver/internal/database/instance.goserver/internal/database/orchestrator.goserver/internal/database/service.goserver/internal/database/service_instance.goserver/internal/etcd/embedded.goserver/internal/etcd/embedded_test.goserver/internal/etcd/interface.goserver/internal/etcd/rbac.goserver/internal/etcd/remote.goserver/internal/etcd/remote_test.goserver/internal/host/host.goserver/internal/host/host_store.goserver/internal/host/service.goserver/internal/monitor/instance_monitor.goserver/internal/monitor/service.goserver/internal/orchestrator/swarm/orchestrator.goserver/internal/orchestrator/swarm/postgres_certs.goserver/internal/orchestrator/swarm/service_user_role.goserver/internal/workflows/activities/check_cluster_health.goserver/internal/workflows/activities/failover.goserver/internal/workflows/activities/get_primary_instance.goserver/internal/workflows/activities/restart_instance.goserver/internal/workflows/activities/switchover.go
💤 Files with no reviewable changes (2)
- hack/dev-env.zsh
- server/internal/monitor/service.go
✅ Files skipped from review due to trivial changes (1)
- changes/unreleased/Changed-20260306-155827.yaml
🚧 Files skipped from review as they are similar to previous changes (15)
- server/cmd/root.go
- server/internal/database/instance.go
- e2e/database_test.go
- server/internal/api/apiv1/convert.go
- server/internal/etcd/interface.go
- server/internal/orchestrator/swarm/service_user_role.go
- server/internal/host/service.go
- docs/development/supported-services.md
- server/internal/etcd/remote.go
- changes/unreleased/Added-20260306-160121.yaml
- server/internal/etcd/embedded_test.go
- e2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmpl
- server/internal/certificates/service.go
- docker/control-plane-dev/config.json
- api/apiv1/design/host.go
06352a5 to
21bd381
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
server/internal/orchestrator/swarm/service_user_role.go (1)
224-225:⚠️ Potential issue | 🟠 MajorDon't route the delete path through
database.Service.GetInstanceConnectionInfo.
DeletetreatsconnectToPrimaryfailures as expected shutdown noise, but these calls now add a store/existence lookup before the orchestrator hop. That means transient store/etcd failures here get downgraded to “skip DROP ROLE”, which is broader than the intended “Postgres is already going away” case. Keep this helper on the direct orchestrator lookup, or distinguish store failures beforeDeleteswallows the error.Based on learnings: GetDatabase errors should be propagated because they represent transient infrastructure failures, while only expected Postgres connection errors during
DROP ROLEshould be swallowed.Also applies to: 240-242
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/orchestrator/swarm/service_user_role.go` around lines 224 - 225, The delete path is incorrectly calling database.Service.GetInstanceConnectionInfo (connInfo, err := dbSvc.GetInstanceConnectionInfo(...)) which lets store/etcd lookup failures be swallowed as "skip DROP ROLE"; instead, keep the lookup on the orchestrator side or detect and propagate store failures before Delete swallows them: replace the call to dbSvc.GetInstanceConnectionInfo with the direct orchestrator instance lookup used elsewhere (or call the orchestrator's GetInstanceConnectionInfo helper), or explicitly inspect GetDatabase/GetInstanceConnectionInfo errors and return them (propagate transient store errors) while only swallowing expected Postgres connection errors from connectToPrimary/Delete; apply the same change to the other occurrence around lines 240-242.api/apiv1/design/cluster.go (1)
73-76:⚠️ Potential issue | 🟠 MajorRaise the per-address cap here to 253.
server/internal/config/config.gonow accepts hostnames up to 253 chars, andserver/internal/api/apiv1/pre_init_handlers.gosendss.cfg.PeerAddressesthrough this request shape. Keepingaddresses[*]at 128 means a valid long FQDN can be stored locally but still gets rejected at the join API boundary.📏 Suggested fix
g.Elem(func() { g.MinLength(3) - g.MaxLength(128) + g.MaxLength(253) })Please verify
ClusterJoinRequestis still the incomingGetJoinOptionspayload after regeneration; if it is, this cap is user-facing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/apiv1/design/cluster.go` around lines 73 - 76, Update the per-address max length in the ClusterJoinRequest schema by changing the validator on the addresses element from g.MaxLength(128) to g.MaxLength(253) (the block using g.Elem and g.MinLength(3)/g.MaxLength(...)); regenerate the API shapes and verify that ClusterJoinRequest is still the incoming GetJoinOptions payload so this user-facing cap is applied consistently. Ensure tests/consumers expecting 128 are updated if needed.server/internal/config/config.go (1)
367-418:⚠️ Potential issue | 🟠 Major
DefaultConfig()still can't auto-discover IPv6-only hosts.
defaultAddresses()now drives bothPeerAddressesandClientAddresses, butgetFirstIP()discards every address whoseTo4()is nil. On an IPv6-only node this returnscould not find a valid network interface, so the new defaulted install path still fails unless the user manually overrides both address lists.Based on learnings,
DefaultConfig()automatically populates bothPeerAddressesandClientAddressesviadefaultAddresses()/getFirstIP(), so failures here hit the default install path directly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/config/config.go` around lines 367 - 418, defaultAddresses()/getFirstIP() currently rejects non-IPv4 addresses which breaks auto-discovery on IPv6-only hosts; change getFirstIP() to accept a valid non-loopback IP even if ip.To4() == nil (prefer IPv4 if found but fall back to the first non-loopback IPv6), so DefaultConfig() can populate PeerAddresses and ClientAddresses on IPv6-only systems; update getFirstIP() logic to return the first non-loopback IP (IPv4 preferred, otherwise IPv6) and adjust error message accordingly.server/internal/etcd/embedded.go (1)
671-676:⚠️ Potential issue | 🟠 MajorAdvertised IPv6 peers are still unreachable with these listeners.
The advertise URLs now format IPv6 correctly, but both listeners still bind only
0.0.0.0. If anycfg.PeerAddressesentry is IPv6, etcd will advertisehttps://[...]:portwithout actually listening on an IPv6 socket, so peer/client traffic to that address will fail. Please add a dual-stack or explicit[::]listener when IPv6 peer addresses are configured.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/internal/etcd/embedded.go` around lines 671 - 676, The listeners currently bind only to 0.0.0.0 (c.ListenClientUrls / c.ListenPeerUrls) which breaks reachability when cfg.PeerAddresses contains IPv6 addresses; update the code that sets these fields to detect if any cfg.PeerAddresses is IPv6 (check for ':' or use net.ParseIP/To16) and then either (a) include an IPv6 any address entry ("[::]") alongside the existing 0.0.0.0 URL or (b) replace the listener with a dual-stack "[::]" URL for both c.ListenClientUrls and c.ListenPeerUrls so the advertised https://[ipv6]:port addresses are actually reachable; reference clientPort and peerPort when constructing the new url.URL entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/internal/api/apiv1/pre_init_handlers.go`:
- Around line 306-327: The loop in PreInitHandlers.apiClient calls
apiClient.GetVersion(ctx) for each server URL with the incoming context that may
have no deadline, risking indefinite hangs; wrap each probe in a per-attempt
deadline by creating a child context with context.WithTimeout (e.g., 5s or
configurable) before calling apiClient.GetVersion and defer cancel() so stalled
servers fail fast and the loop can try the next URL; update imports to include
time. Ensure the timeout is applied only to the probe (use the new ctx returned
by context.WithTimeout when calling GetVersion) and cancel the timer to avoid
leaks.
---
Duplicate comments:
In `@api/apiv1/design/cluster.go`:
- Around line 73-76: Update the per-address max length in the ClusterJoinRequest
schema by changing the validator on the addresses element from g.MaxLength(128)
to g.MaxLength(253) (the block using g.Elem and
g.MinLength(3)/g.MaxLength(...)); regenerate the API shapes and verify that
ClusterJoinRequest is still the incoming GetJoinOptions payload so this
user-facing cap is applied consistently. Ensure tests/consumers expecting 128
are updated if needed.
In `@server/internal/config/config.go`:
- Around line 367-418: defaultAddresses()/getFirstIP() currently rejects
non-IPv4 addresses which breaks auto-discovery on IPv6-only hosts; change
getFirstIP() to accept a valid non-loopback IP even if ip.To4() == nil (prefer
IPv4 if found but fall back to the first non-loopback IPv6), so DefaultConfig()
can populate PeerAddresses and ClientAddresses on IPv6-only systems; update
getFirstIP() logic to return the first non-loopback IP (IPv4 preferred,
otherwise IPv6) and adjust error message accordingly.
In `@server/internal/etcd/embedded.go`:
- Around line 671-676: The listeners currently bind only to 0.0.0.0
(c.ListenClientUrls / c.ListenPeerUrls) which breaks reachability when
cfg.PeerAddresses contains IPv6 addresses; update the code that sets these
fields to detect if any cfg.PeerAddresses is IPv6 (check for ':' or use
net.ParseIP/To16) and then either (a) include an IPv6 any address entry ("[::]")
alongside the existing 0.0.0.0 URL or (b) replace the listener with a dual-stack
"[::]" URL for both c.ListenClientUrls and c.ListenPeerUrls so the advertised
https://[ipv6]:port addresses are actually reachable; reference clientPort and
peerPort when constructing the new url.URL entries.
In `@server/internal/orchestrator/swarm/service_user_role.go`:
- Around line 224-225: The delete path is incorrectly calling
database.Service.GetInstanceConnectionInfo (connInfo, err :=
dbSvc.GetInstanceConnectionInfo(...)) which lets store/etcd lookup failures be
swallowed as "skip DROP ROLE"; instead, keep the lookup on the orchestrator side
or detect and propagate store failures before Delete swallows them: replace the
call to dbSvc.GetInstanceConnectionInfo with the direct orchestrator instance
lookup used elsewhere (or call the orchestrator's GetInstanceConnectionInfo
helper), or explicitly inspect GetDatabase/GetInstanceConnectionInfo errors and
return them (propagate transient store errors) while only swallowing expected
Postgres connection errors from connectToPrimary/Delete; apply the same change
to the other occurrence around lines 240-242.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3c0bba43-6868-455a-8dc6-aa30bebc4b13
⛔ Files ignored due to path filters (12)
api/apiv1/gen/control_plane/service.gois excluded by!**/gen/**api/apiv1/gen/http/cli/control_plane/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/cli.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/client/types.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/encode_decode.gois excluded by!**/gen/**api/apiv1/gen/http/control_plane/server/types.gois excluded by!**/gen/**api/apiv1/gen/http/openapi.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi.yamlis excluded by!**/gen/**api/apiv1/gen/http/openapi3.jsonis excluded by!**/gen/**api/apiv1/gen/http/openapi3.yamlis excluded by!**/gen/**go.sumis excluded by!**/*.sum
📒 Files selected for processing (48)
NOTICE.txtapi/apiv1/design/cluster.goapi/apiv1/design/database.goapi/apiv1/design/host.goapi/apiv1/design/instance.gochanges/unreleased/Added-20260306-160121.yamlchanges/unreleased/Changed-20260306-155827.yamlchanges/unreleased/Changed-20260306-160015.yamldocker/control-plane-ci/docker-compose.yamldocker/control-plane-dev/config.jsondocs/development/supported-services.mddocs/installation/configuration.mde2e/database_test.goe2e/fixtures/roles/deploy_control_plane/templates/stack.yaml.tmple2e/service_provisioning_test.gogo.modhack/dev-env.zshserver/cmd/root.goserver/internal/api/apiv1/convert.goserver/internal/api/apiv1/post_init_handlers.goserver/internal/api/apiv1/pre_init_handlers.goserver/internal/certificates/service.goserver/internal/certificates/templates.goserver/internal/config/config.goserver/internal/config/sources.goserver/internal/database/instance.goserver/internal/database/orchestrator.goserver/internal/database/service.goserver/internal/database/service_instance.goserver/internal/etcd/embedded.goserver/internal/etcd/embedded_test.goserver/internal/etcd/interface.goserver/internal/etcd/rbac.goserver/internal/etcd/remote.goserver/internal/etcd/remote_test.goserver/internal/host/host.goserver/internal/host/host_store.goserver/internal/host/service.goserver/internal/monitor/instance_monitor.goserver/internal/monitor/service.goserver/internal/orchestrator/swarm/orchestrator.goserver/internal/orchestrator/swarm/postgres_certs.goserver/internal/orchestrator/swarm/service_user_role.goserver/internal/workflows/activities/check_cluster_health.goserver/internal/workflows/activities/failover.goserver/internal/workflows/activities/get_primary_instance.goserver/internal/workflows/activities/restart_instance.goserver/internal/workflows/activities/switchover.go
💤 Files with no reviewable changes (2)
- hack/dev-env.zsh
- server/internal/monitor/service.go
🚧 Files skipped from review as they are similar to previous changes (20)
- server/cmd/root.go
- docker/control-plane-ci/docker-compose.yaml
- changes/unreleased/Changed-20260306-155827.yaml
- changes/unreleased/Added-20260306-160121.yaml
- server/internal/api/apiv1/post_init_handlers.go
- server/internal/workflows/activities/restart_instance.go
- api/apiv1/design/database.go
- e2e/database_test.go
- server/internal/database/instance.go
- api/apiv1/design/instance.go
- docs/development/supported-services.md
- server/internal/host/service.go
- server/internal/etcd/remote.go
- server/internal/etcd/remote_test.go
- server/internal/certificates/service.go
- server/internal/etcd/interface.go
- changes/unreleased/Changed-20260306-160015.yaml
- server/internal/certificates/templates.go
- server/internal/etcd/embedded_test.go
- server/internal/api/apiv1/convert.go
Replaces the `Hostname` and `IPv4Address` configuration settings with two new settings: - `PeerAddresses`: Addresses that the host advertises to other hosts in the cluster - `ClientAddresses`: Addresses that the host advertises to API clients This solves two main issues: - Some topologies use different networks for intra-cluster communcation and database client communication. Separating these settings enables users to, for example, return an externally-available IP from the control plane API while keeping Etcd, Spock, and Patroni traffic on internal networks. - Some topologies will not have DNS between machines. In those cases, we were producing a lot of warnings because we used hostnames in many places. Each new settings is multi-valued and agnostic to whether each address is a hostname, an FQDN, an IPv4, or even an IPv6. I've also added a default value for the `host_id` configuration setting. It will now default to the short hostname, such as you'd get from `hostname -s`. This simplifies the installation process the package scripts for SystemD. PLAT-417 PLAT-220
21bd381 to
2d6598c
Compare
tsivaprasad
left a comment
There was a problem hiding this comment.
Verification:
- Create a Cluster - Ensure the control plane cluster is up and running.
- Create a Database - restish control-plane-local-1 create-database < ../demo/417/db_create.json
Response:
{
"database": {
"id": "storefront",
"state": "creating",
"created_at": "2026-03-11T06:20:30Z",
"updated_at": "2026-03-11T06:20:30Z",
"spec": {
"database_name": "storefront",
"database_users": [
{
"username": "admin",
"attributes": ["SUPERUSER", "LOGIN"],
"db_owner": true
}
],
"nodes": [
{
"name": "n1",
"host_ids": ["host-1"]
}
],
"port": 5433,
"postgres_version": "18.2",
"spock_version": "5"
}
},
"task": {
"task_id": "019cdb8d-cfbc-79c3-97da-10e68424e43d",
"database_id": "storefront",
"entity_id": "storefront",
"scope": "database",
"type": "create",
"status": "pending",
"created_at": "2026-03-11T06:20:30Z"
}
}
- Verify Database Creation with connection info - restish control-plane-local-1 list-databases
{
"databases": [
{
"id": "storefront",
"state": "available",
"instances": [
{
"id": "storefront-n1-689qacsi",
"node_name": "n1",
"host_id": "host-1",
"state": "available",
"connection_info": {
"addresses": ["127.0.0.1"],
"port": 5433
},
"postgres": {
"role": "primary",
"version": "18.2",
"patroni_state": "running"
},
"spock": {
"version": "5.0.5",
"read_only": "off"
}
}
]
}
]
}
- Connect to the Database -
PGPASSWORD=password psql -U admin -h 127.0.0.1 -p 5433 storefront - Verify Database Functionality
create table foo(id int, name char(20));
Output:
INFO: DDL statement replicated.
CREATE TABLE
Verify table creation:
\dt
Output:
List of tables
Schema | Name | Type | Owner
--------+------+-------+-------
public | foo | table | admin
(1 row)
Summary
Replaces the
HostnameandIPv4Addressconfiguration settings with two new settings:PeerAddresses: Addresses that the host advertises to other hosts in the clusterClientAddresses: Addresses that the host advertises to API clientsThis solves two main issues:
Each new setting is multi-valued and agnostic to whether each address is a hostname, an FQDN, an IPv4, or even an IPv6.
I've also added a default value for the
host_idconfiguration setting. It will now default to the short hostname, such as you'd get fromhostname -s. This simplifies the installation process the package scripts for SystemD.Testing
Using either the compose dev or e2e environments:
PLAT-417
PLAT-220