Skip to content

[magnum-auto-healer] Add TLS cert verification#3110

Open
stephenfin wants to merge 2 commits into
kubernetes:masterfrom
shiftstack:autohealing-tls-cert-support
Open

[magnum-auto-healer] Add TLS cert verification#3110
stephenfin wants to merge 2 commits into
kubernetes:masterfrom
shiftstack:autohealing-tls-cert-support

Conversation

@stephenfin
Copy link
Copy Markdown
Member

@stephenfin stephenfin commented May 7, 2026

What this PR does / why we need it:

When the protocol is HTTPS, we were creating a HTTP transport with InsecureSkipVerify: true which causes Go's TLS client to skip all certificate verification. This leaves the connection open to man-in-the-middle attacks.

I suspect things were done this way since the function connects directly to a node's raw IP address, but Kubernetes API server TLS certificates are typically issued for hostnames or DNS names (e.g., api.cluster.example.com) rather than raw IPs. Certificate hostname verification would fail unless the certificate includes the node IP as a Subject Alternative Name (SAN), which isn't guaranteed. Additionally, the Kubernetes cluster CA cert that signed the API server's certificate isn't typically included in theOS trust store and must be loaded explicitly. Neither of these issues are insurmountable though, and we should be secure-by-default.

Fix things by adding a configurable CA file and an explicit opt-in insecure-skip-verify flag, defaulting to the in-cluster Kubernetes CA.

Which issue this PR fixes(if applicable):

(none)

Special notes for reviewers:

Please note the AI was used to assist in generation of this patch. This is noted in the relevant commit messages.

Release note:

[magnum-auto-healer] Default to certificate verification with HTTPS

`magnum-auto-healer` now defaults to verifying certificates when using HTTPS
transport. When used, the API server cert will need to use node IPs as SANs.
A new `insecure-skip-verify` option is provided to allow users to opt into the
previous insecure behavior, which can be helpful during upgrades.

stephenfin added 2 commits May 7, 2026 14:02
…alth check

Replace the hardcoded InsecureSkipVerify: true with proper TLS certificate
verification using the in-cluster Kubernetes CA by default. Add ca-file and
insecure-skip-verify config options so operators can override behaviour when
needed (e.g. when the API server cert does not include node IPs as SANs).

Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Assisted-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… guide

Add a parameter reference table covering all Endpoint check options,
including the newly added ca-file and insecure-skip-verify fields.

Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 7, 2026
@k8s-ci-robot k8s-ci-robot requested review from kayrus and zetaab May 7, 2026 14:13
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zetaab for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 7, 2026
@stephenfin
Copy link
Copy Markdown
Member Author

/test pull-cloud-provider-openstack-check

@kayrus
Copy link
Copy Markdown
Contributor

kayrus commented May 7, 2026

@stephenfin does it make sense to reuse pkg/client/client.go code here?

@stephenfin
Copy link
Copy Markdown
Member Author

stephenfin commented May 8, 2026

@stephenfin does it make sense to reuse pkg/client/client.go code here?

There's some overlap but pkg/client/client.go seems tailored towards creating an OpenStack client, with an auth_url and the likes, and I don't think we want to require a clouds.yaml to use this plugin? I could drag out the building of the TLSClientConfig field to helper function but very little of that would actually be used (we don't need equivalents of CertFile and KeyFile, for example) so again, I'm not sure how much this would get us. Have I missed something?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants