
feat(cni-installer): implement cni installer in go#761

Open: raykroeker wants to merge 15 commits into main from raykroeker/006-cni-productization-install


raykroeker commented Jun 5, 2026


This change implements rfc6 (installer reliability). It removes race conditions by handling file-system events on a single thread, uses a shared library to monitor and handle those events, and adds unit test coverage.

The interface consists of a new command binary and an Installer type that performs both installation and removal:

  • cni-install/main.go: Runs the installer and performs the removal on exit.
  • pkg/cni/install.go: Defines the factory and interface for the Installer type.

Install

The implementation walks through 3 initial steps:

  • Copy binaries to a target host path.
  • Write a kube config file from a template, injecting a number of variables from the environment.
  • Edit the CNI configuration file(s), injecting the linkerd CNI plugin at the tail end of the plugin list.
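
As an illustration of the third step, here is a minimal sketch of appending a plugin to the tail of a CNI config. It is not the code in this PR: `appendPlugin` and `pluginTypes` are hypothetical helpers, the `linkerd-cni` type value is used only as an example, and wrapping a single-plugin `.conf` into a list and replacing an earlier copy of the same plugin type are assumptions about the desired behaviour.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// appendPlugin returns the CNI config with plugin as the last entry of its
// "plugins" array. A single-plugin ".conf" document (no "plugins" key) is
// first wrapped into a list. Hypothetical helper, not the PR's API.
func appendPlugin(conf []byte, plugin map[string]any) ([]byte, error) {
	var doc map[string]any
	if err := json.Unmarshal(conf, &doc); err != nil {
		return nil, fmt.Errorf("parse cni config: %w", err)
	}
	plugins, isList := doc["plugins"].([]any)
	if !isList {
		// Assumption: the original document becomes the only list entry and
		// name/cniVersion are lifted to the top level.
		plugins = []any{doc}
		doc = map[string]any{
			"name":       doc["name"],
			"cniVersion": doc["cniVersion"],
		}
	}
	// Assumption: drop any earlier copy of the same plugin type so that
	// re-running after a watch event does not add a second entry.
	want, _ := plugin["type"].(string)
	kept := make([]any, 0, len(plugins)+1)
	for _, p := range plugins {
		if m, ok := p.(map[string]any); ok && want != "" {
			if t, _ := m["type"].(string); t == want {
				continue
			}
		}
		kept = append(kept, p)
	}
	doc["plugins"] = append(kept, plugin)
	return json.MarshalIndent(doc, "", "  ")
}

// pluginTypes lists the "type" of each plugin in order, for inspection.
func pluginTypes(conflist []byte) ([]string, error) {
	var doc struct {
		Plugins []struct {
			Type string `json:"type"`
		} `json:"plugins"`
	}
	if err := json.Unmarshal(conflist, &doc); err != nil {
		return nil, err
	}
	types := make([]string, 0, len(doc.Plugins))
	for _, p := range doc.Plugins {
		types = append(types, p.Type)
	}
	return types, nil
}

func main() {
	in := []byte(`{"cniVersion":"0.3.1","name":"net","plugins":[{"type":"bridge"}]}`)
	out, err := appendPlugin(in, map[string]any{"type": "linkerd-cni"})
	if err != nil {
		panic(err)
	}
	types, err := pluginTypes(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(types)
}
```

Working on a generic `map[string]any` keeps any fields of the other plugins that the installer does not know about.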

After the completion of these steps, the installer establishes a watch on:

  • The parent directory of the api service account token. Any changes to the ..data directory trigger a reconfiguration of the kube config file.
  • The parent directory of the CNI configuration tree. A change to a config file triggers a reconfiguration of that file.
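
The two watches can be pictured as one filter feeding a single event loop. The sketch below is illustrative only: `Event`, `Op`, `classify`, and `run` are hypothetical stand-ins for whatever the shared watcher library provides, the `.conf`/`.conflist` extension filter is an assumption, and `/etc/cni/net.d` in the usage is just an example path.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Op and Event are hypothetical stand-ins for a file-system watcher's types.
type Op int

const (
	Create Op = iota
	Write
	Rename
	Remove
	Chmod
)

type Event struct {
	Path string
	Op   Op
}

// action names what the installer should do in response to an event.
type action string

const (
	ignore            action = "ignore"
	rewriteKubeconfig action = "rewrite-kubeconfig"
	rewriteCNIConfig  action = "rewrite-cni-config"
)

// classify maps an event to an action. Only create, write, and rename are
// considered. In the token directory only the "..data" entry matters; in the
// CNI directory only direct children ending in .conf or .conflist do
// (assumption).
func classify(ev Event, tokenDir, cniDir string) action {
	switch ev.Op {
	case Create, Write, Rename:
	default:
		return ignore
	}
	dir, base := filepath.Dir(ev.Path), filepath.Base(ev.Path)
	switch {
	case dir == filepath.Clean(tokenDir) && base == "..data":
		return rewriteKubeconfig
	case dir == filepath.Clean(cniDir):
		if ext := filepath.Ext(base); ext == ".conf" || ext == ".conflist" {
			return rewriteCNIConfig
		}
	}
	return ignore
}

// run drains events on the calling goroutine, so handlers never run
// concurrently with each other. It returns when events is closed.
func run(events <-chan Event, tokenDir, cniDir string, handle func(action, Event)) {
	for ev := range events {
		if a := classify(ev, tokenDir, cniDir); a != ignore {
			handle(a, ev)
		}
	}
}

func main() {
	tokenDir := "/var/run/secrets/kubernetes.io/serviceaccount"
	cniDir := "/etc/cni/net.d" // example path, not taken from the PR

	events := make(chan Event, 3)
	events <- Event{Path: tokenDir + "/..data", Op: Create}
	events <- Event{Path: tokenDir + "/token", Op: Write}
	events <- Event{Path: cniDir + "/10-example.conflist", Op: Write}
	close(events)

	run(events, tokenDir, cniDir, func(a action, ev Event) {
		fmt.Println(a, ev.Path)
	})
}
```

Because every handler runs on the one goroutine that reads the channel, a kube config rewrite and a CNI config rewrite cannot interleave.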

Remove

Changes to files are tracked in the installer's in-memory log. When the command exits (on a signal received via signal.Notify), each log entry is reverted:

  • Installed binaries are removed.
  • The kube config file is removed.
  • The CNI config file changes are reverted.
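
A minimal sketch of such an in-memory log follows. The `changeLog` type and its methods are hypothetical, not the PR's types. Reverting newest-first and continuing past individual failures are assumptions about the intended behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// entry pairs a description of an applied change with the way to undo it.
type entry struct {
	desc   string
	revert func() error
}

// changeLog is a hypothetical in-memory log of applied changes.
type changeLog struct {
	entries []entry
}

func (l *changeLog) record(desc string, revert func() error) {
	l.entries = append(l.entries, entry{desc: desc, revert: revert})
}

// revertAll undoes entries newest first (assumption), keeps going when one
// revert fails, and returns the failures joined into a single error.
func (l *changeLog) revertAll() error {
	var errs []error
	for i := len(l.entries) - 1; i >= 0; i-- {
		e := l.entries[i]
		if err := e.revert(); err != nil {
			errs = append(errs, fmt.Errorf("revert %s: %w", e.desc, err))
		}
	}
	l.entries = nil
	return errors.Join(errs...)
}

// writeFile writes data to path and records how to undo the write: restore
// the previous content if the file existed, otherwise remove the file.
func (l *changeLog) writeFile(path string, data []byte) error {
	prev, err := os.ReadFile(path)
	if err != nil && !errors.Is(err, os.ErrNotExist) {
		return err
	}
	existed := err == nil
	if err := os.WriteFile(path, data, 0o644); err != nil {
		return err
	}
	l.record("write "+path, func() error {
		if existed {
			return os.WriteFile(path, prev, 0o644)
		}
		return os.Remove(path)
	})
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "cni-install-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	changes := &changeLog{}
	kubeconfig := filepath.Join(dir, "kubeconfig") // hypothetical file name
	if err := changes.writeFile(kubeconfig, []byte("placeholder")); err != nil {
		panic(err)
	}
	if err := changes.revertAll(); err != nil {
		panic(err)
	}
	_, err = os.Stat(kubeconfig)
	fmt.Println("kubeconfig removed:", errors.Is(err, os.ErrNotExist))
}
```

In the command itself, `revertAll` would be called once the channel registered with signal.Notify delivers a signal.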

raykroeker added 12 commits June 5, 2026 16:51
 * Binary command cni-install.
 * CNI package that wraps the installer behaviour with an Installer type
   and Run method.
   * Copy files (e.g. the cni plugin) to a destination from a source
     defined by the environment:
       * Destination: env:CONTAINER_MOUNT_PREFIX / env:DEST_CNI_BIN_DIR
       * Source: env:CONTAINER_CNI_BIN_DIR
   * Configure CNI based on either an environment or a file source:
       * env:CNI_NETWORK_CONFIG
       * file located at env:CNI_NETWORK_CONFIG_FILE
     and write it to a cni config file at
       * env:CONTAINER_MOUNT_PREFIX / env:DEST_CNI_NET_DIR
   * Configure kubeconfig for the plugin from static config injected
     with an authn token located at a file:
       * Kube config file: env:CONTAINER_MOUNT_PREFIX /
         env:DEST_CNI_NET_DIR / env:KUBECONFIG_FILE_NAME
       * Authn token file:
         /var/run/secrets/kubernetes.io/serviceaccount/token
   * Watch the auth token file for fs events (create|rename|write)
     before rewriting the kube config.
   * Watch the cni config root for fs events (create|rename|write) and
     rewrite config files injecting the linkerd configuration as a tail
     plugin in the list
 * Filter on and assert on specific fs events.
 * Bump timeout on events to 1s (hoping to solve CI/CD failures).
Test docker image with local integration test suite. Fix re-reading
sources, add better logging, ignore fs events that don't impact
anything.
Add removal to the installer.  Walk a log of entries that have been
applied and revert them.

Update the integration test manifests to drop the expiry of the service
account token to 10m from 1h.

Watch the service account parent directory versus the token file
directly which is a link.  Filter on changes to ..data

Rename '.conf' files to '.conflist' when injecting linkerd cni config.
raykroeker changed the title from "Implement Installer" to "feat(cni-installer): implement cni installer in go" Jun 12, 2026
kubernetes.io/os: linux
hostNetwork: true
serviceAccountName: linkerd-cni
automountServiceAccountToken: false


Changes to the test-scenario files drop the service account token's lifetime to 10m (a hard floor).

Comment thread pkg/cni/conf.go
clusters:
- name: local
  cluster:
    server: {{ .ServiceProtocol }}://{{ .ServiceHost }}:{{ .ServicePort }}


I don't like protocol being configurable. This is how the previous installer was implemented. I'd like to remove this.

Comment thread pkg/cni/conf.go
// data.
CertificateAuthorityData string
// SkipTLSVerify sets tls config to be insecure.
SkipTLSVerify bool


I don't like skip tls verify being configurable. This is how the previous installer was implemented. I'd like to remove this.

@raykroeker raykroeker marked this pull request as ready for review June 12, 2026 19:02
@raykroeker raykroeker requested a review from a team as a code owner June 12, 2026 19:02
Insert a trailing space on non-code files (test data). Trim space from
the test data files on read.  Update test case.