7 changes: 5 additions & 2 deletions .github/workflows/integration-tests.yml
@@ -55,6 +55,7 @@ jobs:
libgpgme-dev \
libbtrfs-dev \
libdevmapper-dev \
libvirt-dev \
pkg-config

- name: Configure Podman
@@ -71,7 +72,7 @@ jobs:
sudo podman info --format '{{.Store.GraphRoot}}'

- name: Build bink binary
run: sudo make build-bink
run: make build-bink

- name: Verify prerequisites
run: |
@@ -80,9 +81,11 @@ jobs:
df -h /
free -h

- name: Build cluster image from branch
run: sudo make build-cluster-image

- name: Pre-pull container images
run: |
sudo podman pull ghcr.io/alicefr/bink/cluster:latest
sudo podman pull ghcr.io/alicefr/bink/node:v1.35-fedora-44-disk
sudo podman pull ghcr.io/alicefr/bink/dns:latest
sudo podman pull docker.io/library/registry:2
3 changes: 3 additions & 0 deletions .github/workflows/test-container-image.yml
@@ -55,6 +55,9 @@ jobs:
- name: Build bink image
run: sudo make build-bink-image

- name: Build cluster image
run: sudo make build-cluster-image

- name: Test nested mode
run: sudo hack/test-container-image.sh nested
timeout-minutes: 40
1 change: 1 addition & 0 deletions .github/workflows/unit-tests.yml
@@ -32,6 +32,7 @@ jobs:
libgpgme-dev \
libbtrfs-dev \
libdevmapper-dev \
libvirt-dev \
pkg-config

- name: Run unit tests
10 changes: 5 additions & 5 deletions ARCHITECTURE.md
@@ -59,13 +59,13 @@ Bink manages multi-node Kubernetes clusters where each node runs as a rootless P

Each Kubernetes node is a Podman container running the `localhost/cluster:latest` image (Fedora 43 with libvirt, QEMU, and virtiofsd). Containers are named `k8s-<cluster>-<node>` (e.g., `k8s-dev-node1`) and labeled with `bink.cluster-name` and `bink.node-name` for discovery.

The container runs four libvirt daemons (`virtlogd`, `virtstoraged`, `virtnetworkd`, `virtqemud`) and a `virtiofsd` instance. It requires `/dev/kvm` for hardware virtualization and `/dev/fuse` for virtiofs, plus `SYS_ADMIN` capability. SELinux is disabled inside the container.
The container runs the monolithic `libvirtd` daemon (with TCP socket on port 16509 for the Go bindings to connect) along with `virtlogd` and a `virtiofsd` instance. All modular libvirt daemons (`virtqemud`, `virtproxyd`, `virtnetworkd`, `virtstoraged`) are masked to avoid conflicts with the monolithic daemon. It requires `/dev/kvm` for hardware virtualization and `/dev/fuse` for virtiofs, plus `SYS_ADMIN` capability. SELinux is disabled inside the container.

Control-plane containers publish port 6443 to a random host port for API access within the cluster. External API access from the host goes through the HAProxy load balancer (see below).
All containers publish the libvirt TCP port (16509) to a random host port so bink can connect to libvirtd from the host via the Go bindings. Control-plane containers additionally publish port 6443 for API access within the cluster. External API access from the host goes through the HAProxy load balancer (see below).

### Virtual Machine

Inside each container, a Fedora bootc VM runs via libvirt/QEMU. The VM boots from a qcow2 overlay disk backed by a shared read-only base image (`fedora-bootc-k8s.qcow2`). Cloud-init configures the VM on first boot: hostname, networking, SSH keys, CRI-O, kubelet, and kernel parameters.
Inside each container, a Fedora bootc VM is defined and started using the libvirt Go bindings (`libvirt.org/go/libvirt` and `libvirt.org/go/libvirtxml`). Bink connects to the monolithic libvirtd via `qemu+tcp://localhost:<port>/session`, constructs the domain XML programmatically, and calls `DomainDefineXML` + `Domain.Create`. The VM boots from a qcow2 overlay disk backed by a shared read-only base image (`fedora-bootc-k8s.qcow2`). Cloud-init configures the VM on first boot: hostname, networking, SSH keys, CRI-O, kubelet, and kernel parameters.

The VM runs:
- **CRI-O** as the container runtime
@@ -208,9 +208,9 @@ A cluster starts with a single control-plane node (`node1`) and can grow by addi
1. Create the Podman bridge network for the cluster
2. Create the `cluster-keys` volume and generate an SSH key pair (RSA 4096-bit)
3. Ensure the global `cluster-images` volume is populated
4. Create the node1 container with libvirt daemons
4. Create the node1 container with monolithic libvirtd
5. Create a qcow2 overlay disk and a cloud-init ISO
6. Boot the VM via virt-install with dual NICs and virtiofs
6. Define and start the VM via libvirt Go bindings (`libvirt.org/go/libvirt`) with dual NICs and virtiofs
7. Wait for cloud-init to complete (configures networking, CRI-O, kubelet)
8. Run `kubeadm init` with the node's cluster IP as the advertise address
9. Install Calico CNI and patch CoreDNS for CRI-O compatibility
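
Review note: the ARCHITECTURE.md change above says bink now constructs the domain XML programmatically instead of shelling out to virt-install. As a rough illustration of that idea only, the sketch below builds a very small domain definition with the standard library's `encoding/xml`. The `Domain` and `Memory` types and the `buildDomainXML` helper are hypothetical stand-ins invented for this note; the PR itself uses `libvirt.org/go/libvirtxml`, whose types cover far more of the schema.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Domain is a hypothetical, heavily trimmed stand-in for a libvirt domain
// definition. It is NOT the libvirtxml.Domain type used by the PR.
type Domain struct {
	XMLName xml.Name `xml:"domain"`
	Type    string   `xml:"type,attr"`
	Name    string   `xml:"name"`
	Memory  Memory   `xml:"memory"`
	VCPU    uint     `xml:"vcpu"`
}

// Memory carries a size plus its unit attribute, e.g. <memory unit="MiB">.
type Memory struct {
	Unit  string `xml:"unit,attr"`
	Value uint   `xml:",chardata"`
}

// buildDomainXML renders a minimal KVM domain definition as indented XML.
func buildDomainXML(name string, memMiB, vcpus uint) (string, error) {
	d := Domain{
		Type:   "kvm",
		Name:   name,
		Memory: Memory{Unit: "MiB", Value: memMiB},
		VCPU:   vcpus,
	}
	out, err := xml.MarshalIndent(d, "", "  ")
	if err != nil {
		return "", fmt.Errorf("marshalling domain XML: %w", err)
	}
	return string(out), nil
}

func main() {
	doc, err := buildDomainXML("node1", 4096, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(doc)
}
```

Compared with patching virt-install output through `xpath.set` strings, typed structs let the compiler catch misspelled fields, which is presumably part of the motivation for the switch.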
2 changes: 2 additions & 0 deletions Containerfile
@@ -9,6 +9,7 @@ RUN dnf install -y \
gpgme-devel \
btrfs-progs-devel \
device-mapper-devel \
libvirt-devel \
&& dnf clean all

WORKDIR /build
@@ -32,6 +33,7 @@ RUN dnf install -y \
gpgme \
podman \
kubernetes-client \
libvirt-libs \
&& dnf clean all

COPY --from=builder /output/bink /usr/local/bin/bink
18 changes: 11 additions & 7 deletions containerfiles/cluster-image/Containerfile
@@ -8,7 +8,6 @@ RUN dnf install -y --setopt=install_weak_deps=0 \
libvirt-daemon-driver-storage-core \
libvirt-daemon-driver-network \
qemu-kvm \
virt-install \
virtiofsd \
passt \
iputils \
@@ -19,20 +18,25 @@ RUN dnf install -y --setopt=install_weak_deps=0 \
&& dnf clean all

COPY qemu.conf /etc/libvirt/qemu.conf
COPY virtqemud.conf /etc/libvirt/virtqemud.conf
COPY libvirtd.conf /etc/libvirt/libvirtd.conf
COPY virtiofsd-wrapper /usr/local/bin/virtiofsd-wrapper
COPY virtiofsd.service /etc/systemd/system/virtiofsd.service

RUN mkdir -p /etc/systemd/system/virtqemud.service.d
COPY virtqemud-override.conf /etc/systemd/system/virtqemud.service.d/override.conf
RUN mkdir -p /etc/systemd/system/libvirtd.service.d
COPY libvirtd-override.conf /etc/systemd/system/libvirtd.service.d/override.conf

RUN chmod +x /usr/local/bin/virtiofsd-wrapper
RUN mkdir -p /home/qemu && chown -R qemu:qemu /home/qemu
RUN echo 'root:100000:65536' > /etc/subuid && \
echo 'root:100000:65536' > /etc/subgid
RUN systemctl enable virtqemud.service virtlogd.service virtstoraged.service \
virtnetworkd.service virtiofsd.service && \
systemctl mask systemd-logind.service getty.target console-getty.service
RUN systemctl enable libvirtd.socket libvirtd-tcp.socket \
virtlogd.service virtiofsd.service && \
systemctl mask \
virtqemud.service virtqemud.socket virtqemud-ro.socket virtqemud-admin.socket \
virtproxyd.service virtproxyd.socket virtproxyd-ro.socket virtproxyd-admin.socket \
virtnetworkd.service virtnetworkd.socket virtnetworkd-ro.socket virtnetworkd-admin.socket \
virtstoraged.service virtstoraged.socket virtstoraged-ro.socket virtstoraged-admin.socket \
systemd-logind.service getty.target console-getty.service

STOPSIGNAL SIGRTMIN+3
ENTRYPOINT ["/sbin/init"]
2 changes: 2 additions & 0 deletions containerfiles/cluster-image/libvirtd-override.conf
@@ -0,0 +1,2 @@
[Service]
Environment=LIBVIRTD_ARGS="--timeout 0"
2 changes: 2 additions & 0 deletions containerfiles/cluster-image/libvirtd.conf
@@ -0,0 +1,2 @@
auth_tcp = "none"
log_outputs = "1:stderr"
2 changes: 0 additions & 2 deletions containerfiles/cluster-image/virtqemud-override.conf

This file was deleted.

3 changes: 0 additions & 3 deletions containerfiles/cluster-image/virtqemud.conf

This file was deleted.

2 changes: 2 additions & 0 deletions go.mod
@@ -16,6 +16,8 @@ require (
k8s.io/api v0.35.0
k8s.io/apimachinery v0.35.0
k8s.io/client-go v0.35.0
libvirt.org/go/libvirt v1.12003.0
libvirt.org/go/libvirtxml v1.12002.0
sigs.k8s.io/yaml v1.6.0
)

4 changes: 4 additions & 0 deletions go.sum
@@ -648,6 +648,10 @@ k8s.io/kube-openapi v0.0.0-20250910181357-589584f1c912 h1:Y3gxNAuB0OBLImH611+UDZ
k8s.io/kube-openapi v0.0.0-20250910181357-589584f1c912/go.mod h1:kdmbQkyfwUagLfXIad1y2TdrjPFWp2Q89B3qkRwf/pQ=
k8s.io/utils v0.0.0-20251002143259-bc988d571ff4 h1:SjGebBtkBqHFOli+05xYbK8YF1Dzkbzn+gDM4X9T4Ck=
k8s.io/utils v0.0.0-20251002143259-bc988d571ff4/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
libvirt.org/go/libvirt v1.12003.0 h1:3ek4ObakscdShZRloa9s8/mGhK7xVduqNmAkb15ZEDQ=
libvirt.org/go/libvirt v1.12003.0/go.mod h1:1WiFE8EjZfq+FCVog+rvr1yatKbKZ9FaFMZgEqxEJqQ=
libvirt.org/go/libvirtxml v1.12002.0 h1:NbEHw+R3IZE0vZF1deCQt+6tA+6Io4pAw9RjS7tM4fs=
libvirt.org/go/libvirtxml v1.12002.0/go.mod h1:7Oq2BLDstLr/XtoQD8Fr3mfDNrzlI3utYKySXF2xkng=
sigs.k8s.io/json v0.0.0-20250730193827-2d320260d730 h1:IpInykpT6ceI+QxKBbEflcR5EXP7sU1kvOlxwZh5txg=
sigs.k8s.io/json v0.0.0-20250730193827-2d320260d730/go.mod h1:mdzfpAEoE6DHQEN0uh9ZbOCuHbLK5wOm7dK4ctXE9Tg=
sigs.k8s.io/randfill v1.0.0 h1:JfjMILfT8A6RbawdsK2JXGBR5AQVfd+9TbzrlneTyrU=
7 changes: 7 additions & 0 deletions hack/test-container-image.sh
@@ -2,6 +2,7 @@
set -euo pipefail

BINK_IMAGE="${BINK_IMAGE:-ghcr.io/alicefr/bink/bink:latest}"
CLUSTER_IMAGE="${CLUSTER_IMAGE:-ghcr.io/alicefr/bink/cluster:latest}"
if [ -n "${CONTAINER_HOST:-}" ]; then
PODMAN_SOCK="${CONTAINER_HOST#unix://}"
elif [ -S "/run/podman/podman.sock" ]; then
@@ -56,6 +57,12 @@ run_test() {
# unreachable from inside nested podman networks. Override it so inner aardvark-dns
# forwards queries to a public resolver instead.
podman exec "${nested_container}" bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf'
# Pre-load locally-built images into the nested container to avoid
# pulling from the registry (which may also be stale).
if podman image exists "${CLUSTER_IMAGE}" 2>/dev/null; then
echo "Loading ${CLUSTER_IMAGE} into nested container..."
podman save "${CLUSTER_IMAGE}" | podman exec -i "${nested_container}" podman load
fi
bink_args=(podman exec "${nested_container}" bink)
;;
*)
1 change: 1 addition & 0 deletions internal/config/defaults.go
@@ -37,6 +37,7 @@ const (
ClusterMACPrefix = "52:54:01"

DefaultAPIServerPort = 6443
LibvirtTCPPort = 16509
ServiceCIDR = "10.96.0.0/12"

CalicoVersion = "v3.27.0"
3 changes: 3 additions & 0 deletions internal/node/cleanup.go
@@ -1,5 +1,8 @@
package node

func (n *Node) Cleanup() error {
if n.virsh != nil {
return n.virsh.Close()
}
return nil
}
122 changes: 39 additions & 83 deletions internal/node/create.go
@@ -3,7 +3,6 @@ package node
import (
"context"
"fmt"
"time"

"github.com/bootc-dev/bink/internal/config"
"github.com/bootc-dev/bink/internal/podman"
@@ -74,16 +73,21 @@ func (n *Node) createContainer(ctx context.Context) error {
},
CapAdd: []string{"SYS_ADMIN"},
SelinuxOpts: []string{"disable"},
}

if n.IsControlPlane {
opts.PortMappings = []nettypes.PortMapping{
PortMappings: []nettypes.PortMapping{
{
HostPort: uint16(n.APIPort),
ContainerPort: 6443,
HostPort: 0,
ContainerPort: uint16(config.LibvirtTCPPort),
Protocol: "tcp",
},
}
},
}

if n.IsControlPlane {
opts.PortMappings = append(opts.PortMappings, nettypes.PortMapping{
HostPort: uint16(n.APIPort),
ContainerPort: 6443,
Protocol: "tcp",
})
}

containerID, err := n.podman.ContainerCreate(ctx, opts)
@@ -174,89 +178,41 @@ func (n *Node) createOverlayDisk(ctx context.Context) error {
return nil
}

func (n *Node) waitForVirtqemud(ctx context.Context) error {
logrus.Debug("Waiting for virtqemud socket...")
for i := range 30 {
if err := ctx.Err(); err != nil {
return err
}

err := n.podman.ContainerExecQuiet(ctx, n.ContainerName,
[]string{"test", "-S", "/var/run/libvirt/virtqemud-sock"})
if err == nil {
logrus.Debug("virtqemud socket is ready")
return nil
}
if i == 29 {
return fmt.Errorf("virtqemud socket not ready after 30s")
}
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(time.Second):
}
}
return nil
}

func (n *Node) createVM(ctx context.Context) error {
logrus.Infof("Creating VM %s", n.Name)

if err := n.waitForVirtqemud(ctx); err != nil {
return err
}

overlayDisk := fmt.Sprintf("path=/workspace/%s.qcow2,format=qcow2,bus=virtio", n.Name)
isoPath := fmt.Sprintf("path=/workspace/%s-cloud-init.iso,device=cdrom", n.Name)

maxMemory := n.MaxMemory
if maxMemory == 0 {
maxMemory = n.Memory
if n.Memory <= 0 || n.VCPUs <= 0 {
return fmt.Errorf("invalid VM configuration: memory=%d vcpus=%d (both must be positive)", n.Memory, n.VCPUs)
}

opts := &virsh.VirtInstallOptions{
Name: n.Name,
Memory: n.Memory,
MaxMemory: maxMemory,
VCPUs: n.VCPUs,
Disks: []string{overlayDisk, isoPath},
Networks: []virsh.NetworkConfig{
{
Type: "passt",
Model: "virtio",
PortForward: "2222:22",
},
{
Type: "mcast",
Model: "virtio",
MAC: n.ClusterMAC,
},
},
Filesystems: []virsh.FilesystemConfig{
{
Source: config.VirtiofsSharedDir,
Target: "cluster_images",
AccessMode: "passthrough",
ReadOnly: false,
},
},
XMLModifications: []string{
"xpath.set=./devices/interface[2]/source/@address=" + config.MulticastAddr,
fmt.Sprintf("xpath.set=./devices/interface[2]/source/@port=%d", config.MulticastPort),
"xpath.set=./devices/filesystem/source/@socket=" + config.VirtiofsSocketPath,
},
portForwards := []virsh.PortForward{
{Start: 2222, To: 22},
}

if n.IsControlPlane {
opts.XMLModifications = append(opts.XMLModifications,
"xpath.create=./devices/interface[1]/portForward/range",
"xpath.set=./devices/interface[1]/portForward/range[2]/@start=6443",
"xpath.set=./devices/interface[1]/portForward/range[2]/@to=6443",
)
}

if err := n.virsh.VirtInstall(ctx, opts); err != nil {
return fmt.Errorf("creating VM with virt-install: %w", err)
portForwards = append(portForwards, virsh.PortForward{Start: 6443, To: 6443})
}

opts := []virsh.DomainOption{
virsh.WithKVM(),
virsh.WithName(n.Name),
virsh.WithMemory(uint(n.Memory)),
virsh.WithVCPUs(uint(n.VCPUs)),
virsh.WithQ35OS(),
virsh.WithFeatures(),
virsh.WithCPUHostPassthrough(),
virsh.WithMemoryBackingForVirtiofs(),
virsh.WithDisk(fmt.Sprintf("/workspace/%s.qcow2", n.Name), "qcow2", "vda", "virtio"),
virsh.WithCDROM(fmt.Sprintf("/workspace/%s-cloud-init.iso", n.Name)),
virsh.WithPasstInterface(portForwards),
virsh.WithMcastInterface(n.ClusterMAC, config.MulticastAddr, config.MulticastPort),
virsh.WithVirtiofsSocket(config.VirtiofsSocketPath, "cluster_images"),
virsh.WithSerialConsole(),
virsh.WithGuestAgent(),
}

if err := n.virsh.DefineAndStartDomain(ctx, opts...); err != nil {
return fmt.Errorf("creating VM: %w", err)
}

logrus.Infof("VM %s created with dual-NIC networking", n.Name)
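
Review note: the new `createVM` passes a list of `virsh.DomainOption` values to `DefineAndStartDomain`, which reads like the functional-options pattern. The `internal/virsh` side is not part of the diff shown here, so the sketch below is only a guess at the general shape: `domainSpec`, `buildSpec`, and `portForwards` are hypothetical names, and the real option type may well operate on a `libvirtxml.Domain` instead.

```go
package main

import "fmt"

// PortForward mirrors the Start/To fields used in the diff; this local
// type is a stand-in, not the one from internal/virsh.
type PortForward struct {
	Start uint16
	To    uint16
}

// domainSpec is a hypothetical accumulator that options write into.
type domainSpec struct {
	Name      string
	MemoryMiB uint
	VCPUs     uint
	Forwards  []PortForward
}

// DomainOption mutates a domainSpec. Assumption: the real signature in
// internal/virsh may differ.
type DomainOption func(*domainSpec)

func WithName(name string) DomainOption { return func(s *domainSpec) { s.Name = name } }
func WithMemory(mib uint) DomainOption  { return func(s *domainSpec) { s.MemoryMiB = mib } }
func WithVCPUs(n uint) DomainOption     { return func(s *domainSpec) { s.VCPUs = n } }

// WithPasstInterface copies the slice so later caller-side appends cannot
// alter the spec.
func WithPasstInterface(fwds []PortForward) DomainOption {
	return func(s *domainSpec) { s.Forwards = append([]PortForward(nil), fwds...) }
}

// buildSpec applies the options in order and rejects incomplete specs.
func buildSpec(opts ...DomainOption) (domainSpec, error) {
	var s domainSpec
	for _, opt := range opts {
		opt(&s)
	}
	if s.Name == "" || s.MemoryMiB == 0 || s.VCPUs == 0 {
		return domainSpec{}, fmt.Errorf("incomplete domain spec: %+v", s)
	}
	return s, nil
}

// portForwards follows the diff: SSH on 2222 for every node, plus the
// API server port on control-plane nodes.
func portForwards(isControlPlane bool) []PortForward {
	fwds := []PortForward{{Start: 2222, To: 22}}
	if isControlPlane {
		fwds = append(fwds, PortForward{Start: 6443, To: 6443})
	}
	return fwds
}

func main() {
	spec, err := buildSpec(
		WithName("node1"),
		WithMemory(4096),
		WithVCPUs(2),
		WithPasstInterface(portForwards(true)),
	)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", spec)
}
```

One thing worth checking in the real implementation: the old code defaulted `MaxMemory` to `Memory`, and no equivalent option appears in the new option list.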
6 changes: 6 additions & 0 deletions internal/node/node.go
@@ -167,6 +167,12 @@ func (n *Node) Create(ctx context.Context) error {
return fmt.Errorf("creating container: %w", err)
}

libvirtPort, err := n.podman.GetPublishedPort(ctx, n.ContainerName, fmt.Sprintf("%d/tcp", config.LibvirtTCPPort))
if err != nil {
return fmt.Errorf("getting libvirt TCP port: %w", err)
}
n.virsh.SetLibvirtURI(fmt.Sprintf("qemu+tcp://localhost:%d/session", libvirtPort))

if err := n.setupSSHKeys(ctx); err != nil {
return fmt.Errorf("setting up SSH keys: %w", err)
}
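
Review note: because the container publishes port 16509 with `HostPort: 0`, the host port is only known after creation, which is why the URI is built here from `GetPublishedPort`. A small suggestion, sketched under the assumption that the returned port is an integer: validate it before formatting, so a zero or out-of-range value fails early instead of producing an unusable URI. `libvirtURI` is a hypothetical helper name.

```go
package main

import "fmt"

// libvirtURI builds the qemu+tcp URI for a libvirtd instance published on
// the given host port. Hypothetical helper; the PR formats the URI inline.
func libvirtURI(hostPort int) (string, error) {
	if hostPort < 1 || hostPort > 65535 {
		return "", fmt.Errorf("invalid libvirt host port %d", hostPort)
	}
	return fmt.Sprintf("qemu+tcp://localhost:%d/session", hostPort), nil
}

func main() {
	uri, err := libvirtURI(40123)
	if err != nil {
		panic(err)
	}
	fmt.Println(uri)

	// A zero port (lookup failed silently) is rejected instead of formatted.
	if _, err := libvirtURI(0); err != nil {
		fmt.Println("rejected:", err)
	}
}
```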