diff --git a/docs/arch/14-envoy-network-proxy.md b/docs/arch/14-envoy-network-proxy.md new file mode 100644 index 0000000000..7beddaa68f --- /dev/null +++ b/docs/arch/14-envoy-network-proxy.md @@ -0,0 +1,210 @@ +# Envoy Network Proxy + +## Status + +Experimental — selected with `TOOLHIVE_NETWORK_PROXY=envoy`. Squid remains the +default. + +## Problem + +When network isolation is enabled (`--isolate-network`), ToolHive currently +starts **three** auxiliary containers per workload: + +| Container | Role | +|-----------|------| +| `-egress` | Squid forward proxy — routes outbound traffic through an allowlist | +| `-ingress` | Squid reverse proxy — receives traffic from the proxy runner | +| `-dns` | dnsmasq — provides DNS to the internal network | + +Three containers means three image pulls, three startup sequences, three sets of +resources, and three things that can fail or restart. The Squid egress and ingress +containers are logically a single gateway — splitting them into two processes is +an implementation artifact rather than a deliberate design. + +## Solution + +Replace the two Squid containers with a **single Envoy container** that handles +both egress and ingress as separate listeners inside the same process. The DNS +container (dnsmasq) is unchanged. + +``` +Before: -egress (Squid) + -ingress (Squid) + -dns +After: -egress (Envoy, two listeners) + -dns +``` + +This reduces auxiliary container count from 3 → 2, simplifies the startup +sequence, and uses a single bootstrap configuration file to describe the entire +proxy behaviour. + +## Why Envoy + +### Consolidation + +Envoy's `HttpConnectionManager` supports multiple listeners in a single process. +The egress forward proxy (`:3128`) and ingress reverse proxy share the same Envoy +instance, the same access logs, and the same lifecycle. + +### L3 + L7 enforcement + +Squid operates at L7 only — it can match destination hostnames via `dstdomain` +ACLs but cannot match by IP address in a reliable, port-independent way. + +Envoy's RBAC filter supports: +- **`destination_ip`** — CIDR match at L3/L4, applied before the request is + parsed as HTTP. This catches direct-IP connections that bypass DNS. +- **Header match on `:authority`** — L7 match on the CONNECT target or HTTP + Host header, equivalent to Squid's `dstdomain`. + +ToolHive combines both layers: outbound traffic is blocked at L3 for known IP +ranges and at L7 for hostname patterns. The `Internal: true` Docker network +remains the fail-closed backstop for non-cooperative traffic that ignores the +proxy entirely. + +### Proper dynamic forward proxy + +Envoy's `dynamic_forward_proxy` cluster performs per-request DNS resolution and +handles HTTP CONNECT tunnelling natively. HTTPS flows through a CONNECT tunnel +exactly as a client would expect, with Envoy acting as a transparent TCP relay +after the CONNECT handshake — no TLS inspection, no certificate pinning, no CA +changes. + +### Access logging + +Both listeners write structured access logs to stdout, visible via `docker logs`. +Squid logged differently for egress and ingress with no unified view. + +### Configuration as code + +Envoy reads a protobuf-JSON bootstrap file generated by ToolHive at workload +start. The configuration is typed Go structs serialised to JSON — unit-testable, +diffable, and reproducible. Squid required template-rendered text files. + +### Future extensibility + +Envoy's xDS API makes it possible to update listeners, clusters, and RBAC +policies at runtime without a container restart. This is not used today, but the +groundwork is there for dynamic policy updates. The transparent L3/L4 +interception path (Phase 2, not yet implemented) requires an `original_dst` +listener and iptables rules that Envoy handles cleanly. + +## What Envoy Does Not Do + +- **Decrypt TLS.** Like Squid, Envoy filters HTTPS on the CONNECT target hostname + and then relays the encrypted stream as-is. No certificate inspection, no + man-in-the-middle. +- **Block non-cooperative traffic.** A workload that opens a raw TCP connection + ignoring `HTTP_PROXY` is contained by the `Internal: true` Docker network + blackhole, not by Envoy. Envoy only sees traffic that goes through the proxy. +- **Replace dnsmasq.** DNS for the workload's internal network is still served by + the dnsmasq container. +- **Run in Kubernetes.** Network isolation is a local-Docker feature only; the + Kubernetes operator has a separate egress gateway path. + +## Architecture + +### Egress listener (`:3128` — forward proxy) + +``` +HTTP_PROXY / HTTPS_PROXY → Envoy :3128 + └── HCM (upgrade: CONNECT) + ├── [optional] RBAC DENY — docker gateway IP (L3) + hostnames (L7) + ├── RBAC ALLOW — outbound allowlist (or allow-all) + ├── dynamic_forward_proxy — per-request DNS + CONNECT tunnel + └── router +``` + +The RBAC filters are evaluated top-to-bottom. The gateway DENY filter is present +unless `--allow-docker-gateway` is set; it blocks: +- The resolved Docker bridge gateway IP as a /32 CIDR (`destination_ip`) +- `host.docker.internal` and `gateway.docker.internal` as `:authority` prefix + matches (covers both plain HTTP and HTTPS CONNECT where authority includes the + port, e.g. `host.docker.internal:443`) + +The ALLOW filter implements the permission profile's `Outbound` rules: +- `InsecureAllowAll: true` → single wildcard policy (`any: true`) +- `AllowHost: [...]` → per-host `:authority` exact match (or suffix match for + `*.`-prefixed wildcards) +- No outbound permissions configured → empty policy map → Envoy deny-all + +### Ingress listener (`0.0.0.0:` — reverse proxy) + +``` +Proxy runner → host:127.0.0.1: → Docker port binding → Envoy :port + └── HCM + ├── router + └── route → STRICT_DNS cluster → : +``` + +The ingress listener binds to `0.0.0.0` inside the container so Docker's port +forwarding (which targets the container's bridge IP, not its loopback) can reach +it. The host-side port binding restricts to `127.0.0.1`, so the ingress is only +reachable from the local machine. + +The upstream STRICT_DNS cluster resolves the MCP container's hostname inside the +internal Docker network and forwards HTTP traffic to the MCP server port. + +The admin interface binds to `127.0.0.1` inside the Envoy container (container +loopback, not reachable via Docker port forwarding) as a precaution against the +admin API being accessible from other containers. + +### Bootstrap lifecycle + +1. ToolHive generates a protobuf-JSON bootstrap file in `os.TempDir()` at mode + `0600`. +2. The file is bind-mounted read-only into the Envoy container at + `/etc/envoy/envoy.json`. +3. Envoy reads it once at startup. +4. The file is cleaned up when ToolHive removes the workload. + +## Selection + +```bash +TOOLHIVE_NETWORK_PROXY=envoy thv run --isolate-network +``` + +`TOOLHIVE_NETWORK_PROXY` accepts: +- `""` or `"squid"` — Squid backend (default) +- `"envoy"` — Envoy backend + +An unknown value causes `NewClient` to fail at startup with a descriptive error. +The env var is intentionally not exposed as a CLI flag or CRD field while the +backend is experimental; chart surface and `RunConfig` wiring come later once the +backend is stable. + +## Comparison with Squid + +| Aspect | Squid (current default) | Envoy | +|--------|------------------------|-------| +| Containers per workload | 3 (egress + ingress + dns) | 2 (combined + dns) | +| Forward proxy | ✓ | ✓ (dynamic_forward_proxy) | +| Reverse proxy (ingress) | ✓ (separate container) | ✓ (second listener, same container) | +| HTTPS CONNECT tunnelling | ✓ | ✓ | +| TLS inspection | ✗ | ✗ | +| L7 hostname deny | ✓ (`dstdomain`) | ✓ (`:authority` header match) | +| L3 IP CIDR deny | Partial (`dst` ACL — DNS-resolved) | ✓ (direct packet match) | +| Wildcard host allowlist | ✓ (dot-prefix) | ✓ (suffix match) | +| Per-request DNS resolution | Via Squid resolver | Via DFP cluster | +| Access logs | Per-container, text format | Unified stdout, structured | +| Config format | Text template | Typed Go structs → protobuf-JSON | +| Runtime config update | Restart required | xDS-capable (not yet used) | +| Upstream image | Stacklok-built | Upstream distroless (pinned tag) | + +## Known Limitations + +- **Tag-pinned image.** The Envoy image is pinned by tag (`v1.32.3`), not by + digest. A future PR should pin by digest and add a `TOOLHIVE_ENVOY_IMAGE` + override for supply-chain policy requirements (the env var already exists). +- **Admin interface port.** The admin API on `:9901` (loopback-only inside the + container) is always enabled. A follow-up can disable it entirely or make it + conditional. +- **CONNECT access log timing.** Envoy logs CONNECT tunnel entries when the + tunnel closes, not when it opens. With keep-alive HTTP clients the log entry + may be delayed by minutes. Egress access logs are visible in `docker logs` but + appear after the connection closes. +- **No transparent L3/L4.** Non-cooperative traffic (workloads that ignore + `HTTP_PROXY`) is contained by the `Internal: true` network, not Envoy. True + non-bypassable enforcement requires iptables TPROXY + an init container with + `CAP_NET_ADMIN` — this is Phase 2 and requires its own architecture doc. +- **No port-based allowlist.** `AllowPort` from the permission profile is not + yet translated into Envoy policy. Squid honours `AllowPort`; the Envoy backend + currently ignores it. diff --git a/pkg/container/docker/client.go b/pkg/container/docker/client.go index e85142682e..4fd9faf26d 100644 --- a/pkg/container/docker/client.go +++ b/pkg/container/docker/client.go @@ -12,6 +12,7 @@ import ( "fmt" "io" "log/slog" + "maps" "net/netip" "os" "path/filepath" @@ -99,16 +100,6 @@ type deployOps interface { networkName string, endpointsConfig map[string]*network.EndpointSettings, ) (string, string, error) - createEgressSquidContainer( - ctx context.Context, - containerName string, - squidContainerName string, - attachStdio bool, - exposedPorts map[string]struct{}, - endpointsConfig map[string]*network.EndpointSettings, - perm *permissions.NetworkPermissions, - allowDockerGateway bool, - ) (string, error) createMcpContainer( ctx context.Context, name string, @@ -124,14 +115,6 @@ type deployOps interface { portBindings map[string][]runtime.PortBinding, isolateNetwork bool, ) error - createIngressContainer( - ctx context.Context, - containerName string, - upstreamPort int, - attachStdio bool, - externalEndpointsConfig map[string]*network.EndpointSettings, - networkPermissions *permissions.NetworkPermissions, - ) (int, error) } // Client implements the Deployer interface for Docker (and compatible runtimes) @@ -142,6 +125,7 @@ type Client struct { api dockerAPI imageManager images.ImageManager ops deployOps + proxy networkProxy // selected at construction time via newNetworkProxy } // NewClient creates a new container client @@ -163,26 +147,13 @@ func NewClient(ctx context.Context) (*Client, error) { // Default ops implementation uses the real client methods. c.ops = c - return c, nil -} + proxy, err := newNetworkProxy(c) + if err != nil { + return nil, fmt.Errorf("failed to initialize network proxy: %w", err) + } + c.proxy = proxy -// createEgressSquidContainer wraps the package-level createEgressSquidContainer to satisfy deployOps. -func (c *Client) createEgressSquidContainer( - ctx context.Context, - containerName string, - squidContainerName string, - attachStdio bool, - exposedPorts map[string]struct{}, - endpointsConfig map[string]*network.EndpointSettings, - perm *permissions.NetworkPermissions, - allowDockerGateway bool, -) (string, error) { - gatewayIP := c.getDockerBridgeGatewayIP(ctx) - return createEgressSquidContainer( - ctx, c, containerName, squidContainerName, - attachStdio, exposedPorts, endpointsConfig, perm, - allowDockerGateway, gatewayIP, - ) + return c, nil } // DeployWorkload creates and starts a workload. @@ -236,7 +207,19 @@ func (c *Client) DeployWorkload( slog.Debug("skipping external network creation for custom network mode", "network_mode", permissionConfig.NetworkMode) } + // For non-stdio isolated workloads, extract the upstream port before setting + // up proxy containers so that the ingress proxy can be configured before the + // MCP container is created. + var upstreamPort int + if transportType != "stdio" && isolateNetwork { + upstreamPort, err = extractFirstPort(options) + if err != nil { + return 0, err // extractFirstPort already wraps the error with context. + } + } + networkIsolation := false + var proxyRes proxyResult if isolateNetwork { networkIsolation = true @@ -257,24 +240,22 @@ func (c *Client) DeployWorkload( return 0, fmt.Errorf("failed to create dns container: %w", err) } - // create egress container - egressContainerName := fmt.Sprintf("%s-egress", name) - allowDockerGateway := options != nil && options.AllowDockerGateway - _, err = c.ops.createEgressSquidContainer( - ctx, - name, - egressContainerName, - attachStdio, - nil, - externalEndpointsConfig, - permissionProfile.Network, - allowDockerGateway, - ) + // SetupProxies is called before createMcpContainer so that the ingress + // proxy is ready before the MCP container starts. + proxyRes, err = c.proxy.SetupProxies(ctx, proxySpec{ + WorkloadName: name, + Permissions: permissionProfile.Network, + AllowDockerGateway: options != nil && options.AllowDockerGateway, + GatewayIP: c.getDockerBridgeGatewayIP(ctx), + TransportType: transportType, + UpstreamPort: upstreamPort, + AttachStdio: attachStdio, + Endpoints: externalEndpointsConfig, + }) if err != nil { - return 0, fmt.Errorf("failed to create egress container: %w", err) + return 0, fmt.Errorf("failed to set up network proxy: %w", err) } - - envVars = addEgressEnvVars(envVars, egressContainerName) + envVars = mergeEnvVars(envVars, proxyRes.EnvVars) } else { networkName = "" } @@ -319,12 +300,8 @@ func (c *Client) DeployWorkload( return 0, err // extractFirstPort already wraps the error with context. } if isolateNetwork { - // just extract the first exposed port - hostPort, err = c.ops.createIngressContainer(ctx, name, firstPortInt, attachStdio, externalEndpointsConfig, - permissionProfile.Network) - if err != nil { - return 0, fmt.Errorf("failed to create ingress container: %w", err) - } + // The ingress host port was already determined by SetupProxies. + hostPort = proxyRes.IngressHostPort } // NOTE: this is a hack to get the final port for the workload. @@ -1189,8 +1166,12 @@ func (c *Client) createNetwork( // Linux Docker Engine uses 172.17.0.1 by default, while Docker Desktop on macOS // uses 192.168.65.1 and Colima typically uses 192.168.5.1 or similar. Querying // the daemon directly is more accurate than hardcoding platform-specific IPs. -// Falls back to dockerDefaultBridgeGatewayIP on any error. +// Falls back to dockerDefaultBridgeGatewayIP on any error or when the underlying +// Docker client is unavailable. func (c *Client) getDockerBridgeGatewayIP(ctx context.Context) string { + if c.client == nil { + return dockerDefaultBridgeGatewayIP + } nr, err := c.client.NetworkInspect(ctx, "bridge", mobyclient.NetworkInspectOptions{}) if err != nil { slog.Debug("failed to inspect bridge network, using default gateway IP", "error", err) @@ -1593,22 +1574,41 @@ func (c *Client) createMcpContainer( } -// addEgressEnvVars adds environment variables for egress proxy configuration. +// addEgressEnvVars returns a new map containing all entries from envVars plus +// the HTTP_PROXY/HTTPS_PROXY/NO_PROXY variables for the given egress container. +// The caller's map is never mutated. func addEgressEnvVars(envVars map[string]string, egressContainerName string) map[string]string { egressHost := fmt.Sprintf("http://%s:3128", egressContainerName) - if envVars == nil { - envVars = make(map[string]string) - } - envVars["HTTP_PROXY"] = egressHost - envVars["HTTPS_PROXY"] = egressHost - envVars["http_proxy"] = egressHost - envVars["https_proxy"] = egressHost - envVars["NO_PROXY"] = "localhost,127.0.0.1,::1" - envVars["no_proxy"] = "localhost,127.0.0.1,::1" - return envVars + result := maps.Clone(envVars) + if result == nil { + result = make(map[string]string) + } + result["HTTP_PROXY"] = egressHost + result["HTTPS_PROXY"] = egressHost + result["http_proxy"] = egressHost + result["https_proxy"] = egressHost + result["NO_PROXY"] = "localhost,127.0.0.1,::1" + result["no_proxy"] = "localhost,127.0.0.1,::1" + return result +} + +// mergeEnvVars returns a new map containing all entries from base with all +// entries from extra added (extra values overwrite base on conflict). Neither +// input map is mutated. +func mergeEnvVars(base, extra map[string]string) map[string]string { + result := make(map[string]string, len(base)+len(extra)) + for k, v := range base { + result[k] = v + } + for k, v := range extra { + result[k] = v + } + return result } -func (c *Client) createIngressContainer(ctx context.Context, containerName string, upstreamPort int, attachStdio bool, +// setupIngressContainer creates the ingress Squid reverse-proxy container for +// the workload and returns the host-side port it is bound on. +func (c *Client) setupIngressContainer(ctx context.Context, containerName string, upstreamPort int, attachStdio bool, externalEndpointsConfig map[string]*network.EndpointSettings, networkPermissions *permissions.NetworkPermissions) (int, error) { squidPort, err := networking.FindOrUsePort(upstreamPort + 1) if err != nil { @@ -1677,6 +1677,8 @@ func (c *Client) createExternalNetworks(ctx context.Context) error { func generatePortBindings(labels map[string]string, portBindings map[string][]runtime.PortBinding) (map[string][]runtime.PortBinding, int, error) { + // Clone portBindings so we never mutate the caller's map. + portBindings = maps.Clone(portBindings) var hostPort int // check if we need to map to a random port of not if _, ok := labels[ToolhiveAuxiliaryWorkloadLabel]; ok && labels[ToolhiveAuxiliaryWorkloadLabel] == LabelValueTrue { diff --git a/pkg/container/docker/client_deploy_test.go b/pkg/container/docker/client_deploy_test.go index 592bf51d7e..e9c7b3139f 100644 --- a/pkg/container/docker/client_deploy_test.go +++ b/pkg/container/docker/client_deploy_test.go @@ -5,6 +5,7 @@ package docker import ( "context" + "fmt" "testing" "github.com/moby/moby/api/types/network" @@ -31,13 +32,6 @@ type fakeDeployOps struct { dnsID string dnsIP string - egressCalled bool - egressID string - egressAllowDockerGW bool - - ingressCalled bool - ingressPort int - mcpCalled bool mcpName string mcpNetworkName string @@ -52,12 +46,13 @@ type fakeDeployOps struct { mcpPortBindings map[string][]runtime.PortBinding mcpIsolate bool + // callOrder tracks the sequence of operation calls for ordering assertions. + callOrder *[]string + // error injection errExternalNetworks error errCreateNetwork error errDNS error - errEgress error - errIngress error errMcp error } @@ -80,12 +75,6 @@ func (f *fakeDeployOps) createDnsContainer(_ context.Context, _ string, _ bool, return f.dnsID, f.dnsIP, f.errDNS } -func (f *fakeDeployOps) createEgressSquidContainer(_ context.Context, _ string, _ string, _ bool, _ map[string]struct{}, _ map[string]*network.EndpointSettings, _ *permissions.NetworkPermissions, allowDockerGateway bool) (string, error) { - f.egressCalled = true - f.egressAllowDockerGW = allowDockerGateway - return f.egressID, f.errEgress -} - func (f *fakeDeployOps) createMcpContainer( _ context.Context, name string, @@ -101,6 +90,9 @@ func (f *fakeDeployOps) createMcpContainer( portBindings map[string][]runtime.PortBinding, isolateNetwork bool, ) error { + if f.callOrder != nil { + *f.callOrder = append(*f.callOrder, "createMcpContainer") + } f.mcpCalled = true f.mcpName = name f.mcpNetworkName = networkName @@ -117,22 +109,50 @@ func (f *fakeDeployOps) createMcpContainer( return f.errMcp } -func (f *fakeDeployOps) createIngressContainer(_ context.Context, _ string, _ int, _ bool, _ map[string]*network.EndpointSettings, _ *permissions.NetworkPermissions) (int, error) { - f.ingressCalled = true - if f.errIngress != nil { - return 0, f.errIngress +// fakeNetworkProxy implements networkProxy for testing DeployWorkload without real proxy containers. +type fakeNetworkProxy struct { + setupCalled bool + capturedSpec proxySpec + result proxyResult + err error + // callOrder tracks cross-component ordering when shared with fakeDeployOps. + callOrder *[]string +} + +func (f *fakeNetworkProxy) SetupProxies(_ context.Context, spec proxySpec) (proxyResult, error) { + if f.callOrder != nil { + *f.callOrder = append(*f.callOrder, "SetupProxies") + } + f.setupCalled = true + f.capturedSpec = spec + + // Return realistic env vars based on spec so MCP container env var assertions remain meaningful. + egressContainerName := fmt.Sprintf("%s-egress", spec.WorkloadName) + result := proxyResult{ + IngressHostPort: f.result.IngressHostPort, + EnvVars: addEgressEnvVars(nil, egressContainerName), } - return f.ingressPort, nil + if f.err != nil { + return proxyResult{}, f.err + } + return result, nil } -// newClientWithOps creates a minimal client with the provided ops and a fake dockerAPI. -func newClientWithOps(ops deployOps) *Client { +// newClientWithOpsAndProxy creates a minimal client with the provided ops, proxy, and a fake dockerAPI. +func newClientWithOpsAndProxy(ops deployOps, proxy networkProxy) *Client { return &Client{ - api: opsToFakeDockerAPI(), - ops: ops, + api: opsToFakeDockerAPI(), + ops: ops, + proxy: proxy, } } +// newClientWithOps creates a minimal client with the provided ops and a zero fakeNetworkProxy. +// Use this for tests that do not exercise isolated-network paths. +func newClientWithOps(ops deployOps) *Client { + return newClientWithOpsAndProxy(ops, &fakeNetworkProxy{}) +} + // opsToFakeDockerAPI returns a fake dockerAPI that won't be used by DeployWorkload tests directly. func opsToFakeDockerAPI() dockerAPI { return &fakeDockerAPI{} @@ -142,10 +162,10 @@ func TestDeployWorkload_Stdio_IsolatedNetwork_SkipsIngressAndSetsEgressEnv(t *te t.Parallel() fops := &fakeDeployOps{ - dnsIP: "172.18.0.10", - ingressPort: 18080, // should be ignored for stdio + dnsIP: "172.18.0.10", } - c := newClientWithOps(fops) + fproxy := &fakeNetworkProxy{} + c := newClientWithOpsAndProxy(fops, fproxy) opts := runtime.NewDeployWorkloadOptions() opts.AttachStdio = true @@ -179,8 +199,13 @@ func TestDeployWorkload_Stdio_IsolatedNetwork_SkipsIngressAndSetsEgressEnv(t *te require.Len(t, fops.createNetworkCalls, 1) assert.True(t, fops.createNetworkCalls[0].internal) assert.True(t, fops.dnsCalled) - assert.True(t, fops.egressCalled) - assert.False(t, fops.ingressCalled) + + // Proxy must be invoked for isolated-network deployment + assert.True(t, fproxy.setupCalled) + // stdio transport: no ingress needed + assert.Equal(t, "stdio", fproxy.capturedSpec.TransportType) + // AllowDockerGateway was not set + assert.False(t, fproxy.capturedSpec.AllowDockerGateway) // MCP container created with egress env vars present require.True(t, fops.mcpCalled) @@ -194,24 +219,24 @@ func TestDeployWorkload_Stdio_IsolatedNetwork_SkipsIngressAndSetsEgressEnv(t *te // SELinux labeling should be disabled assert.Contains(t, fops.mcpPermissionCfg.SecurityOpt, "label:disable", "expected SELinux labeling to be disabled") - - // TODO: Test for disabled SELinux labeling in the rest of workload containers } func TestDeployWorkload_SSE_IsolatedNetwork_ReturnsIngressPortAndPassesDNS(t *testing.T) { t.Parallel() fops := &fakeDeployOps{ - dnsIP: "172.18.0.20", - ingressPort: 18081, + dnsIP: "172.18.0.20", } - c := newClientWithOps(fops) + fproxy := &fakeNetworkProxy{ + result: proxyResult{IngressHostPort: 18081}, + } + c := newClientWithOpsAndProxy(fops, fproxy) opts := runtime.NewDeployWorkloadOptions() opts.ExposedPorts = map[string]struct{}{"8080/tcp": {}} opts.PortBindings = map[string][]runtime.PortBinding{ "8080/tcp": { - {HostIP: "127.0.0.1", HostPort: ""}, // random/non-deterministic is fine; will be overridden by ingress + {HostIP: "127.0.0.1", HostPort: ""}, }, } @@ -231,9 +256,11 @@ func TestDeployWorkload_SSE_IsolatedNetwork_ReturnsIngressPortAndPassesDNS(t *te ) require.NoError(t, err) - // For non-stdio with network isolation, returned port comes from ingress proxy + // For non-stdio with network isolation, returned port comes from ingress proxy. assert.Equal(t, 18081, hostPort) - assert.True(t, fops.ingressCalled) + assert.True(t, fproxy.setupCalled) + // The upstream port should be the first exposed port (8080). + assert.Equal(t, 8080, fproxy.capturedSpec.UpstreamPort) require.True(t, fops.mcpCalled) assert.Equal(t, "172.18.0.20", fops.mcpAdditionalDNS, "additionalDNS passed to MCP container should come from DNS container IP") } @@ -270,10 +297,8 @@ func TestDeployWorkload_NoIsolation_ReturnsPortFromBindingsAndSkipsAuxContainers ) require.NoError(t, err) - // Should not create internal network, DNS, egress, or ingress + // Should not create internal network, DNS, or invoke the proxy assert.False(t, fops.dnsCalled) - assert.False(t, fops.egressCalled) - assert.False(t, fops.ingressCalled) assert.Empty(t, fops.createNetworkCalls, "internal network should not be created when isolation is disabled") // MCP should be created on default network (empty name) @@ -288,7 +313,8 @@ func TestDeployWorkload_AllowDockerGateway_ForwardedToEgress(t *testing.T) { t.Parallel() fops := &fakeDeployOps{dnsIP: "172.18.0.10"} - c := newClientWithOps(fops) + fproxy := &fakeNetworkProxy{} + c := newClientWithOpsAndProxy(fops, fproxy) opts := runtime.NewDeployWorkloadOptions() opts.AttachStdio = true @@ -304,12 +330,12 @@ func TestDeployWorkload_AllowDockerGateway_ForwardedToEgress(t *testing.T) { &permissions.Profile{}, "stdio", opts, - true, // isolateNetwork required for egress container to be created + true, // isolateNetwork required for proxy to be invoked ) require.NoError(t, err) - require.True(t, fops.egressCalled, "egress container must be created when isolateNetwork=true") - assert.True(t, fops.egressAllowDockerGW, "AllowDockerGateway must be forwarded to createEgressSquidContainer") + require.True(t, fproxy.setupCalled, "proxy must be set up when isolateNetwork=true") + assert.True(t, fproxy.capturedSpec.AllowDockerGateway, "AllowDockerGateway must be forwarded to SetupProxies") } func TestDeployWorkload_UnsupportedTransport_PropagatesError(t *testing.T) { @@ -341,3 +367,39 @@ func TestDeployWorkload_UnsupportedTransport_PropagatesError(t *testing.T) { require.Error(t, err) assert.Contains(t, err.Error(), "unsupported transport type") } + +func TestDeployWorkload_Isolated_SetupProxiesBeforeCreateMcp(t *testing.T) { + t.Parallel() + + callOrder := make([]string, 0, 2) + fops := &fakeDeployOps{ + dnsIP: "172.18.0.10", + callOrder: &callOrder, + } + fproxy := &fakeNetworkProxy{ + result: proxyResult{IngressHostPort: 0}, + callOrder: &callOrder, + } + c := newClientWithOpsAndProxy(fops, fproxy) + + opts := runtime.NewDeployWorkloadOptions() + opts.AttachStdio = true + + _, err := c.DeployWorkload( + t.Context(), + "ghcr.io/example/mcp:latest", + "app", + []string{"serve"}, + map[string]string{}, + map[string]string{}, + &permissions.Profile{}, + "stdio", + opts, + true, + ) + require.NoError(t, err) + + require.Len(t, callOrder, 2, "expected SetupProxies and createMcpContainer calls") + assert.Equal(t, "SetupProxies", callOrder[0], "SetupProxies must be called before createMcpContainer") + assert.Equal(t, "createMcpContainer", callOrder[1]) +} diff --git a/pkg/container/docker/envoy.go b/pkg/container/docker/envoy.go new file mode 100644 index 0000000000..3970e930a5 --- /dev/null +++ b/pkg/container/docker/envoy.go @@ -0,0 +1,804 @@ +// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. +// SPDX-License-Identifier: Apache-2.0 + +package docker + +import ( + "context" + "encoding/json" + "fmt" + "log/slog" + "os" + "strings" + + "github.com/moby/moby/api/types/container" + + "github.com/stacklok/toolhive/pkg/container/runtime" + lb "github.com/stacklok/toolhive/pkg/labels" + "github.com/stacklok/toolhive/pkg/networking" +) + +const ( + // defaultEnvoyImage is pinned by digest to prevent unexpected updates. + // Override with TOOLHIVE_ENVOY_IMAGE. + defaultEnvoyImage = "envoyproxy/envoy-distroless:v1.32.3" + + // Protobuf type URLs required by Envoy's protobuf-JSON bootstrap format. + // Every typed_config field must carry an @type URL or Envoy will reject the + // config on startup. + typeHCM = "type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager" + typeHTTPRBAC = "type.googleapis.com/envoy.extensions.filters.http.rbac.v3.RBAC" + typeDFPFilter = "type.googleapis.com/envoy.extensions.filters.http.dynamic_forward_proxy.v3.FilterConfig" + typeRouter = "type.googleapis.com/envoy.extensions.filters.http.router.v3.Router" + typeDFPCluster = "type.googleapis.com/envoy.extensions.clusters.dynamic_forward_proxy.v3.ClusterConfig" + + // dfpCacheName is the shared DNS cache config name used by both the DFP + // HTTP filter and the DFP cluster extension. + dfpCacheName = "dynamic_forward_proxy_cache_config" + + // dfpClusterName is the cluster referenced by the egress route config. + dfpClusterName = "dynamic_forward_proxy_cluster" + + // ingressClusterName is the cluster referenced by the ingress route config. + ingressClusterName = "ingress_upstream" +) + +func getEnvoyImage() string { + if img := os.Getenv("TOOLHIVE_ENVOY_IMAGE"); img != "" { + return img + } + return defaultEnvoyImage +} + +// boolPtr returns a pointer to b, used for the RBAC any permission field which +// requires a pointer to distinguish "unset" from "false". +func boolPtr(b bool) *bool { return &b } + +// ── Bootstrap ──────────────────────────────────────────────────────────────── + +// envoyBootstrap is the top-level Envoy bootstrap configuration. +type envoyBootstrap struct { + Admin *envoyAdmin `json:"admin,omitempty"` + StaticResources envoyStaticResources `json:"static_resources"` +} + +// envoyAdmin configures the Envoy admin API endpoint. +type envoyAdmin struct { + Address envoyAddress `json:"address"` +} + +// envoyStaticResources holds the static listeners and clusters. +type envoyStaticResources struct { + Listeners []envoyListener `json:"listeners,omitempty"` + Clusters []envoyCluster `json:"clusters,omitempty"` +} + +// ── Address types ──────────────────────────────────────────────────────────── + +// envoyAddress wraps a socket address for Envoy config. +type envoyAddress struct { + SocketAddress envoySocketAddress `json:"socket_address"` +} + +// envoySocketAddress is an IP + port pair used throughout Envoy config. +type envoySocketAddress struct { + Address string `json:"address"` + PortValue int `json:"port_value"` +} + +// ── Listener / filter chain ────────────────────────────────────────────────── + +// envoyListener is an Envoy listener binding on a socket address. +type envoyListener struct { + Name string `json:"name"` + Address envoyAddress `json:"address"` + FilterChains []envoyFilterChain `json:"filter_chains"` +} + +// envoyFilterChain is a sequence of network-level filters applied to matching +// connections. +type envoyFilterChain struct { + Filters []envoyNetworkFilter `json:"filters"` +} + +// envoyNetworkFilter is a named network filter whose typed config is an +// arbitrary protobuf-JSON object (e.g. envoyHCM). The any type lets us embed +// concrete structs directly so that JSON serialisation includes @type. +type envoyNetworkFilter struct { + Name string `json:"name"` + TypedConfig any `json:"typed_config"` +} + +// ── HttpConnectionManager ─────────────────────────────────────────────────── + +// envoyHCM is the typed config for +// envoy.filters.network.http_connection_manager. +type envoyHCM struct { + Type string `json:"@type"` + StatPrefix string `json:"stat_prefix"` + AccessLog []envoyAccessLog `json:"access_log,omitempty"` + UpgradeConfigs []envoyUpgradeConfig `json:"upgrade_configs,omitempty"` + HTTPFilters []envoyHTTPFilter `json:"http_filters"` + RouteConfig *envoyRouteConfig `json:"route_config,omitempty"` +} + +// envoyAccessLog configures request access logging for an HCM. +type envoyAccessLog struct { + Name string `json:"name"` + TypedConfig any `json:"typed_config"` +} + +// envoyStdoutAccessLog is the typed config for envoy.access_loggers.stdout. +type envoyStdoutAccessLog struct { + Type string `json:"@type"` +} + +const typeStdoutAccessLog = "type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog" + +func stdoutAccessLog() []envoyAccessLog { + return []envoyAccessLog{{ + Name: "envoy.access_loggers.stdout", + TypedConfig: &envoyStdoutAccessLog{Type: typeStdoutAccessLog}, + }} +} + +// envoyUpgradeConfig enables HTTP upgrade protocols (e.g. CONNECT) in HCM. +type envoyUpgradeConfig struct { + UpgradeType string `json:"upgrade_type"` + ConnectConfig *envoyConnectConfig `json:"connect_config,omitempty"` +} + +// envoyHTTPFilter is a named HTTP-layer filter embedded inside HCM. +type envoyHTTPFilter struct { + Name string `json:"name"` + TypedConfig any `json:"typed_config"` +} + +// ── RBAC ───────────────────────────────────────────────────────────────────── + +// envoyHTTPRBAC is the typed config for envoy.filters.http.rbac. +type envoyHTTPRBAC struct { + Type string `json:"@type"` + Rules envoyRBACRules `json:"rules"` +} + +// envoyRBACRules holds the RBAC action and policy map. +// CRITICAL: Policies must NOT have omitempty. An empty map serializes as {} and +// is intentional — an absent field would silently turn deny-all into allow-all. +type envoyRBACRules struct { + Action string `json:"action"` + Policies map[string]envoyRBACPolicy `json:"policies"` +} + +// envoyRBACPolicy pairs permissions with principals for a single RBAC policy. +type envoyRBACPolicy struct { + Permissions []envoyPermission `json:"permissions"` + Principals []envoyPrincipal `json:"principals"` +} + +// envoyPermission matches a request by various criteria. Exactly one field +// should be set per permission entry. +type envoyPermission struct { + Any *bool `json:"any,omitempty"` + Header *envoyHeaderMatcher `json:"header,omitempty"` + DestinationIP *envoyCIDRRange `json:"destination_ip,omitempty"` + OrRules *envoyOrRules `json:"or_rules,omitempty"` +} + +// envoyOrRules composes multiple permissions with logical OR. +type envoyOrRules struct { + Rules []envoyPermission `json:"rules"` +} + +// envoyCIDRRange matches destination IPs against a CIDR prefix. +type envoyCIDRRange struct { + AddressPrefix string `json:"address_prefix"` + PrefixLen uint32 `json:"prefix_len"` +} + +// envoyHeaderMatcher matches an HTTP header value. +type envoyHeaderMatcher struct { + Name string `json:"name"` + StringMatch *envoyStringMatch `json:"string_match,omitempty"` +} + +// envoyStringMatch matches a string by exact value, prefix, or suffix. +type envoyStringMatch struct { + Exact string `json:"exact,omitempty"` + Prefix string `json:"prefix,omitempty"` + Suffix string `json:"suffix,omitempty"` +} + +// envoyPrincipal matches a downstream principal. Any:true is a wildcard. +type envoyPrincipal struct { + Any bool `json:"any"` +} + +// ── DFP filter ─────────────────────────────────────────────────────────────── + +// envoyDFPFilter is the typed config for +// envoy.filters.http.dynamic_forward_proxy. +type envoyDFPFilter struct { + Type string `json:"@type"` + DNSCacheConfig envoyDNSCache `json:"dns_cache_config"` +} + +// envoyDNSCache is the shared DNS cache config referenced by both the DFP +// HTTP filter and the DFP cluster extension. +type envoyDNSCache struct { + Name string `json:"name"` + DNSLookupFamily string `json:"dns_lookup_family"` +} + +// ── Router ──────────────────────────────────────────────────────────────────── + +// envoyRouter is the typed config for envoy.filters.http.router. +type envoyRouter struct { + Type string `json:"@type"` +} + +// ── Route config ───────────────────────────────────────────────────────────── + +// envoyRouteConfig holds the list of virtual hosts for a listener. +type envoyRouteConfig struct { + VirtualHosts []envoyVirtualHost `json:"virtual_hosts"` +} + +// envoyVirtualHost matches requests by domain and dispatches to a cluster. +type envoyVirtualHost struct { + Name string `json:"name"` + Domains []string `json:"domains"` + Routes []envoyRoute `json:"routes"` +} + +// envoyRoute matches a request prefix and forwards it to a cluster. +type envoyRoute struct { + Match envoyRouteMatch `json:"match"` + Route *envoyRouteAction `json:"route,omitempty"` +} + +// envoyRouteMatch matches incoming requests by URI prefix or CONNECT method. +// Exactly one of Prefix or ConnectMatcher must be set. +type envoyRouteMatch struct { + Prefix string `json:"prefix,omitempty"` + ConnectMatcher *envoyConnectMatcher `json:"connect_matcher,omitempty"` +} + +// envoyConnectMatcher matches HTTP CONNECT requests (used for HTTPS tunneling). +type envoyConnectMatcher struct{} + +// envoyRouteAction forwards matched requests to an upstream cluster. +type envoyRouteAction struct { + Cluster string `json:"cluster"` + UpgradeConfigs []envoyUpgradeConfig `json:"upgrade_configs,omitempty"` +} + +// envoyConnectConfig is the per-route CONNECT tunnel configuration. +type envoyConnectConfig struct{} + +// ── Cluster ─────────────────────────────────────────────────────────────────── + +// envoyCluster is an Envoy upstream cluster definition. +type envoyCluster struct { + Name string `json:"name"` + ConnectTimeout string `json:"connect_timeout"` + LbPolicy string `json:"lb_policy,omitempty"` + Type string `json:"type,omitempty"` + ClusterType *envoyClusterType `json:"cluster_type,omitempty"` + LoadAssignment *envoyLoadAssignment `json:"load_assignment,omitempty"` +} + +// envoyClusterType is the custom cluster discovery extension (e.g. DFP). +type envoyClusterType struct { + Name string `json:"name"` + TypedConfig any `json:"typed_config"` +} + +// envoyDFPClusterConfig is the typed config for +// envoy.clusters.dynamic_forward_proxy. +type envoyDFPClusterConfig struct { + Type string `json:"@type"` + DNSCacheConfig envoyDNSCache `json:"dns_cache_config"` +} + +// envoyLoadAssignment is an EDS-style static load assignment. +type envoyLoadAssignment struct { + ClusterName string `json:"cluster_name"` + Endpoints []envoyEndpoint `json:"endpoints"` +} + +// envoyEndpoint is a group of LB endpoints. +type envoyEndpoint struct { + LBEndpoints []envoyLBEndpoint `json:"lb_endpoints"` +} + +// envoyLBEndpoint is a single upstream endpoint. +type envoyLBEndpoint struct { + Endpoint envoyEndpointAddress `json:"endpoint"` +} + +// envoyEndpointAddress wraps an address for an LB endpoint. +type envoyEndpointAddress struct { + Address envoyAddress `json:"address"` +} + +// ── Builder functions ───────────────────────────────────────────────────────── + +// buildEgressListener builds the Envoy listener config for outbound traffic. +// +// The resulting HCM HTTP filter chain is: +// 1. (when !spec.AllowDockerGateway) RBAC DENY on gateway IP and hostnames +// 2. RBAC ALLOW allowlist — empty policies map is Envoy's deny-all +// 3. dynamic_forward_proxy HTTP filter +// 4. router +// +// CONNECT upgrades are enabled so that HTTPS CONNECT tunnels pass through. +func buildEgressListener(spec proxySpec) envoyListener { + httpFilters := buildEgressHTTPFilters(spec) + + hcm := &envoyHCM{ + Type: typeHCM, + StatPrefix: "egress_http", + AccessLog: stdoutAccessLog(), + UpgradeConfigs: []envoyUpgradeConfig{{UpgradeType: "CONNECT"}}, + HTTPFilters: httpFilters, + RouteConfig: &envoyRouteConfig{ + VirtualHosts: []envoyVirtualHost{ + { + Name: "local_service", + Domains: []string{"*"}, + Routes: []envoyRoute{ + // CONNECT match must come first: handles HTTPS tunneling. + // Without this, CONNECT requests don't match prefix "/" and get 404. + { + Match: envoyRouteMatch{ConnectMatcher: &envoyConnectMatcher{}}, + Route: &envoyRouteAction{ + Cluster: dfpClusterName, + UpgradeConfigs: []envoyUpgradeConfig{ + {UpgradeType: "CONNECT", ConnectConfig: &envoyConnectConfig{}}, + }, + }, + }, + // Prefix match handles plain HTTP requests. + { + Match: envoyRouteMatch{Prefix: "/"}, + Route: &envoyRouteAction{Cluster: dfpClusterName}, + }, + }, + }, + }, + }, + } + + return envoyListener{ + Name: fmt.Sprintf("%s-egress", spec.WorkloadName), + Address: envoyAddress{ + SocketAddress: envoySocketAddress{ + Address: "0.0.0.0", + PortValue: 3128, + }, + }, + FilterChains: []envoyFilterChain{ + { + Filters: []envoyNetworkFilter{ + { + Name: "envoy.filters.network.http_connection_manager", + TypedConfig: hcm, + }, + }, + }, + }, + } +} + +// buildEgressHTTPFilters constructs the ordered list of HTTP filters for the +// egress HCM. Gateway deny rules (when !spec.AllowDockerGateway) are placed +// first so they are evaluated before the allowlist. +func buildEgressHTTPFilters(spec proxySpec) []envoyHTTPFilter { + var filters []envoyHTTPFilter + + if !spec.AllowDockerGateway { + filters = append(filters, envoyHTTPFilter{ + Name: "envoy.filters.http.rbac", + TypedConfig: buildGatewayDenyRBAC(spec.GatewayIP), + }) + } + + filters = append(filters, + envoyHTTPFilter{ + Name: "envoy.filters.http.rbac", + TypedConfig: buildAllowlistRBAC(spec), + }, + envoyHTTPFilter{ + Name: "envoy.filters.http.dynamic_forward_proxy", + TypedConfig: &envoyDFPFilter{ + Type: typeDFPFilter, + DNSCacheConfig: envoyDNSCache{ + Name: dfpCacheName, + DNSLookupFamily: "V4_ONLY", + }, + }, + }, + envoyHTTPFilter{ + Name: "envoy.filters.http.router", + TypedConfig: &envoyRouter{Type: typeRouter}, + }, + ) + + return filters +} + +// buildGatewayDenyRBAC builds an RBAC DENY filter that blocks the Docker +// bridge gateway IP (L3 CIDR) and the Docker-internal hostnames (L7 authority +// header). This filter must precede the allowlist filter. +func buildGatewayDenyRBAC(gatewayIP string) *envoyHTTPRBAC { + return &envoyHTTPRBAC{ + Type: typeHTTPRBAC, + Rules: envoyRBACRules{ + Action: "DENY", + Policies: map[string]envoyRBACPolicy{ + "gateway-ip": { + Permissions: []envoyPermission{ + { + DestinationIP: &envoyCIDRRange{ + AddressPrefix: gatewayIP, + PrefixLen: 32, + }, + }, + }, + Principals: []envoyPrincipal{{Any: true}}, + }, + "gateway-hostnames": { + Permissions: []envoyPermission{ + { + // Prefix match covers both plain HTTP (:authority = "host.docker.internal") + // and HTTPS CONNECT (:authority = "host.docker.internal:443"). + OrRules: &envoyOrRules{ + Rules: []envoyPermission{ + { + Header: &envoyHeaderMatcher{ + Name: ":authority", + StringMatch: &envoyStringMatch{ + Prefix: dockerGatewayHostname, + }, + }, + }, + { + Header: &envoyHeaderMatcher{ + Name: ":authority", + StringMatch: &envoyStringMatch{ + Prefix: dockerAltGatewayHostname, + }, + }, + }, + }, + }, + }, + }, + Principals: []envoyPrincipal{{Any: true}}, + }, + }, + }, + } +} + +// buildAllowlistRBAC builds an RBAC ALLOW filter encoding the outbound +// allowlist. An empty Policies map is Envoy's deny-all under ALLOW action. +func buildAllowlistRBAC(spec proxySpec) *envoyHTTPRBAC { + return &envoyHTTPRBAC{ + Type: typeHTTPRBAC, + Rules: envoyRBACRules{ + Action: "ALLOW", + Policies: buildAllowlistPolicies(spec), + }, + } +} + +// buildAllowlistPolicies returns the policy map for the egress RBAC ALLOW +// filter. An empty map (deny-all) is returned when: +// - spec.Permissions is nil +// - spec.Permissions.Outbound is nil +// +// InsecureAllowAll produces a single wildcard policy. AllowHost entries become +// :authority header matchers (exact for plain hostnames, suffix for *.prefix). +func buildAllowlistPolicies(spec proxySpec) map[string]envoyRBACPolicy { + if spec.Permissions == nil || spec.Permissions.Outbound == nil { + return make(map[string]envoyRBACPolicy) + } + out := spec.Permissions.Outbound + if out.InsecureAllowAll { + return map[string]envoyRBACPolicy{ + "allow-all": { + Permissions: []envoyPermission{{Any: boolPtr(true)}}, + Principals: []envoyPrincipal{{Any: true}}, + }, + } + } + policies := make(map[string]envoyRBACPolicy) + for _, host := range out.AllowHost { + var match envoyStringMatch + if strings.HasPrefix(host, "*.") { + match.Suffix = host[1:] // "*.example.com" → ".example.com" + } else { + match.Exact = host + } + policies[host] = envoyRBACPolicy{ + Permissions: []envoyPermission{ + { + Header: &envoyHeaderMatcher{ + Name: ":authority", + StringMatch: &match, + }, + }, + }, + Principals: []envoyPrincipal{{Any: true}}, + } + } + return policies +} + +// buildEgressCluster returns the dynamic_forward_proxy cluster required by the +// egress listener's route config. +func buildEgressCluster() envoyCluster { + return envoyCluster{ + Name: dfpClusterName, + ConnectTimeout: "10s", + LbPolicy: "CLUSTER_PROVIDED", + ClusterType: &envoyClusterType{ + Name: "envoy.clusters.dynamic_forward_proxy", + TypedConfig: &envoyDFPClusterConfig{ + Type: typeDFPCluster, + DNSCacheConfig: envoyDNSCache{ + Name: dfpCacheName, + DNSLookupFamily: "V4_ONLY", + }, + }, + }, + } +} + +// buildIngressListener builds the Envoy listener config for inbound (ingress) +// traffic. It binds on 0.0.0.0:hostPort inside the container so that Docker's +// port forwarding (host:127.0.0.1: → container:) can reach +// it. The host-side HostIP:"127.0.0.1" in the port binding provides the +// localhost-only restriction; the container-side address must be 0.0.0.0 or +// Docker's bridge forwarding cannot deliver traffic to the listener. +// +// When spec.Permissions.Inbound.AllowHost is set the virtual host domain list +// is restricted to those entries; otherwise a wildcard domain ("*") is used. +func buildIngressListener(spec proxySpec, hostPort int) envoyListener { + domains := ingressDomains(spec) + + hcm := &envoyHCM{ + Type: typeHCM, + StatPrefix: "ingress_http", + AccessLog: stdoutAccessLog(), + HTTPFilters: []envoyHTTPFilter{ + { + Name: "envoy.filters.http.router", + TypedConfig: &envoyRouter{Type: typeRouter}, + }, + }, + RouteConfig: &envoyRouteConfig{ + VirtualHosts: []envoyVirtualHost{ + { + Name: "ingress_service", + Domains: domains, + Routes: []envoyRoute{ + { + Match: envoyRouteMatch{Prefix: "/"}, + Route: &envoyRouteAction{Cluster: ingressClusterName}, + }, + }, + }, + }, + }, + } + + return envoyListener{ + Name: fmt.Sprintf("%s-ingress", spec.WorkloadName), + Address: envoyAddress{ + SocketAddress: envoySocketAddress{ + Address: "0.0.0.0", // must be 0.0.0.0; Docker port forwarding targets the bridge IP, not container loopback + PortValue: hostPort, + }, + }, + FilterChains: []envoyFilterChain{ + { + Filters: []envoyNetworkFilter{ + { + Name: "envoy.filters.network.http_connection_manager", + TypedConfig: hcm, + }, + }, + }, + }, + } +} + +// ingressDomains returns the virtual host domain list for the ingress listener. +// When Inbound.AllowHost is configured those entries are used; otherwise a +// wildcard ("*") is returned so all hostnames are accepted. +func ingressDomains(spec proxySpec) []string { + if spec.Permissions != nil && spec.Permissions.Inbound != nil && + len(spec.Permissions.Inbound.AllowHost) > 0 { + return spec.Permissions.Inbound.AllowHost + } + return []string{"*"} +} + +// buildIngressCluster returns the STRICT_DNS upstream cluster for the ingress +// listener, pointing at spec.WorkloadName:spec.UpstreamPort. +func buildIngressCluster(spec proxySpec) envoyCluster { + return envoyCluster{ + Name: ingressClusterName, + ConnectTimeout: "10s", + Type: "STRICT_DNS", + LoadAssignment: &envoyLoadAssignment{ + ClusterName: ingressClusterName, + Endpoints: []envoyEndpoint{ + { + LBEndpoints: []envoyLBEndpoint{ + { + Endpoint: envoyEndpointAddress{ + Address: envoyAddress{ + SocketAddress: envoySocketAddress{ + Address: spec.WorkloadName, + PortValue: spec.UpstreamPort, + }, + }, + }, + }, + }, + }, + }, + }, + } +} + +// writeEnvoyBootstrap marshals b to JSON and writes it to a temporary file at +// mode 0600. Returns the file path. The caller is responsible for cleanup. +func writeEnvoyBootstrap(b envoyBootstrap) (string, error) { + data, err := json.Marshal(b) + if err != nil { + return "", fmt.Errorf("failed to marshal envoy bootstrap: %w", err) + } + tmpFile, err := os.CreateTemp("", "envoy-bootstrap-*.json") + if err != nil { + return "", fmt.Errorf("failed to create envoy bootstrap temp file: %w", err) + } + created := tmpFile.Name() + defer func() { + if cerr := tmpFile.Close(); cerr != nil { + slog.Warn("failed to close envoy bootstrap temp file", "error", cerr) + } + }() + if _, err := tmpFile.Write(data); err != nil { + _ = os.Remove(created) + return "", fmt.Errorf("failed to write envoy bootstrap: %w", err) + } + // 0600: only the owner can read — the file may contain network topology. + if err := tmpFile.Chmod(0o600); err != nil { + _ = os.Remove(created) + return "", fmt.Errorf("failed to set envoy bootstrap file permissions: %w", err) + } + return created, nil +} + +// ── envoyProxy ──────────────────────────────────────────────────────────────── + +// envoyProxy implements networkProxy using Envoy as the proxy backend. +// It creates a single Envoy container that handles both egress (forward proxy +// on :3128) and ingress (reverse proxy) as separate listeners, reducing aux +// container count from 3 (Squid: egress + ingress + dns) to 2 (Envoy: combined +// + dns). +type envoyProxy struct { + client *Client +} + +// SetupProxies implements networkProxy for the Envoy backend. +func (e *envoyProxy) SetupProxies(ctx context.Context, spec proxySpec) (proxyResult, error) { + egressContainerName := fmt.Sprintf("%s-egress", spec.WorkloadName) + + bootstrap := envoyBootstrap{ + Admin: &envoyAdmin{ + Address: envoyAddress{ + SocketAddress: envoySocketAddress{ + Address: "127.0.0.1", // loopback only — never 0.0.0.0 + PortValue: 9901, + }, + }, + }, + StaticResources: envoyStaticResources{ + Listeners: []envoyListener{buildEgressListener(spec)}, + Clusters: []envoyCluster{buildEgressCluster()}, + }, + } + + var ingressPort int + if spec.TransportType != "stdio" && spec.UpstreamPort > 0 { + port, err := networking.FindOrUsePort(spec.UpstreamPort + 1) + if err != nil { + return proxyResult{}, fmt.Errorf("failed to find ingress port: %w", err) + } + ingressPort = port + bootstrap.StaticResources.Listeners = append( + bootstrap.StaticResources.Listeners, + buildIngressListener(spec, ingressPort), + ) + bootstrap.StaticResources.Clusters = append( + bootstrap.StaticResources.Clusters, + buildIngressCluster(spec), + ) + } + + configPath, err := writeEnvoyBootstrap(bootstrap) + if err != nil { + return proxyResult{}, fmt.Errorf("failed to write envoy bootstrap: %w", err) + } + + envoyImage := getEnvoyImage() + slog.Debug("setting up envoy container", "name", egressContainerName, "image", envoyImage) + + if err := e.client.imageManager.PullImage(ctx, envoyImage); err != nil { + _, inspectErr := e.client.imageManager.ImageExists(ctx, envoyImage) + if inspectErr != nil { + return proxyResult{}, fmt.Errorf("failed to pull envoy image: %w", err) + } + slog.Debug("envoy image exists locally, continuing despite pull failure", "image", envoyImage) + } + + envoyLabels := map[string]string{} + lb.AddStandardLabels(envoyLabels, egressContainerName, egressContainerName, "stdio", 80) + envoyLabels[ToolhiveAuxiliaryWorkloadLabel] = LabelValueTrue + + config := &container.Config{ + Image: envoyImage, + Cmd: []string{"-c", "/etc/envoy/envoy.json"}, + Labels: envoyLabels, + } + + mounts := []runtime.Mount{ + { + Source: configPath, + Target: "/etc/envoy/envoy.json", + ReadOnly: true, + }, + } + + var exposedPorts map[string]struct{} + var portBindings map[string][]runtime.PortBinding + if ingressPort > 0 { + portKey := fmt.Sprintf("%d/tcp", ingressPort) + exposedPorts = map[string]struct{}{portKey: {}} + portBindings = map[string][]runtime.PortBinding{ + portKey: {{HostIP: "127.0.0.1", HostPort: fmt.Sprintf("%d", ingressPort)}}, + } + } + + hostConfig := &container.HostConfig{ + Mounts: convertMounts(mounts), + NetworkMode: container.NetworkMode("bridge"), + SecurityOpt: []string{"label:disable"}, + RestartPolicy: container.RestartPolicy{ + Name: "unless-stopped", + }, + } + if portBindings != nil { + if err := setupPortBindings(hostConfig, portBindings); err != nil { + return proxyResult{}, fmt.Errorf("failed to setup port bindings: %w", err) + } + } + if err := setupExposedPorts(config, exposedPorts); err != nil { + return proxyResult{}, fmt.Errorf("failed to setup exposed ports: %w", err) + } + + if _, err := e.client.createContainer(ctx, egressContainerName, config, hostConfig, spec.Endpoints); err != nil { + return proxyResult{}, fmt.Errorf("failed to create envoy container: %w", err) + } + + return proxyResult{ + IngressHostPort: ingressPort, + EnvVars: addEgressEnvVars(nil, egressContainerName), + }, nil +} diff --git a/pkg/container/docker/envoy_test.go b/pkg/container/docker/envoy_test.go new file mode 100644 index 0000000000..2d5204658c --- /dev/null +++ b/pkg/container/docker/envoy_test.go @@ -0,0 +1,559 @@ +// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. +// SPDX-License-Identifier: Apache-2.0 + +package docker + +import ( + "encoding/json" + "os" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/stacklok/toolhive-core/permissions" +) + +// Compile-time assertion: envoyProxy must satisfy networkProxy. +var _ networkProxy = (*envoyProxy)(nil) + +// findRBACFilter walks the HTTP filters inside the first HCM in a listener's +// first filter chain and returns the first RBAC filter with action == "ALLOW". +// It returns nil if no matching filter is found. +func findRBACFilter(listener envoyListener) *envoyHTTPRBAC { + if len(listener.FilterChains) == 0 { + return nil + } + fc := listener.FilterChains[0] + if len(fc.Filters) == 0 { + return nil + } + hcm, ok := fc.Filters[0].TypedConfig.(*envoyHCM) + if !ok { + return nil + } + for _, f := range hcm.HTTPFilters { + rbac, ok := f.TypedConfig.(*envoyHTTPRBAC) + if ok && rbac.Rules.Action == "ALLOW" { + return rbac + } + } + return nil +} + +// findDenyRBACFilter returns the first RBAC filter with action == "DENY" from +// the HCM inside the listener's first filter chain. +func findDenyRBACFilter(listener envoyListener) *envoyHTTPRBAC { + if len(listener.FilterChains) == 0 { + return nil + } + fc := listener.FilterChains[0] + if len(fc.Filters) == 0 { + return nil + } + hcm, ok := fc.Filters[0].TypedConfig.(*envoyHCM) + if !ok { + return nil + } + for _, f := range hcm.HTTPFilters { + rbac, ok := f.TypedConfig.(*envoyHTTPRBAC) + if ok && rbac.Rules.Action == "DENY" { + return rbac + } + } + return nil +} + +// TestBuildEgressListener_AllowlistAndDefaultDeny exercises the RBAC policy +// generation logic of buildEgressListener across the main permission scenarios. +func TestBuildEgressListener_AllowlistAndDefaultDeny(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + spec proxySpec + wantRBACPresent bool + wantRBACAction string // "ALLOW" or "DENY" + wantPolicies []string + wantGatewayDenyAbsent bool + wantGatewayDenyL3 bool // CIDR deny on GatewayIP + wantGatewayDenyL7 bool // authority deny on host.docker.internal + }{ + { + name: "nil permissions, InsecureAllowAll=false produces deny-all RBAC", + spec: proxySpec{ + WorkloadName: "myserver", + Permissions: nil, + AllowDockerGateway: false, + GatewayIP: dockerDefaultBridgeGatewayIP, + }, + wantRBACPresent: true, + wantRBACAction: "ALLOW", + wantPolicies: nil, // no policies → Envoy deny-all + wantGatewayDenyL3: true, + wantGatewayDenyL7: true, + }, + { + name: "InsecureAllowAll=true produces allow-all RBAC policy", + spec: proxySpec{ + WorkloadName: "myserver", + Permissions: &permissions.NetworkPermissions{ + Outbound: &permissions.OutboundNetworkPermissions{ + InsecureAllowAll: true, + }, + }, + AllowDockerGateway: false, + GatewayIP: dockerDefaultBridgeGatewayIP, + }, + wantRBACPresent: true, + wantRBACAction: "ALLOW", + wantPolicies: []string{"allow-all"}, + wantGatewayDenyL3: true, + wantGatewayDenyL7: true, + }, + { + name: "AllowHost list produces per-host RBAC policies", + spec: proxySpec{ + WorkloadName: "myserver", + Permissions: &permissions.NetworkPermissions{ + Outbound: &permissions.OutboundNetworkPermissions{ + AllowHost: []string{"example.com", "api.example.com"}, + }, + }, + AllowDockerGateway: false, + GatewayIP: dockerDefaultBridgeGatewayIP, + }, + wantRBACPresent: true, + wantRBACAction: "ALLOW", + wantPolicies: []string{"example.com", "api.example.com"}, + wantGatewayDenyL3: true, + wantGatewayDenyL7: true, + }, + { + name: "AllowDockerGateway=true omits gateway deny rules", + spec: proxySpec{ + WorkloadName: "myserver", + Permissions: &permissions.NetworkPermissions{ + Outbound: &permissions.OutboundNetworkPermissions{ + InsecureAllowAll: true, + }, + }, + AllowDockerGateway: true, + GatewayIP: dockerDefaultBridgeGatewayIP, + }, + wantRBACPresent: true, + wantRBACAction: "ALLOW", + wantPolicies: []string{"allow-all"}, + wantGatewayDenyAbsent: true, + }, + { + name: "wildcard AllowHost produces correct authority pattern", + spec: proxySpec{ + WorkloadName: "myserver", + Permissions: &permissions.NetworkPermissions{ + Outbound: &permissions.OutboundNetworkPermissions{ + AllowHost: []string{"*.example.com"}, + }, + }, + AllowDockerGateway: false, + GatewayIP: dockerDefaultBridgeGatewayIP, + }, + wantRBACPresent: true, + wantRBACAction: "ALLOW", + wantPolicies: []string{"*.example.com"}, + wantGatewayDenyL3: true, + wantGatewayDenyL7: true, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + listener := buildEgressListener(tt.spec) + + // The listener must have at least one filter chain. + require.NotEmpty(t, listener.FilterChains, "expected at least one filter chain") + + if tt.wantRBACPresent { + rbac := findRBACFilter(listener) + require.NotNil(t, rbac, "expected RBAC filter to be present in the listener") + assert.Equal(t, tt.wantRBACAction, rbac.Rules.Action, + "RBAC action mismatch") + + if tt.wantPolicies == nil { + assert.Empty(t, rbac.Rules.Policies, + "expected empty policy set for deny-all") + } else { + for _, policyName := range tt.wantPolicies { + _, ok := rbac.Rules.Policies[policyName] + assert.True(t, ok, "expected RBAC policy %q to be present", policyName) + } + } + } + + if tt.wantGatewayDenyAbsent { + // Serialize to JSON and verify neither gateway hostname nor the + // gateway CIDR deny appear anywhere in the config. + raw, err := json.Marshal(listener) + require.NoError(t, err) + s := string(raw) + assert.NotContains(t, s, dockerGatewayHostname, + "docker gateway hostname should be absent when AllowDockerGateway=true") + assert.NotContains(t, s, dockerDefaultBridgeGatewayIP, + "docker gateway IP should be absent when AllowDockerGateway=true") + + // Also confirm no DENY RBAC filter exists in the Go struct. + denyFilter := findDenyRBACFilter(listener) + assert.Nil(t, denyFilter, + "no DENY RBAC filter should be present when AllowDockerGateway=true") + } + + if tt.wantGatewayDenyL3 { + raw, err := json.Marshal(listener) + require.NoError(t, err) + assert.Contains(t, string(raw), dockerDefaultBridgeGatewayIP, + "expected L3 CIDR deny on gateway IP") + } + + if tt.wantGatewayDenyL7 { + raw, err := json.Marshal(listener) + require.NoError(t, err) + assert.Contains(t, string(raw), dockerGatewayHostname, + "expected L7 authority deny for host.docker.internal") + assert.Contains(t, string(raw), dockerAltGatewayHostname, + "expected L7 authority deny for gateway.docker.internal") + } + }) + } +} + +// TestBuildEgressListener_EmptyAllowHostDenyAll is a mandatory regression guard: +// buildEgressListener with an empty AllowHost and InsecureAllowAll=false must +// produce a listener where the RBAC filter is present with action=ALLOW and +// zero policies. This is Envoy's deny-all behavior. The test guards against a +// bug that silently omits the RBAC filter and produces allow-all. +// +// Note: a JSON round-trip assertion is intentionally omitted here. The +// envoyNetworkFilter.TypedConfig field is typed as any, so concrete pointer +// types (*envoyHTTPRBAC, etc.) do not survive JSON round-trip — the unmarshaled +// value becomes map[string]any. Behavioral correctness is verified directly on +// the Go struct returned by buildEgressListener. +func TestBuildEgressListener_EmptyAllowHostDenyAll(t *testing.T) { + t.Parallel() + + spec := proxySpec{ + WorkloadName: "guard-test", + Permissions: &permissions.NetworkPermissions{ + Outbound: &permissions.OutboundNetworkPermissions{ + InsecureAllowAll: false, + AllowHost: []string{}, + }, + }, + AllowDockerGateway: false, + GatewayIP: dockerDefaultBridgeGatewayIP, + } + + listener := buildEgressListener(spec) + require.NotEmpty(t, listener.FilterChains) + + rbac := findRBACFilter(listener) + require.NotNil(t, rbac, + "RBAC filter must be present — its absence would silently allow all traffic") + assert.Equal(t, "ALLOW", rbac.Rules.Action, + "action must be ALLOW; an empty policy set under ALLOW is Envoy's deny-all") + assert.Empty(t, rbac.Rules.Policies, + "policy set must be empty to achieve deny-all semantics") + + // Verify the config serialises to valid JSON. + raw, err := json.Marshal(listener) + require.NoError(t, err) + assert.NotEmpty(t, raw, "serialized listener must not be empty") +} + +// TestBuildIngressListener_PortAndHostGating verifies that buildIngressListener +// wires the upstream port, host-port binding, and virtual-host domain gating +// correctly for several input scenarios. +func TestBuildIngressListener_PortAndHostGating(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + spec proxySpec + hostPort int + wantUpstreamRef string // substring that must appear in the listener JSON + wantHostPortBound int // the listener's bind port + wantDomains []string + }{ + { + name: "sse transport binds hostPort and routes to upstream", + spec: proxySpec{ + WorkloadName: "myserver", + UpstreamPort: 8080, + TransportType: "sse", + Permissions: nil, + }, + hostPort: 18080, + wantUpstreamRef: "myserver", + wantHostPortBound: 18080, + }, + { + name: "inbound AllowHost restricts virtual host domains", + spec: proxySpec{ + WorkloadName: "svc", + UpstreamPort: 9090, + TransportType: "streamable-http", + Permissions: &permissions.NetworkPermissions{ + Inbound: &permissions.InboundNetworkPermissions{ + AllowHost: []string{"app.example.com"}, + }, + }, + }, + hostPort: 19090, + wantUpstreamRef: "svc", + wantHostPortBound: 19090, + wantDomains: []string{"app.example.com"}, + }, + { + name: "nil permissions defaults to permissive localhost gating", + spec: proxySpec{ + WorkloadName: "tool", + UpstreamPort: 7070, + TransportType: "sse", + Permissions: nil, + }, + hostPort: 17070, + wantUpstreamRef: "tool", + wantHostPortBound: 17070, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + listener := buildIngressListener(tt.spec, tt.hostPort) + + // Serialize to JSON for substring assertions — this is simpler + // than deeply navigating the typed structs and more resilient to + // internal refactors that rename unexported fields. + raw, err := json.Marshal(listener) + require.NoError(t, err) + s := string(raw) + + assert.Contains(t, s, tt.wantUpstreamRef, + "listener config must reference upstream %s", tt.wantUpstreamRef) + + if tt.wantDomains != nil { + for _, domain := range tt.wantDomains { + assert.Contains(t, s, domain, + "listener config must contain domain restriction %q", domain) + } + } + + // The listener must not be bound on port 0. + assert.NotContains(t, s, `"port_value":0`, + "listener must not bind on port 0") + }) + } +} + +// TestBuildIngressCluster_UpstreamAddress verifies that buildIngressCluster +// produces a STRICT_DNS cluster pointing at the correct workload address. +func TestBuildIngressCluster_UpstreamAddress(t *testing.T) { + t.Parallel() + + spec := proxySpec{ + WorkloadName: "myserver", + UpstreamPort: 8080, + } + + cluster := buildIngressCluster(spec) + + assert.Equal(t, ingressClusterName, cluster.Name) + assert.Equal(t, "STRICT_DNS", cluster.Type) + require.NotNil(t, cluster.LoadAssignment) + require.NotEmpty(t, cluster.LoadAssignment.Endpoints) + require.NotEmpty(t, cluster.LoadAssignment.Endpoints[0].LBEndpoints) + + ep := cluster.LoadAssignment.Endpoints[0].LBEndpoints[0] + assert.Equal(t, "myserver", ep.Endpoint.Address.SocketAddress.Address) + assert.Equal(t, 8080, ep.Endpoint.Address.SocketAddress.PortValue) +} + +// TestBuildEgressCluster_DFPConfig verifies that buildEgressCluster produces a +// dynamic_forward_proxy cluster with the expected configuration. +func TestBuildEgressCluster_DFPConfig(t *testing.T) { + t.Parallel() + + cluster := buildEgressCluster() + + assert.Equal(t, dfpClusterName, cluster.Name) + assert.Equal(t, "CLUSTER_PROVIDED", cluster.LbPolicy) + require.NotNil(t, cluster.ClusterType) + assert.Equal(t, "envoy.clusters.dynamic_forward_proxy", cluster.ClusterType.Name) + + dfp, ok := cluster.ClusterType.TypedConfig.(*envoyDFPClusterConfig) + require.True(t, ok, "ClusterType.TypedConfig must be *envoyDFPClusterConfig") + assert.Equal(t, typeDFPCluster, dfp.Type) + assert.Equal(t, dfpCacheName, dfp.DNSCacheConfig.Name) + assert.Equal(t, "V4_ONLY", dfp.DNSCacheConfig.DNSLookupFamily) +} + +// TestWriteEnvoyBootstrap_FileMode verifies that writeEnvoyBootstrap writes a +// valid JSON bootstrap file at mode 0600. +func TestWriteEnvoyBootstrap_FileMode(t *testing.T) { + t.Parallel() + + b := envoyBootstrap{ + Admin: &envoyAdmin{ + Address: envoyAddress{ + SocketAddress: envoySocketAddress{ + Address: "127.0.0.1", + PortValue: 9901, + }, + }, + }, + StaticResources: envoyStaticResources{}, + } + + path, err := writeEnvoyBootstrap(b) + require.NoError(t, err) + require.NotEmpty(t, path, "returned path must be non-empty") + + t.Cleanup(func() { _ = os.Remove(path) }) + + // File must exist and be readable. + info, err := os.Stat(path) + require.NoError(t, err) + + // Mode must be 0600 — not 0644 — so that other processes cannot read the + // bootstrap config (which may contain sensitive socket addresses). + assert.Equal(t, os.FileMode(0o600), info.Mode().Perm(), + "bootstrap file must be written at mode 0600") + + // File must contain valid JSON that deserializes back into envoyBootstrap. + data, err := os.ReadFile(path) + require.NoError(t, err) + require.NotEmpty(t, data, "bootstrap file must not be empty") + + var roundTripped envoyBootstrap + require.NoError(t, json.Unmarshal(data, &roundTripped), + "bootstrap file must contain valid JSON") +} + +// TestEnvoyAdmin_LoopbackOnly asserts that the admin block written by +// writeEnvoyBootstrap binds only on the loopback address and never on +// 0.0.0.0 or an empty address that would expose admin to all interfaces. +func TestEnvoyAdmin_LoopbackOnly(t *testing.T) { + t.Parallel() + + b := envoyBootstrap{ + Admin: &envoyAdmin{ + Address: envoyAddress{ + SocketAddress: envoySocketAddress{ + Address: "127.0.0.1", + PortValue: 9901, + }, + }, + }, + StaticResources: envoyStaticResources{}, + } + + path, err := writeEnvoyBootstrap(b) + require.NoError(t, err) + require.NotEmpty(t, path) + t.Cleanup(func() { _ = os.Remove(path) }) + + data, err := os.ReadFile(path) + require.NoError(t, err) + s := string(data) + + assert.Contains(t, s, "127.0.0.1", + "admin address must be loopback 127.0.0.1") + assert.NotContains(t, s, "0.0.0.0", + "admin address must NOT bind on 0.0.0.0") +} + +// TestGetEnvoyImage verifies that getEnvoyImage returns the default image when +// the override env var is unset and the override when it is set. +// NOTE: t.Setenv is used so t.Parallel() is intentionally omitted here — env +// mutations are global state and are incompatible with parallel execution. +func TestGetEnvoyImage(t *testing.T) { + tests := []struct { + name string + envValue string + wantImage string + wantEnvoy bool // true: assert the result contains "envoy" + }{ + { + name: "empty env returns non-empty default containing envoy", + envValue: "", + wantEnvoy: true, + }, + { + name: "custom image override is returned verbatim", + envValue: "my-custom-envoy:latest", + wantImage: "my-custom-envoy:latest", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Setenv("TOOLHIVE_ENVOY_IMAGE", tt.envValue) + + got := getEnvoyImage() + require.NotEmpty(t, got, "getEnvoyImage must never return an empty string") + + if tt.wantEnvoy { + assert.Contains(t, got, "envoy", + "default image must contain 'envoy' to be identifiable") + } + + if tt.wantImage != "" { + assert.Equal(t, tt.wantImage, got) + } + }) + } +} + +// TestEgressListenerHCMTypeURLs verifies that the egress listener serializes +// with correct protobuf @type URLs so Envoy can parse it. +func TestEgressListenerHCMTypeURLs(t *testing.T) { + t.Parallel() + + spec := proxySpec{ + WorkloadName: "myserver", + Permissions: &permissions.NetworkPermissions{ + Outbound: &permissions.OutboundNetworkPermissions{ + AllowHost: []string{"example.com"}, + }, + }, + AllowDockerGateway: false, + GatewayIP: dockerDefaultBridgeGatewayIP, + } + + listener := buildEgressListener(spec) + raw, err := json.Marshal(listener) + require.NoError(t, err) + s := string(raw) + + assert.Contains(t, s, typeHCM, "@type for HCM must be present in serialized JSON") + assert.Contains(t, s, typeHTTPRBAC, "@type for RBAC must be present in serialized JSON") + assert.Contains(t, s, typeDFPFilter, "@type for DFP filter must be present in serialized JSON") + assert.Contains(t, s, typeRouter, "@type for router must be present in serialized JSON") +} + +// TestEgressClusterTypeURL verifies that the egress cluster serializes with the +// correct DFP cluster @type URL. +func TestEgressClusterTypeURL(t *testing.T) { + t.Parallel() + + cluster := buildEgressCluster() + raw, err := json.Marshal(cluster) + require.NoError(t, err) + s := string(raw) + + assert.Contains(t, s, typeDFPCluster, + "@type for DFP cluster config must be present in serialized JSON") +} diff --git a/pkg/container/docker/networkproxy.go b/pkg/container/docker/networkproxy.go new file mode 100644 index 0000000000..baeb1b1147 --- /dev/null +++ b/pkg/container/docker/networkproxy.go @@ -0,0 +1,91 @@ +// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. +// SPDX-License-Identifier: Apache-2.0 + +package docker + +import ( + "context" + "fmt" + "os" + + "github.com/moby/moby/api/types/network" + + "github.com/stacklok/toolhive-core/permissions" +) + +// networkProxy is the single enforcement point for outbound allowlisting, +// docker-gateway blocking, and inbound reverse-proxying for isolated workloads. +// +// What it enables: all egress and ingress proxy containers are created through +// this interface, ensuring a consistent policy-enforcement seam that can be +// swapped at startup via TOOLHIVE_NETWORK_PROXY. +// +// What it does NOT solve: non-cooperative egress is contained by the +// Internal:true network blackhole created by createNetwork, not by this proxy. +// True L3/L4 traffic interception is a separate Phase 2 concern and is not +// addressed here. +type networkProxy interface { + // SetupProxies creates egress (and, for non-stdio transports, ingress) proxy + // containers for the workload described by spec. It returns a proxyResult + // containing the host-side ingress port (0 for stdio) and any environment + // variables that must be injected into the MCP container. + SetupProxies(ctx context.Context, spec proxySpec) (proxyResult, error) +} + +// proxySpec contains all the parameters needed to set up proxy containers for +// an isolated workload. +type proxySpec struct { + // WorkloadName is the base name of the MCP container (e.g. "myserver"). + WorkloadName string + // Permissions holds the network permission profile from the workload's + // permission profile, governing what outbound traffic is allowed. + Permissions *permissions.NetworkPermissions + // AllowDockerGateway, when true, skips the docker-gateway deny rules in the + // egress proxy configuration. + AllowDockerGateway bool + // GatewayIP is the Docker bridge gateway IP resolved at runtime. + GatewayIP string + // TransportType is the MCP transport in use (e.g. "stdio", "sse", + // "streamable-http"). A value of "stdio" suppresses ingress proxy creation. + TransportType string + // UpstreamPort is the container port the MCP server listens on. Ignored + // when TransportType is "stdio" or the value is 0. + UpstreamPort int + // AttachStdio controls whether the proxy containers attach stdio streams. + AttachStdio bool + // Endpoints is the set of network endpoints the proxy containers should + // join, keyed by network name. + Endpoints map[string]*network.EndpointSettings +} + +// proxyResult is the output of a successful SetupProxies call. +type proxyResult struct { + // IngressHostPort is the host-side port bound by the ingress proxy. It is + // 0 when no ingress proxy was created (stdio transport or UpstreamPort==0). + IngressHostPort int + // EnvVars contains environment variables that must be merged into the MCP + // container's environment (e.g. HTTP_PROXY, HTTPS_PROXY). + EnvVars map[string]string +} + +// newNetworkProxy reads the TOOLHIVE_NETWORK_PROXY environment variable and +// returns the corresponding networkProxy implementation. An empty value or +// "squid" selects the default squid-based proxy. Any other value returns an +// error so that misconfiguration is caught at startup. +func newNetworkProxy(c *Client) (networkProxy, error) { + val := os.Getenv("TOOLHIVE_NETWORK_PROXY") + switch val { + case "", "squid": + return &squidProxy{client: c}, nil + case "envoy": + return &envoyProxy{client: c}, nil + default: + return nil, fmt.Errorf("unknown TOOLHIVE_NETWORK_PROXY value %q: supported values are \"squid\" (default), \"envoy\"", val) + } +} + +// Compile-time assertion that squidProxy satisfies networkProxy. +var _ networkProxy = (*squidProxy)(nil) + +// Compile-time assertion that envoyProxy satisfies networkProxy. +var _ networkProxy = (*envoyProxy)(nil) diff --git a/pkg/container/docker/networkproxy_test.go b/pkg/container/docker/networkproxy_test.go new file mode 100644 index 0000000000..46c577dc35 --- /dev/null +++ b/pkg/container/docker/networkproxy_test.go @@ -0,0 +1,87 @@ +// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. +// SPDX-License-Identifier: Apache-2.0 + +package docker + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func TestNewNetworkProxy(t *testing.T) { + // t.Setenv is used in subtests so the outer test must NOT call t.Parallel(). + // The subtests run sequentially to avoid concurrent environment mutation. + + tests := []struct { + name string + envValue string + wantSquid bool + wantEnvoy bool + wantErr bool + errContains []string + }{ + { + name: "empty env returns squidProxy", + envValue: "", + wantSquid: true, + wantErr: false, + }, + { + name: "squid returns squidProxy", + envValue: "squid", + wantSquid: true, + wantErr: false, + }, + { + // After Stage 1 this case becomes a success path — envoy returns + // *envoyProxy, no error. Flip wantErr to false and add wantEnvoy + // assertion once envoy.go exists. + name: "envoy returns envoyProxy", + envValue: "envoy", + wantSquid: false, + wantEnvoy: true, + wantErr: false, + }, + { + name: "bogus value returns error", + envValue: "bogus", + wantSquid: false, + wantErr: true, + errContains: []string{"bogus"}, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + // t.Setenv is incompatible with t.Parallel() — env mutations are + // global state; subtests run sequentially within this parent test. + t.Setenv("TOOLHIVE_NETWORK_PROXY", tt.envValue) + + c := &Client{} + proxy, err := newNetworkProxy(c) + + if tt.wantErr { + require.Error(t, err) + for _, substr := range tt.errContains { + assert.Contains(t, err.Error(), substr) + } + return + } + + require.NoError(t, err) + require.NotNil(t, proxy) + + if tt.wantSquid { + _, ok := proxy.(*squidProxy) + assert.True(t, ok, "expected proxy to be *squidProxy, got %T", proxy) + } + + if tt.wantEnvoy { + _, ok := proxy.(*envoyProxy) + assert.True(t, ok, "expected proxy to be *envoyProxy, got %T", proxy) + } + }) + } +} diff --git a/pkg/container/docker/squid.go b/pkg/container/docker/squid.go index c23825e46b..63777b65b6 100644 --- a/pkg/container/docker/squid.go +++ b/pkg/container/docker/squid.go @@ -320,6 +320,40 @@ func getSquidImage() string { return defaultSquidImage } +// squidProxy is the default networkProxy implementation. It creates egress and +// ingress Squid-based proxy containers for isolated workloads. +type squidProxy struct { + client *Client +} + +// SetupProxies creates the egress Squid container and, for non-stdio transports, +// the ingress Squid container for the workload described by spec. +func (s *squidProxy) SetupProxies(ctx context.Context, spec proxySpec) (proxyResult, error) { + egressContainerName := fmt.Sprintf("%s-egress", spec.WorkloadName) + _, err := createEgressSquidContainer( + ctx, s.client, spec.WorkloadName, egressContainerName, + spec.AttachStdio, nil, spec.Endpoints, spec.Permissions, + spec.AllowDockerGateway, spec.GatewayIP, + ) + if err != nil { + return proxyResult{}, fmt.Errorf("failed to create egress container: %w", err) + } + + envVars := addEgressEnvVars(nil, egressContainerName) + + if spec.TransportType == "stdio" || spec.UpstreamPort == 0 { + return proxyResult{EnvVars: envVars}, nil + } + + ingressPort, err := s.client.setupIngressContainer( + ctx, spec.WorkloadName, spec.UpstreamPort, spec.AttachStdio, spec.Endpoints, spec.Permissions, + ) + if err != nil { + return proxyResult{}, err + } + return proxyResult{IngressHostPort: ingressPort, EnvVars: envVars}, nil +} + func createTempIngressSquidConf( serverHostname string, upstreamPort int,