An initial implementation of this was added in #85, but that PR has not moved, so I am revisiting the idea in issue form to try to gather consensus.
Background
Today Substrate ingress is tightly coupled to the bundled atenet-router deployment:
- manifests/ate-install/atenet-router.yaml deploys one Substrate router container plus one proxy sidecar.
- cmd/atenet/internal/router owns proxy configuration, deployment reconciliation, external request processing, actor ID extraction, ResumeActor, and upstream rewrite behavior.
- The current HTTP path assumes the actor ID is encoded in Host as <actor-id>.actors.resources.substrate.ate.dev.
- The request is routed by resuming the actor through ate-api-server, taking the returned worker pod IP, and rewriting the request target to <worker-ip>:80.
This works for the bundled demo and development path, but it makes it hard to integrate Substrate with other gateway stacks, managed load balancers, service meshes, platform gateways, or custom infrastructure.
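To make the Host-based convention concrete, here is a minimal sketch of actor ID extraction. The function name ActorIDFromHost and the error ErrNotActorHost are hypothetical and are not taken from cmd/atenet/internal/router. The sketch also assumes that an actor ID is a single DNS label and that the suffix is compared case-insensitively.

```go
package main

import (
	"errors"
	"net"
	"strings"
)

// actorHostSuffix is the domain suffix from the Host convention described above.
const actorHostSuffix = ".actors.resources.substrate.ate.dev"

// ErrNotActorHost is a hypothetical sentinel error for hosts that do not
// follow the <actor-id>.actors.resources.substrate.ate.dev convention.
var ErrNotActorHost = errors.New("host does not identify an actor")

// ActorIDFromHost returns the <actor-id> label from a Host header value.
// Assumption: the actor ID is exactly one DNS label.
func ActorIDFromHost(host string) (string, error) {
	// Strip an optional ":port"; SplitHostPort fails when no port is present,
	// in which case the value is used unchanged.
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	// Tolerate a fully qualified name with a trailing dot.
	host = strings.TrimSuffix(host, ".")

	n := len(host) - len(actorHostSuffix)
	if n <= 0 || !strings.EqualFold(host[n:], actorHostSuffix) {
		return "", ErrNotActorHost
	}
	id := host[:n]
	if strings.Contains(id, ".") {
		return "", ErrNotActorHost
	}
	return id, nil
}
```

Under these assumptions, a call such as ActorIDFromHost("abc123.actors.resources.substrate.ate.dev:8080") yields "abc123", while an unrelated host such as "example.com" yields ErrNotActorHost.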
Problem
Substrate's actor activation and routing decision logic is valuable independently from the current bundled proxy deployment. Operators should be able to choose how traffic enters while reusing the same Substrate semantics:
- Extract actor identity from a request.
- Resume or locate the actor.
- Return a routable backend endpoint.
- Preserve observability, status, errors, and security controls.
- Avoid putting Kubernetes control-plane convergence on the request path.
Right now these concerns are mixed together in cmd/atenet/internal/router, so replacing the ingress layer means replacing or forking core routing logic.
Proposal
Introduce a pluggable ingress/router boundary for Substrate.
At a high level, split the current router into:
- Substrate routing core
  - request metadata -> actor ID
  - actor ID -> ResumeActor
  - actor status/backend validation
  - route result / error mapping
  - metrics and tracing
- Ingress adapters
  - current bundled proxy adapter
  - adapters for any gateway/proxy that can delegate request processing to Substrate
  - ExtProc-compatible integrations
  - possible future generic gRPC/HTTP route-resolution APIs for integrations that do not use ExtProc
The current implementation should remain the default install path for now, but it should be treated as one adapter, not as the architecture. The design should not assume any specific gateway implementation. In particular, it should avoid baking any specific proxy's xDS, filter-chain, deployment, or configuration concepts into the Substrate routing API.
Substrate's stable contract should be about actor-aware routing:
- Extract actor identity from request metadata.
- Resume or locate the actor.
- Return a routable backend endpoint or a client-safe error.
- Preserve low-latency routing, observability, and policy hooks.
This routing contract should also align with the transparent egress/tunneling model detailed in #126. In that model, actor egress may be captured by a trusted runtime component and sent to a policy enforcement point (PEP) with signed actor identity and original-destination metadata. The PEP should be able to enforce policy without learning how to directly resume or locate Substrate actors.
For actor-to-actor traffic, the egress path should be able to route allowed requests back through Substrate ingress using the target actor authority. Substrate ingress should continue to own target actor lookup, resume, and worker routing. The PEP should only need to preserve enough target identity for ingress to do its job.
Non-goals
- Do not make Substrate depend on a specific ingress controller or gateway project.
- Do not design around a specific gateway implementation.
- Do not make Gateway API support imply a specific Gateway implementation.
- Do not move mutable ingress policy into immutable ActorTemplate.spec unless no better API boundary exists.
Prior art: KServe and Gateway API Inference Extension
KServe's generative inference path uses Gateway API and Gateway API Inference Extension as part of the LLM serving stack. The relevant pattern is not the model-serving details themselves, but the separation of concerns:
- Gateway API provides the common ingress and routing resource model.
- A domain-specific extension adds workload-aware routing behavior.
- A separate endpoint picker/router component makes specialized backend decisions.
- ExtProc is used as the protocol between the gateway data plane and that external decision component.
Gateway API Inference Extension's Endpoint Picker Protocol is especially relevant. The Endpoint Picker is responsible for choosing an endpoint from an InferencePool, and the protocol is explicitly defined between the data plane and the picker. The picker implements Envoy's external processing service protocol and returns routing decisions or immediate responses.
This is analogous to what Substrate needs, with different domain semantics:
- Inference Extension: request metadata -> model/inference context -> selected model-serving endpoint.
- Substrate ingress: request metadata -> actor identity -> resume/locate actor -> selected worker endpoint or denial.
The useful lesson is that ExtProc can be a practical first integration protocol for specialized, request-time routing intelligence while the higher-level API remains domain-specific and Kubernetes-native. For Substrate, the domain-specific contract should be actor-aware routing and activation, not inference scheduling.
Possible design
Internal routing interface
Define an internal routing interface independent of any proxy:
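    import "context"

    // RouteRequest carries the request metadata the routing core needs.
    type RouteRequest struct {
        Host    string
        Path    string
        Method  string
        Headers map[string]string
    }

    // RouteResult describes the actor and backend selected for a request.
    type RouteResult struct {
        ActorID                string
        Target                 string // host:port or structured endpoint
        ActorTemplateNamespace string
        ActorTemplateName      string
    }

    // RouteResolver maps request metadata to a routable backend.
    type RouteResolver interface {
        Resolve(ctx context.Context, req RouteRequest) (*RouteResult, error)
    }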
The existing request-processing server would become one caller of this interface instead of owning actor-aware routing directly.
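As a rough illustration of how a routing core could implement this interface without knowing about any proxy, the sketch below wires actor ID extraction to a resume call. ActorResumer, hostResolver, staticResumer, and resolveDemo are hypothetical names introduced only for this example. The real ResumeActor call to ate-api-server may have a different signature, and the fixed port 80 simply mirrors the current <worker-ip>:80 rewrite.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"strings"
)

type RouteRequest struct {
	Host    string
	Path    string
	Method  string
	Headers map[string]string
}

type RouteResult struct {
	ActorID                string
	Target                 string // host:port
	ActorTemplateNamespace string
	ActorTemplateName      string
}

type RouteResolver interface {
	Resolve(ctx context.Context, req RouteRequest) (*RouteResult, error)
}

// ActorResumer is a hypothetical abstraction over the ResumeActor call.
// It returns the worker pod IP that currently hosts the actor.
type ActorResumer interface {
	ResumeActor(ctx context.Context, actorID string) (workerIP string, err error)
}

// hostResolver resolves routes from the Host convention and a resumer.
type hostResolver struct {
	suffix  string
	resumer ActorResumer
}

// Compile-time check that hostResolver satisfies RouteResolver.
var _ RouteResolver = (*hostResolver)(nil)

func (r *hostResolver) Resolve(ctx context.Context, req RouteRequest) (*RouteResult, error) {
	id := strings.TrimSuffix(req.Host, r.suffix)
	if id == req.Host || id == "" {
		return nil, errors.New("host does not identify an actor")
	}
	ip, err := r.resumer.ResumeActor(ctx, id)
	if err != nil {
		return nil, fmt.Errorf("resume actor %q: %w", id, err)
	}
	return &RouteResult{ActorID: id, Target: net.JoinHostPort(ip, "80")}, nil
}

// staticResumer is an in-memory stand-in for ate-api-server, for illustration.
type staticResumer map[string]string

func (s staticResumer) ResumeActor(_ context.Context, id string) (string, error) {
	ip, ok := s[id]
	if !ok {
		return "", errors.New("actor not found")
	}
	return ip, nil
}

// resolveDemo shows how an adapter would call the resolver for one request.
func resolveDemo(host string) (*RouteResult, error) {
	r := &hostResolver{
		suffix:  ".actors.resources.substrate.ate.dev",
		resumer: staticResumer{"abc123": "10.0.0.7"},
	}
	return r.Resolve(context.Background(), RouteRequest{Host: host, Path: "/", Method: "GET"})
}
```

In this arrangement an ingress adapter only translates its own request representation into a RouteRequest and applies the returned Target. It never calls ResumeActor itself.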
External processing and route-resolution API
ExtProc is a reasonable first external integration protocol because it is already close to the current implementation and has prior art in adjacent Gateway API extension work. For example, Gateway API Inference Extension implementations commonly use external processing to inspect and mutate requests before forwarding them to model-serving backends.
Substrate can use ExtProc without making the design proxy-specific if the Substrate-owned semantic contract remains explicit:
- what request metadata Substrate needs,
- how actor identity is extracted,
- how trusted source identity is represented when present,
- what backend or denial result is returned,
- how errors map to downstream responses.
ExtProc can then be one wire protocol for that contract, not the only way to express the contract.
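One way to keep the error-mapping part of that contract explicit is a small table from routing errors to client-safe downstream responses, which any adapter (ExtProc or otherwise) could reuse. The sketch below is only an illustration: the sentinel errors, the DownstreamResponse type, and the chosen status codes are assumptions, not existing Substrate behavior.

```go
package main

import (
	"errors"
	"net/http"
)

// Hypothetical sentinel errors a routing core might return.
var (
	ErrNoActorIdentity  = errors.New("no actor identity in request")
	ErrActorNotFound    = errors.New("actor not found")
	ErrActorUnavailable = errors.New("actor could not be resumed")
	ErrDenied           = errors.New("request denied by policy")
)

// DownstreamResponse is the client-safe result an adapter sends back.
type DownstreamResponse struct {
	Status  int
	Message string
}

// MapRouteError converts a non-nil routing error into a downstream response.
// Internal details are never copied into Message; unknown errors become 500.
func MapRouteError(err error) DownstreamResponse {
	switch {
	case errors.Is(err, ErrNoActorIdentity):
		return DownstreamResponse{http.StatusBadRequest, "request does not identify an actor"}
	case errors.Is(err, ErrActorNotFound):
		return DownstreamResponse{http.StatusNotFound, "actor not found"}
	case errors.Is(err, ErrDenied):
		return DownstreamResponse{http.StatusForbidden, "request denied"}
	case errors.Is(err, ErrActorUnavailable):
		return DownstreamResponse{http.StatusServiceUnavailable, "actor temporarily unavailable"}
	default:
		return DownstreamResponse{http.StatusInternalServerError, "internal routing error"}
	}
}
```

Because the mapping uses errors.Is, a routing core can wrap these sentinels with extra context for logs and traces while adapters still return the same client-safe response.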
Optionally, Substrate can also expose a small Substrate-owned API for non-Go and non-bundled integrations that do not want to speak ExtProc:
    service SubstrateRouter {
      rpc ResolveRoute(ResolveRouteRequest) returns (ResolveRouteResponse);
    }
This would allow an external ingress controller, gateway, service mesh extension, or platform proxy to call Substrate without importing Go packages or depending on one proxy-specific control-plane model.
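For discussion purposes, the sketch below shows what the request and response shapes of such an API might look like, using an HTTP/JSON transport instead of gRPC so that it runs with only the Go standard library. Everything here is hypothetical: the field names, the JSON keys, the /v1/resolve path, and the resolveFunc hook are placeholders and not a proposed wire format.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"strings"
)

// ResolveRouteRequest mirrors the request metadata Substrate needs.
type ResolveRouteRequest struct {
	Host    string            `json:"host"`
	Path    string            `json:"path"`
	Method  string            `json:"method"`
	Headers map[string]string `json:"headers,omitempty"`
}

// ResolveRouteResponse carries either a backend or a client-safe error.
type ResolveRouteResponse struct {
	ActorID string `json:"actorId,omitempty"`
	Target  string `json:"target,omitempty"`
	Error   string `json:"error,omitempty"`
}

// resolveFunc is a hypothetical hook into the routing core. It returns the
// response body and the HTTP status to send with it.
type resolveFunc func(req ResolveRouteRequest) (ResolveRouteResponse, int)

// resolveRouteHandler exposes the routing core over HTTP/JSON.
func resolveRouteHandler(resolve resolveFunc) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var req ResolveRouteRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "malformed request", http.StatusBadRequest)
			return
		}
		resp, status := resolve(req)
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(status)
		_ = json.NewEncoder(w).Encode(resp)
	})
}

// demoResolveRoute exercises the handler in-process with a stub routing core.
func demoResolveRoute(body string) (int, ResolveRouteResponse) {
	h := resolveRouteHandler(func(req ResolveRouteRequest) (ResolveRouteResponse, int) {
		if req.Host == "" {
			return ResolveRouteResponse{Error: "missing host"}, http.StatusBadRequest
		}
		return ResolveRouteResponse{ActorID: "abc123", Target: "10.0.0.7:80"}, http.StatusOK
	})
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/v1/resolve", strings.NewReader(body)))
	var out ResolveRouteResponse
	_ = json.NewDecoder(rec.Body).Decode(&out) // plain-text error bodies leave out empty
	return rec.Code, out
}
```

A caller in any language would then only need to POST request metadata and read back a target or an error, which is the property the ResolveRoute RPC above is meant to provide.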
The important design rule is that ExtProc should be treated as an adapter protocol, while the Substrate route-resolution semantics remain gateway-neutral.