Category: bug / infrastructure
Summary
The operator fails to deliver workflow files to the runner pod due to a DNS resolution failure when looking up the runner pod's internal cluster hostname. The runner pod starts successfully (RunnerStarted: True) but the workflow is never loaded because the operator cannot reach it.
Error
Failed to notify runner: Post "http://session-dep-bump-agentic-sandbox-repo-1777982576.ols-shared.svc.cluster.local:8001/workflow": dial tcp: lookup session-dep-bump-agentic-sandbox-repo-1777982576.ols-shared.svc.cluster.local on 172.30.0.10:53: no such host
Condition State
WorkflowReconciled:
Status: False
Reason: UpdateFailed
Message: Failed to notify runner: <DNS error above>
RunnerStarted: True
Steps to Reproduce
- Create a session with a workflow configuration
- Session runner pod starts successfully
- Operator attempts to POST workflow files to the runner at its internal cluster hostname
- Result: DNS lookup for the runner pod hostname fails with "no such host"
- Workflow files are never loaded into the workspace; session is broken
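To check the failure independently of the operator, the lookup from the steps above can be repeated from any pod in the same cluster. The following is a minimal sketch in Python, not the operator's own code. The helper names `service_hostname` and `can_resolve` are hypothetical, and the split of the hostname into a session name and a namespace is an assumption based on the pattern noted in this report.

```python
import socket


def service_hostname(session_name: str, namespace: str) -> str:
    """Build the in-cluster hostname using the pattern from this report (assumed)."""
    return f"session-{session_name}.{namespace}.svc.cluster.local"


def can_resolve(hostname: str, port: int = 8001) -> bool:
    """Return True if the resolver finds the hostname, False on a resolver error."""
    try:
        socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
        return True
    except socket.gaierror:
        # Covers "no such host" as well as temporary resolver failures.
        return False


if __name__ == "__main__":
    # Values taken from the error message in this report.
    host = service_hostname("dep-bump-agentic-sandbox-repo-1777982576", "ols-shared")
    print(host, "resolves" if can_resolve(host) else "does not resolve")
```

If the name does not resolve from a second pod either, the problem is more likely a missing or differently named service than something specific to the operator's network path.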
Expected Behavior
Operator should successfully resolve the runner pod hostname and deliver workflow files after RunnerStarted is True.
Actual Behavior
- Runner pod is running and healthy
- Operator cannot resolve the internal cluster DNS name for the runner pod
- WorkflowReconciled condition stays False with UpdateFailed
- Session is left in a broken state — workflow never loads
Notes
- DNS server is 172.30.0.10:53 (cluster DNS)
- The hostname pattern is session-{name}.{namespace}.svc.cluster.local; the service name the operator constructs may not match the actual service created for the pod
- This is a platform/infrastructure issue, not a workflow configuration issue
- May be a timing issue (operator tries to notify before the service/DNS entry is ready) or a naming mismatch
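If the timing hypothesis holds, one possible mitigation is to poll for the DNS entry with backoff before notifying the runner. The sketch below is written under that assumption and is not the operator's implementation. `wait_for_dns` and its parameters are hypothetical, and the resolver is passed in as a function so the retry logic can be exercised without a cluster.

```python
import time
from typing import Callable


def wait_for_dns(
    hostname: str,
    resolve: Callable[[str], bool],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `resolve` with exponential backoff until it succeeds or attempts run out."""
    for attempt in range(attempts):
        if resolve(hostname):
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return False
```

A `False` result after all attempts would point away from timing and toward the naming mismatch, since a service that is merely slow to register should eventually resolve.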
Filed via Amber Interview