WorkflowReconciled fails with DNS resolution error when operator notifies runner pod #1501

@ambient-code

Category: bug / infrastructure

Summary

The operator fails to deliver workflow files to the runner pod due to a DNS resolution failure when looking up the runner pod's internal cluster hostname. The runner pod starts successfully (RunnerStarted: True) but the workflow is never loaded because the operator cannot reach it.

Error

Failed to notify runner: Post "http://session-dep-bump-agentic-sandbox-repo-1777982576.ols-shared.svc.cluster.local:8001/workflow": dial tcp: lookup session-dep-bump-agentic-sandbox-repo-1777982576.ols-shared.svc.cluster.local on 172.30.0.10:53: no such host
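
To see exactly which name the operator asked cluster DNS for, the URL in the error can be split into its parts. The following is a minimal Python sketch for triage, not the operator's own code; the helper name split_service_url is hypothetical, and it assumes the host follows the <service>.<namespace>.svc.cluster.local form seen in the error.

```python
from urllib.parse import urlsplit

# Suffix taken from the hostname in the error message above.
CLUSTER_SUFFIX = ".svc.cluster.local"


def split_service_url(url):
    """Split a cluster Service URL into (service, namespace, port, path).

    Hypothetical helper for triage; raises ValueError if the host does not
    look like <service>.<namespace>.svc.cluster.local.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(CLUSTER_SUFFIX):
        raise ValueError(f"not a cluster Service hostname: {host!r}")
    # What remains after removing the suffix should be "<service>.<namespace>".
    service, _, namespace = host[: -len(CLUSTER_SUFFIX)].partition(".")
    if not service or not namespace or "." in namespace:
        raise ValueError(f"unexpected hostname shape: {host!r}")
    return service, namespace, parts.port, parts.path


url = (
    "http://session-dep-bump-agentic-sandbox-repo-1777982576"
    ".ols-shared.svc.cluster.local:8001/workflow"
)
print(split_service_url(url))
# ('session-dep-bump-agentic-sandbox-repo-1777982576', 'ols-shared', 8001, '/workflow')
```

The service name and namespace it returns can then be compared against the output of kubectl get svc -n ols-shared to check whether a Service with that exact name exists.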

Condition State

WorkflowReconciled:
  Status: False
  Reason: UpdateFailed
  Message: Failed to notify runner: <DNS error above>

RunnerStarted: True

Steps to Reproduce

  1. Create a session with a workflow configuration
  2. Session runner pod starts successfully
  3. Operator attempts to POST workflow files to the runner at its internal cluster hostname
  4. Result: DNS lookup for the runner pod hostname fails with no such host
  5. Workflow files are never loaded into the workspace; session is broken
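
The lookup in step 4 can be reproduced outside the operator. The sketch below is a Python illustration, not the operator's code, and it has to run inside the cluster (for example, from a pod in the same cluster), because svc.cluster.local names resolve only through cluster DNS. The function name check_resolution and its resolver parameter are hypothetical; the parameter exists only so the lookup can be replaced in tests.

```python
import socket


def check_resolution(host, port, resolver=socket.getaddrinfo):
    """Return (True, sorted addresses) if host resolves, else (False, error text)."""
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Python raises gaierror where the operator's log reports "no such host".
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return True, sorted({info[4][0] for info in infos})


# Example call, using the hostname and port from the error in this issue:
# check_resolution(
#     "session-dep-bump-agentic-sandbox-repo-1777982576.ols-shared.svc.cluster.local",
#     8001,
# )
```

A result of (False, ...) while the runner pod is running reproduces the failure independently of the operator.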

Expected Behavior

Operator should successfully resolve the runner pod hostname and deliver workflow files after RunnerStarted is True.

Actual Behavior

  • Runner pod is running and healthy
  • Operator cannot resolve the internal cluster DNS name for the runner pod
  • WorkflowReconciled condition stays False with UpdateFailed
  • Session is left in a broken state — workflow never loads

Notes

  • DNS server is 172.30.0.10:53 (cluster DNS)
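  • The hostname pattern is session-{name}.{namespace}.svc.cluster.local, which is the form of a Kubernetes Service DNS name, so it resolves only if a Service named session-{name} exists in that namespace. There may be a mismatch between the service name the operator constructs and the actual service created for the pod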
  • This is a platform/infrastructure issue, not a workflow configuration issue
  • May be a timing issue (operator tries to notify before the service/DNS entry is ready) or a naming mismatch
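
One way to tell the two hypotheses apart is to retry the lookup with backoff. If the name starts resolving after a few attempts, timing is the more likely cause; if it never resolves, a naming mismatch is more likely. The following is a minimal Python sketch under those assumptions, not the operator's implementation. The function name wait_for_dns, its defaults, and the injectable resolver and sleep parameters are all hypothetical.

```python
import socket
import time


def wait_for_dns(host, port, attempts=5, base_delay=0.5,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Retry a DNS lookup with exponential backoff.

    Returns True as soon as the name resolves, or False if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            resolver(host, port)
            return True
        except socket.gaierror:
            # Sleep between attempts only, not after the final failure.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return False
```

The same pattern could be applied before the POST to /workflow, so that a Service whose DNS entry appears shortly after RunnerStarted becomes True does not leave WorkflowReconciled stuck at UpdateFailed.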

Filed via Amber Interview
