Prefer workers with cached actor images during scheduling #276

Description

@han2ni3bal-pixel

Atelet maintains a node-local image cache, but the scheduler does not know which images are cached on each node. As a result, an actor may be assigned to a cold node and pull its images again, even when another eligible node already has them cached.

Image preparation happens synchronously during ResumeActor, so unnecessary pulls increase actor startup latency and registry traffic.

Proposal

  • Let atelet report its cached image digests to the control plane.
  • Store cache information by node.
  • After applying existing constraints, prefer available workers whose node has all required actor images cached.
  • Fall back to the current random selection when no cache hit is available.
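The reporting and storage steps could be sketched as a small per-node index on the control plane. This is only an illustration: `NodeImageCacheIndex`, `report`, `forget`, and `has_all` are hypothetical names, not existing atelet or scheduler APIs, and it assumes each report carries the node's full set of cached digests.

```python
class NodeImageCacheIndex:
    """Hypothetical control-plane view of the image digests cached on each node."""

    def __init__(self) -> None:
        self._by_node: dict[str, frozenset[str]] = {}

    def report(self, node: str, digests: set[str]) -> None:
        # Assumption: every report is a full snapshot of the node's cache,
        # so replacing the entry also drops digests that were evicted.
        self._by_node[node] = frozenset(digests)

    def forget(self, node: str) -> None:
        # Drop a node that has gone away; unknown nodes are ignored.
        self._by_node.pop(node, None)

    def has_all(self, node: str, required: set[str]) -> bool:
        # A node is a cache hit only if every required digest is present.
        # Nodes that never reported are treated as having an empty cache.
        return set(required) <= self._by_node.get(node, frozenset())
```

Full-snapshot reports keep the index simple at the cost of larger messages; incremental add/evict events would be an alternative if report size becomes a concern.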

Local snapshot availability remains a hard constraint. Image cache affinity is applied only among nodes that can restore the selected snapshot.

A general scoring framework is not required: use cache-hit preference with fallback.
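A minimal sketch of that selection order, assuming the candidate workers have already passed the existing constraints. `Worker`, `pick_worker`, and the parameter names are hypothetical and chosen for illustration; the real scheduler types may look quite different.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    node: str


def pick_worker(workers, snapshot_nodes, cached_by_node, required_images, rng=random):
    """Pick a worker: snapshot locality is mandatory, image cache hits are preferred.

    workers         -- available workers that already passed existing constraints
    snapshot_nodes  -- set of nodes that can restore the selected snapshot
    cached_by_node  -- mapping of node name to the set of cached image digests
    required_images -- set of image digests the actor needs
    """
    # Hard constraint: only nodes that hold the snapshot are eligible.
    eligible = [w for w in workers if w.node in snapshot_nodes]
    if not eligible:
        return None

    # Preference: nodes that already cache every required image.
    warm = [
        w for w in eligible
        if set(required_images) <= set(cached_by_node.get(w.node, ()))
    ]

    # Fallback: random choice among all eligible workers when nothing is warm.
    return rng.choice(warm or eligible)
```

Because the preference is a plain filter followed by the existing random choice, behavior is unchanged whenever no eligible node reports a full cache hit.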

Expected Outcome

Fewer unnecessary image pulls, and lower, more consistent actor resume latency.

Related Issues

This proposal complements those efforts by making worker selection aware of node-local image cache availability.
