Terminator calls system_shutdown immediately when parent goes down, ignoring shutdown_timeout for placed children #86

@samrat

Description

We're using FLAME to persist WebRTC call-routing processes across deployments. The idea is to run packet-routing processes on FLAME runner nodes via place_child with link: false, so that when the parent node restarts during a deploy, the runner and its placed children survive until the new parent can reconnect.
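A minimal sketch of the setup, for context. The module names (MyApp.CallRunner, MyApp.PacketRouter), the call_id variable, and all numeric values are hypothetical placeholders, and it assumes :shutdown_timeout is configured on the pool:

```elixir
# Supervision tree entry for the pool (names and values are illustrative only).
children = [
  {FLAME.Pool,
   name: MyApp.CallRunner,
   min: 0,
   max: 10,
   max_concurrency: 100,
   # Assumption: this is the grace period we expect placed children to get.
   shutdown_timeout: :timer.minutes(5)}
]

# When a call starts, place the routing process on a runner without
# linking it to the caller, so it is not killed when the caller exits.
{:ok, pid} =
  FLAME.place_child(
    MyApp.CallRunner,
    {MyApp.PacketRouter, call_id: call_id},
    link: false
  )
```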

The place_child docs say:

:link – Whether the caller should be linked to the remote child process to prevent long-running orphaned resources. Defaults to true. Set to false to support long-running work that you want to complete within the :shutdown_timeout of the remote runner, even when the parent process or node is terminated.

However, when the parent node goes down, FLAME.Terminator calls system_stop/2 immediately. It does not wait for shutdown_timeout before initiating shutdown:

{:noreply, system_stop(state, message)}

The shutdown_timeout seems to apply only in terminate/2, where it is used to drain in-flight RPC calls; it doesn't delay the system_shutdown() call itself. So placed children with link: false get no grace period to finish their work: the runner starts terminating as soon as the parent is detected as down.

Is this the intended behavior? We read the link: false docs as implying placed children would get up to shutdown_timeout to complete their work when the parent goes away.

Should the Terminator delay calling system_stop() to honor that, or is there a different mechanism we should be using for this use case?
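To make the question concrete, here is a rough sketch of the delayed behavior we have in mind. It is a standalone GenServer, not FLAME.Terminator's actual code; the module name GraceSketch and the options :parent, :grace_ms, and :stop_fun are all hypothetical.

```elixir
defmodule GraceSketch do
  # Hypothetical illustration only: monitors a parent process and, when it
  # goes down, waits a grace period before running the stop function.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      parent_ref: Process.monitor(Keyword.fetch!(opts, :parent)),
      grace_ms: Keyword.get(opts, :grace_ms, 30_000),
      # On a real runner this would stop the node; injectable for testing.
      stop_fun: Keyword.get(opts, :stop_fun, fn -> System.stop() end)
    }

    {:ok, state}
  end

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, _reason}, %{parent_ref: ref} = state) do
    # Parent is gone: schedule the stop instead of stopping immediately.
    Process.send_after(self(), :grace_elapsed, state.grace_ms)
    {:noreply, state}
  end

  def handle_info(:grace_elapsed, state) do
    # Grace period is over: run the stop function and exit normally.
    state.stop_fun.()
    {:stop, :normal, state}
  end

  def handle_info(_other, state), do: {:noreply, state}
end
```

In this sketch, children placed with link: false would keep running for up to grace_ms after the parent disappears. A fuller version would presumably also stop early once no placed children remain, and cancel the timer if a new parent reconnects.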

Environment

  • FLAME: v0.5.3
  • Backend: flame_k8s_backend v0.5.7
