Terminator calls system_shutdown immediately when parent goes down, ignoring shutdown_timeout for placed children #86
Description
We're using FLAME to persist WebRTC call-routing processes across deployments. The idea is to run packet-routing processes on FLAME runner nodes via place_child with link: false, so that when the parent node restarts during a deploy, the runner and its placed children survive until the new parent can reconnect.
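A minimal sketch of the kind of setup described above, for illustration only. The module names `MyApp.CallRunner` and `MyApp.PacketRouter`, the call ID, and the specific pool values are hypothetical; only `FLAME.Pool`, `FLAME.place_child/3`, and the `:shutdown_timeout` and `:link` options come from FLAME itself.

```elixir
# Hypothetical names: MyApp.CallRunner (the pool) and MyApp.PacketRouter
# (a GenServer that routes packets for one call). Values are illustrative.

# In the parent application's supervision tree:
children = [
  {FLAME.Pool,
   name: MyApp.CallRunner,
   min: 0,
   max: 10,
   max_concurrency: 100,
   # The grace period we expected placed children to get after the parent goes down.
   shutdown_timeout: :timer.minutes(5)}
]

Supervisor.start_link(children, strategy: :one_for_one)

# Start a router on a runner without linking it to the calling process,
# so that it is not taken down together with the caller.
{:ok, _router_pid} =
  FLAME.place_child(
    MyApp.CallRunner,
    {MyApp.PacketRouter, call_id: "call-123"},
    link: false
  )
```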
The place_child docs say:
:link - Whether the caller should be linked to the remote child process to prevent long-running orphaned resources. Defaults to true. Set to false to support long-running work that you want to complete within the :shutdown_timeout of the remote runner, even when the parent process or node is terminated.
However, when the parent node goes down, FLAME.Terminator calls system_stop/2 immediately. It does not wait for shutdown_timeout before initiating shutdown:
Line 201 in 27b94da:
    {:noreply, system_stop(state, message)}
The shutdown_timeout seems to apply only in terminate/2, where it is used to drain in-flight RPC calls; it doesn't delay the system_shutdown() call itself. So placed children with link: false get no grace period to finish their work: the runner starts terminating as soon as the parent goes down.
Is this the intended behavior? We read the link: false docs as implying placed children would get up to shutdown_timeout to complete their work when the parent goes away.
Should the Terminator delay calling system_stop() to honor that, or is there a different mechanism we should be using for this use case?
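To make the question concrete, here is a rough sketch of the behavior we expected, written as a standalone GenServer. It is not FLAME's Terminator and does not use any FLAME API: the module name `DelayedStopSketch` and its options are hypothetical, and the GenServer stopping stands in for the point at which system_stop would be called.

```elixir
defmodule DelayedStopSketch do
  # Hypothetical module, not part of FLAME. When the monitored parent goes
  # down, it waits until all tracked children have exited or until
  # shutdown_timeout has elapsed, whichever comes first, and only then stops.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    parent_ref = Process.monitor(Keyword.fetch!(opts, :parent))

    # Track each placed child by its monitor reference.
    children =
      opts
      |> Keyword.get(:children, [])
      |> Map.new(fn pid -> {Process.monitor(pid), pid} end)

    {:ok,
     %{
       parent_ref: parent_ref,
       children: children,
       shutdown_timeout: Keyword.get(opts, :shutdown_timeout, 30_000),
       draining?: false
     }}
  end

  @impl true
  # The parent went down: stop right away only if nothing is left to wait for.
  def handle_info({:DOWN, ref, :process, _pid, _reason}, %{parent_ref: ref} = state) do
    if map_size(state.children) == 0 do
      {:stop, :normal, state}
    else
      Process.send_after(self(), :shutdown_deadline, state.shutdown_timeout)
      {:noreply, %{state | draining?: true}}
    end
  end

  # A tracked child exited: stop early once the last one is gone while draining.
  def handle_info({:DOWN, ref, :process, _pid, _reason}, state) do
    state = %{state | children: Map.delete(state.children, ref)}

    if state.draining? and map_size(state.children) == 0 do
      {:stop, :normal, state}
    else
      {:noreply, state}
    end
  end

  # The grace period ran out: stop even though children are still running.
  def handle_info(:shutdown_deadline, state), do: {:stop, :normal, state}
end
```

With this shape, a placed child gets at most shutdown_timeout to finish after the parent disappears, and the runner still shuts down promptly when there is nothing left to wait for.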
Environment
- FLAME: v0.5.3
- Backend: flame_k8s_backend v0.5.7