
Worker: memory management improvements#1366

Draft
josephjclark wants to merge 4 commits into main from worker-memory

Conversation

@josephjclark
Collaborator

Responding to a spate of OOM kills and lost runs: some experiments with memory management.

Fixes #897

AI Usage

Please disclose whether you've used AI anywhere in this PR (it's cool, we just
want to know!):

  • I have used Claude Code
  • I have used another model
  • I have not used AI

You can read more details in our
Responsible AI Policy

github-project-automation bot moved this to New Issues in Core on Apr 13, 2026
@josephjclark
Collaborator Author

josephjclark commented Apr 14, 2026

I don't understand why this is failing in tests 🤔

Running around in circles. Even in code where the JSON streaming function isn't called, it seems to fail.

I've also bumped into a possible issue with circular structures. It makes me nervous: shouldn't JSON.stringify throw if it hits a circular structure? That could be hurting us in production right now.

EDIT: Well, we do seem to handle circular structures - the runtime must be dealing with that before events are emitted:

{
  "x": {
    "x": "[Circular]"
  },
  "data": {}
}
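For reference: plain `JSON.stringify` does throw a `TypeError` on circular input, so something upstream must be substituting the `[Circular]` marker. A minimal sketch of how such a replacer could work (illustrative names only - this is not the actual runtime code):

```javascript
// Plain JSON.stringify throws on circular input; a replacer with a
// WeakSet of visited objects can substitute a "[Circular]" marker instead.
function safeStringify(value) {
  const seen = new WeakSet();
  return JSON.stringify(value, (key, val) => {
    if (typeof val === 'object' && val !== null) {
      if (seen.has(val)) return '[Circular]'; // already visited: break the cycle
      seen.add(val);
    }
    return val;
  });
}

const x = {};
x.x = x; // circular reference
const state = { x, data: {} };

// Without a replacer, stringify throws a TypeError:
let threw = false;
try {
  JSON.stringify(state);
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw); // true

console.log(safeStringify(state)); // {"x":{"x":"[Circular]"},"data":{}}
```

The output matches the shape seen above, which is consistent with the runtime sanitising state before events are emitted.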

@josephjclark
Collaborator Author

Managed to fix one of the failing tests, but still getting failures. Will have to look into this later. Claude was unable to help.

@josephjclark josephjclark changed the base branch from main to release/next April 24, 2026 12:35
@josephjclark
Collaborator Author

So nervous about this PR. It's a tiny change really, and it works just great locally. But my tests go berserk - random tests seem to fail to return basic events, for no good reason that I can see. It's very hard to debug because it's cross-process. Not sure.

@josephjclark
Collaborator Author

josephjclark commented Apr 24, 2026

If I stare at this any longer I'll go actually mad.

But I might have figured it out.

It's just the asynchronicity in the logging.

If you make the stringify algorithm async, it exhibits exactly the same behaviour: events stop being emitted. I thought it was a hidden exception in the async iterator.

But no - what I think is happening is that the workflow finishes while events are still being stringified for emission by the child process, and the child process gets killed as soon as the run is over.

We need a tracker somewhere which makes sure that all pending events are properly sent before shutdown.

This behaviour does make me wonder if we have an async hole in prod which is the underlying cause of some rare exits. Could the worker thread be getting closed before messages have been sent?

EDIT: I'm not 100% sold on this theory (just allowing the worker more time before closing doesn't help), but I'm sure it's something around the async message processing. This may also cause events to exit the worker out of sequence, so maybe we need a queue here too, just like in the main engine (same queue?)
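One possible shape for that tracker (purely a sketch with hypothetical names, not code from this branch): keep a set of in-flight serialisation promises and await them all before the child process is allowed to shut down.

```javascript
// Sketch: track in-flight async event serialisations so the process
// isn't torn down while events are still being stringified.
// All names here are hypothetical.
const pending = new Set();

// Wrap an async emit so its promise is tracked until it settles.
function trackEmit(promise) {
  pending.add(promise);
  promise.finally(() => pending.delete(promise));
  return promise;
}

// Call this before killing the child process / worker thread.
async function flushPendingEvents() {
  // Pending emits may themselves queue more emits, so loop until drained.
  while (pending.size > 0) {
    await Promise.allSettled([...pending]);
  }
}

// Example: a fake async stringify-and-emit.
const emitted = [];
function emitEvent(event) {
  return trackEmit(
    Promise.resolve().then(() => {
      emitted.push(JSON.stringify(event));
    })
  );
}

emitEvent({ type: 'log', message: 'hello' });
emitEvent({ type: 'complete' });

flushPendingEvents().then(() => {
  console.log(emitted.length); // 2: nothing lost to an early exit
});
```

Note that a set like this only guarantees completeness; a serial queue (like the one in the main engine) would additionally preserve event ordering.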

Base automatically changed from release/next to main April 27, 2026 09:33


Development

Successfully merging this pull request may close these issues.

Worker: Runs get lost if the dataclip is >150mb
