Skip to content

feat(action-log): Outbox-based GroupActionLogEntry management#117836

Draft
kcons wants to merge 4 commits into
kcons/derivedfrom
kcons/derived-outbox
Draft

feat(action-log): Outbox-based GroupActionLogEntry management#117836
kcons wants to merge 4 commits into
kcons/derivedfrom
kcons/derived-outbox

Conversation

@kcons

@kcons kcons commented Jun 16, 2026

Copy link
Copy Markdown
Member

Use standard outbox infrastructure.
Better description TODO.

@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label Jun 16, 2026
category=OutboxCategory.GROUP_ACTION_LOG_EVENT,
# Could use a random int to avoid the DB round-trip at the cost of
# negligible (~1/2^63) collision risk within a shard.
object_identifier=CellOutbox.next_object_identifier(),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always find this pattern a little funny. Rather than adding a new incrementing counter to postgres for this, we just hop on to the primary key instead 😆

# (process_shard → transaction.atomic), so the GALE is not yet committed.
# Defer to on_commit so the GALE is visible to readers on other connections.
using = router.db_for_write(GroupActionLogEntry)
if p["force_async_derived"]:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting, is this just to prevent group log task processing or task queueing failures from tanking the outbox transaction?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the on_commit is, the choice between async or not is to show we can accomodate different latency/consistency tolerance trade-offs. It'd be too specific for what we have currently, but we may also want to do "don't let derived failures escape, if we have a db issue, schedule a task so we can move on and be eventually consistent".

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's definitely a concern. My thought is just have this enqueue the task and ignore the synchronous form of processing unless we have a really compelling use case for this, but this seems fine.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just failure isolation, it's also that GALE is expected to be on a different db from derived data, so pre-commit it's not really canonical. I'll add a comment here clarifying the justification.
Also, technically, scheduling task pre-commit could mean we run the task before the new entry is visible, which is broken, but our task system isn't low latency enough in prod for that to be a real risk.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem with task only is that we don't get the "mutating request gets mutated response" property without additional wiring. If we can get that by default (and we should able to), that's nice. If not, maybe we go default async and have "please make sure we're up-to-date" calls and/or "ensure derived data reflects changes in this block" contexts.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So if that's the case, couldn't we hoist this task creation or synchronous evaluation logic up out of receiver entirely? Like it could live issue/action_log/base.py and provide the same guarantee, unless I'm missing something that on_transaction is doing differently.

Either the outbox transaction and flush succeeded, which means we've exited the outbox_context decorator already, or one of the two failed, and we can't guarantee anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants