Conversation

@fmterrorf

In conjunction with commanded/commanded#569, this PR adds support for batching Ecto projections.
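
For illustration, a batch projector built on this might look roughly like the sketch below (the `project_batch` callback shape and the `Deposit` schema are assumptions on my part, not the final API):

```elixir
# Hypothetical sketch only: the project_batch signature and the Deposit schema
# are assumed for illustration and are not taken from this PR.
defmodule MyApp.DepositsProjector do
  use Commanded.Projections.Ecto,
    application: MyApp.CommandedApp,
    repo: MyApp.Repo,
    name: "deposits",
    batch_size: 100

  project_batch fn multi, events ->
    # Fold the whole batch into one Ecto.Multi so every projected row commits
    # in a single transaction along with the projection version bookkeeping.
    events
    |> Enum.with_index()
    |> Enum.reduce(multi, fn {event, index}, multi ->
      Ecto.Multi.insert(multi, {:deposit, index}, %MyApp.Deposit{
        account_id: event.data.account_id,
        amount: event.data.amount
      })
    end)
  end
end
```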

nikkocampbell and others added 6 commits September 18, 2024 15:43
Added tests and relevant fixes to batch projector

added after update callback test

Update dependencies

docs

temporary commanded version

Skip over partial seen batch

remove elixir_uuid after rebase

Return error on partially seen batch again
@fmterrorf marked this pull request as ready for review September 20, 2024 21:42
@cdegroot

Note: review can wait until Calmwave has dogfooded this for a bit.

Contributor

@drteeth left a comment

Thanks for the work on this feature. It looks great, well done.

As this PR relies on upstream changes to Commanded, it will have to wait until that work is merged. See commanded/commanded#569

Before calling the PR done, the reference to the Calmwave fork would of course need to be dropped.

@anderslemke

I'm curious what the status of this is. What has the experience at Calmwave been, @cdegroot?

Also, I have a question:

Let's say I set batch size to 100.
Now, let's assume that the projection is up to date, and a new event is added.
Will I get the call to project_batch right away, or will it wait until we have another 99 events ready for the batch?

@yordis
Contributor

yordis commented Nov 18, 2025

@anderslemke The subscription flushes on timeout (milliseconds) if fewer events are available, so batch_size: 100 won't wait for 99 more events.

Contributor

@yordis left a comment

I noticed a few issues with partial batch handling:

  1. No event filtering - When a batch contains already-seen events (e.g., events [2,3,4] when watermark is 2), the user projection receives all events instead of just [3,4]. This can cause duplicate projections.

  2. No locking - The watermark check and update aren't atomic. Two concurrent batches could both pass the check and cause race conditions.

  3. Test verification - The test at lines 74-91 should verify that "e4" was actually projected, not just that the call succeeded.

The core issue is that apply(multi_fn, [multi]) at line 152 passes the multi but not the filtered events, so the user's lambda processes the original batch from closure scope.

A robust approach would be: lock → filter unseen events → update watermark → pass only unseen events to user projection.
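
As a rough sketch of that flow (assuming a `projection_versions` table with a `last_seen_event_number` column, events that expose `event_number`, and a `multi_fn` that accepts the filtered events; these names are illustrative, not the code in this PR):

```elixir
defmodule MyApp.BatchProjectionSketch do
  import Ecto.Query

  # Illustrative only: table/column names and the two-argument multi_fn
  # are assumptions, not the actual implementation in this PR.
  def project_filtered_batch(multi, projection_name, events, multi_fn) do
    multi
    # 1. Lock the watermark row so concurrent batches serialize on it.
    |> Ecto.Multi.run(:watermark, fn repo, _changes ->
      last_seen =
        repo.one(
          from pv in "projection_versions",
            where: pv.projection_name == ^projection_name,
            lock: "FOR UPDATE",
            select: pv.last_seen_event_number
        ) || 0

      {:ok, last_seen}
    end)
    # 2. Keep only events beyond the watermark, dropping already-seen ones.
    |> Ecto.Multi.run(:unseen, fn _repo, %{watermark: last_seen} ->
      {:ok, Enum.filter(events, &(&1.event_number > last_seen))}
    end)
    # 3. Advance the watermark to the last event number in this batch.
    |> Ecto.Multi.run(:advance_watermark, fn repo, _changes ->
      last = events |> List.last() |> Map.fetch!(:event_number)

      {_count, _} =
        repo.update_all(
          from(pv in "projection_versions", where: pv.projection_name == ^projection_name),
          set: [last_seen_event_number: last]
        )

      {:ok, last}
    end)
    # 4. Hand only the unseen events to the user's projection function instead
    #    of letting it read the original batch from closure scope.
    |> Ecto.Multi.merge(fn %{unseen: unseen} -> multi_fn.(Ecto.Multi.new(), unseen) end)
  end
end
```

The important part is step 4: the user's function receives the filtered events explicitly, so it can no longer re-project events that the watermark says were already handled.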
