feat(taskbroker): Add Sent Flag to Prevent Dropping Tasks on Push Failure by george-sentry · Pull Request #586 · getsentry/taskbroker

george-sentry · 2026-04-02T22:03:01Z

Linear

Description

Currently, taskworkers pull tasks from taskbrokers via RPC. This approach works, but has some drawbacks. Therefore, we want taskbrokers to push tasks to taskworkers instead. Read this page on Notion for more information.

Right now, I rely on processing_deadline to revert processing tasks back to pending if pushing them failed. This isn't good because every failed push consumes a processing attempt, so tasks can run out of attempts and be dropped needlessly.

This PR adds a sent column to the activation table to track whether a task was successfully sent after being fetched from the table. With this change, upkeep increments processing attempts only for tasks that are processing and have sent = true.

If the status is processing and sent = false, that means pushing failed or timed out (or didn't happen yet), and we can revert back to pending without incrementing processing attempts.
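The rule described above can be sketched as a small state transition. This is a minimal illustration, not the code in src/upkeep.rs: the `Activation` struct, the `Status` enum, the `on_deadline_expired` function, and the `max_attempts` cutoff that moves a task to `Failure` are all assumptions made for the example.

```rust
/// Hypothetical subset of task states; the real store has more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pending,
    Processing,
    Failure,
}

/// Hypothetical in-memory stand-in for a row in the activation table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Activation {
    pub status: Status,
    pub sent: bool,
    pub processing_attempts: u32,
}

/// Apply the deadline rule to a task whose processing deadline has expired.
/// Only a task that was actually sent consumes a processing attempt.
pub fn on_deadline_expired(task: &mut Activation, max_attempts: u32) {
    if task.status != Status::Processing {
        return; // the rule only concerns tasks that are still processing
    }
    if task.sent {
        // A worker received the task and did not finish in time.
        task.processing_attempts += 1;
    }
    // Clear the flag so the next claim starts from a clean state.
    task.sent = false;
    // Assumption: a task that has used up its attempts is failed, not retried.
    task.status = if task.processing_attempts >= max_attempts {
        Status::Failure
    } else {
        Status::Pending
    };
}
```

In this sketch, a task with sent = false goes back to pending with its attempt count unchanged, no matter how many times pushing fails.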

linear-code · 2026-04-02T22:03:04Z

STREAM-860 Add Sent Flag to Handle Push Failures

src/upkeep.rs

src/push/mod.rs

benches/store_bench.rs

src/grpc/server.rs

pg_migrations/0001_create_inflight_activations.sql

src/store/inflight_activation.rs

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


src/store/inflight_activation.rs

sentry · 2026-04-03T20:42:52Z

src/store/inflight_activation.rs

+        let mut rows = self
+            .claim_activations(application, namespaces.as_deref(), Some(1), None, true)
+            .await?;


Bug: A race condition in pull mode marks tasks as sent=true before delivery. A network failure during response transmission causes retry attempts to be consumed for undelivered tasks.
Severity: HIGH

Suggested Fix

The sent flag should only be marked as true after the gRPC response has been successfully delivered to the worker, similar to the implementation in push mode. This involves moving the logic that sets sent=true to after the gRPC call returns successfully, ensuring the database state reflects the actual delivery status.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.

Location: src/store/inflight_activation.rs#L456-L458

Potential issue: In pull mode, the `sent` flag for a task is set to `true` in the database before the task data is successfully transmitted to the worker. If a network failure occurs during the gRPC response transmission after the database write, the task is marked as `sent` but was never delivered. When the processing deadline for this task expires, the `handle_processing_deadline()` function will incorrectly increment the `processing_attempts` counter because it treats the task as successfully sent. This consumes a retry attempt for a task that never reached a worker, potentially causing it to be dropped prematurely.
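The ordering the suggested fix asks for (deliver first, then record sent = true) can be sketched with two injected callbacks. This is a hedged illustration only: `deliver_then_mark`, `deliver`, and `mark_sent` are hypothetical names standing in for the gRPC delivery and the database update, and whether the PR adopted this ordering for pull mode is not stated here.

```rust
/// Deliver a task, and record it as sent only if delivery succeeded.
///
/// `deliver` and `mark_sent` are hypothetical stand-ins for the network
/// call and the database write. If `deliver` fails, `mark_sent` is never
/// called, so the task keeps sent = false and no attempt is consumed
/// when its processing deadline later expires.
pub fn deliver_then_mark<E>(
    task_id: &str,
    deliver: impl FnOnce(&str) -> Result<(), E>,
    mark_sent: impl FnOnce(&str),
) -> Result<(), E> {
    // Return early on a delivery error, leaving the flag untouched.
    deliver(task_id)?;
    // Delivery succeeded, so the stored state may now say "sent".
    mark_sent(task_id);
    Ok(())
}
```

The remaining gap in this ordering is a crash between delivery and the write, which would leave a delivered task with sent = false; that trades a possibly lost attempt increment for never charging an attempt to an undelivered task.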

sentry · 2026-04-03T20:42:52Z

src/upkeep.rs

        if let Ok(tasks) = store
-            .get_pending_activations_from_namespaces(None, Some(&demoted_namespaces), None, None)
+            .claim_activations(None, Some(&demoted_namespaces), None, None, false)
            .await


Bug: Demoted namespace tasks with persistent Kafka publish failures enter an infinite retry loop, as their attempt counters are never incremented upon processing deadline expiration.
Severity: HIGH

Suggested Fix

Modify the logic for handling demoted namespace tasks to ensure that persistent failures consume retry attempts. This could involve either marking the task as sent=true before the Kafka publish attempt or introducing a separate mechanism to increment the attempt counter for this specific failure scenario, preventing the infinite loop.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.

Location: src/upkeep.rs#L301-L303

Potential issue: When handling demoted namespace tasks, the system claims them with `sent=false` and attempts to publish them to Kafka. If the Kafka publish fails, the task remains in a `processing` state with `sent=false`. When its processing deadline expires, the task is reverted to `pending` without incrementing its `processing_attempts` counter. This creates an infinite loop where a task with a persistent Kafka publish failure will be repeatedly claimed and reverted without ever consuming its retry budget, leading to wasted system resources.
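One of the two options in the suggested fix, a separate mechanism that charges an attempt when forwarding fails, might look like the sketch below. All names here (`Forwarding`, `Next`, `record_publish_result`, `max_attempts`) are hypothetical, and the source does not say which option the follow-up commit chose.

```rust
/// What the caller should do with the task after a publish attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Next {
    Done,
    RetryLater,
    GiveUp,
}

/// Hypothetical view of a demoted-namespace task being forwarded.
pub struct Forwarding {
    pub processing_attempts: u32,
}

/// Record the outcome of one publish attempt.
/// A failed publish consumes an attempt, so a persistent failure
/// reaches `GiveUp` after at most `max_attempts` tries.
pub fn record_publish_result(task: &mut Forwarding, published: bool, max_attempts: u32) -> Next {
    if published {
        return Next::Done;
    }
    task.processing_attempts += 1;
    if task.processing_attempts >= max_attempts {
        Next::GiveUp
    } else {
        Next::RetryLater
    }
}
```

Because the counter moves on every failed publish, the claim and revert cycle described in the report is bounded instead of repeating indefinitely.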

Add Sent Flag to Prevent Dropping Tasks on Push Failure

358edc1

george-sentry requested a review from a team as a code owner April 2, 2026 22:03

sentry bot reviewed Apr 2, 2026

src/upkeep.rs (resolved)

src/push/mod.rs (resolved)

cursor bot reviewed Apr 2, 2026

benches/store_bench.rs (outdated, resolved)

Add Metrics for Processing Deadline Resets, Fix AI Reviewer Bugs

7084a24

sentry bot reviewed Apr 3, 2026

src/grpc/server.rs (resolved)

cursor bot reviewed Apr 3, 2026

pg_migrations/0001_create_inflight_activations.sql (outdated, resolved)

Split Postgres Changes into Migrations

1d248a1

sentry bot reviewed Apr 3, 2026

src/store/inflight_activation.rs (outdated, resolved)

cursor bot reviewed Apr 3, 2026

src/store/inflight_activation.rs (outdated, resolved)

Handle Claim One Invariant Gracefully

688dc04

sentry bot reviewed Apr 3, 2026

george-sentry wants to merge 4 commits into main from george/push-taskbroker/add-sent-flag
