SDK correctness and resilience improvements #336

Merged
PastelStorm merged 2 commits into main from evoss/sdk-correctness-and-resilience
Apr 4, 2026
Conversation

@PastelStorm (Contributor) commented Apr 4, 2026

Note

Medium Risk
Touches core split-PDF execution and retry/timeout cleanup logic; mistakes could impact partition reliability or leak resources, though changes are well-covered by expanded unit/integration tests and logging.

Overview
Improves split-PDF correctness and debuggability by adding operation-aware observability (plan/batch/chunk lifecycle logs) and propagating split metadata via X-Unstructured-Split-* headers into errors/logs.
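A rough illustration of the header propagation described above: the sketch below collects `X-Unstructured-Split-*` headers from a header mapping and appends them to an error message. Only the header prefix comes from this PR description. The helper names and the two header suffixes (`Operation-Id`, `Chunk-Index`) are assumptions made for illustration and may not match the SDK.

```python
from typing import Mapping

SPLIT_HEADER_PREFIX = "x-unstructured-split-"  # prefix named in the PR overview


def extract_split_metadata(headers: Mapping[str, str]) -> dict[str, str]:
    """Return only the split-related headers, keyed by lowercased name."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith(SPLIT_HEADER_PREFIX)
    }


def format_error_with_split_metadata(message: str, headers: Mapping[str, str]) -> str:
    """Append split metadata to an error message so logs identify the chunk."""
    metadata = extract_split_metadata(headers)
    if not metadata:
        return message
    details = ", ".join(f"{key}={value}" for key, value in sorted(metadata.items()))
    return f"{message} [{details}]"


example_headers = {
    "Content-Type": "application/json",
    "X-Unstructured-Split-Operation-Id": "op-123",  # hypothetical header suffix
    "X-Unstructured-Split-Chunk-Index": "4",        # hypothetical header suffix
}
error_text = format_error_with_split_metadata("chunk request failed", example_headers)
```

Matching on a lowercased prefix keeps the lookup independent of header casing, which HTTP treats as insignificant.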

Hardens split execution: per-operation state is isolated, transport exceptions/cancellations are handled explicitly (with optional partial-results behavior via split_pdf_allow_failed), and timeout/cleanup paths now safely cancel in-flight work even when event loops are closed.
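The timeout cleanup mentioned above can be sketched with plain `asyncio`. This is a minimal sketch under stated assumptions, not the SDK's implementation: `run_chunks_with_timeout` and `_chunk` are hypothetical names, the demo chunks never raise, and the closed-event-loop case from the description is not covered here.

```python
import asyncio


async def run_chunks_with_timeout(coros, timeout: float):
    """Run chunk coroutines; when the timeout expires, cancel in-flight work."""
    tasks = [asyncio.create_task(coro) for coro in coros]
    done, pending = await asyncio.wait(tasks, timeout=timeout)

    # Cancel anything still running, then wait for the cancellations to settle
    # so no task is left dangling after this function returns.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    # Demo assumption: finished chunks did not raise, so result() is safe.
    completed = {index: task.result() for index, task in enumerate(tasks) if task in done}
    return completed, len(pending)


async def _chunk(index: int, delay: float) -> int:
    """Hypothetical chunk request that just sleeps and returns its index."""
    await asyncio.sleep(delay)
    return index


async def _demo():
    return await run_chunks_with_timeout([_chunk(0, 0.01), _chunk(1, 5.0)], timeout=0.2)


completed, cancelled_count = asyncio.run(_demo())
```

Awaiting the cancelled tasks with `return_exceptions=True` absorbs each `CancelledError` instead of letting it propagate to the caller.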

Preserves chunk-level transport retries by deriving a split-specific retry config that always retries httpx.TransportError for chunk calls, even when SDK-level connection retries are disabled. CI/test tooling is updated (new platform integration job/target, more verbose integration output, and bumped GitHub Action versions), and the package is released as 0.43.1.
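One way to picture the derived retry config is shown below. It is a sketch under assumptions: `RetrySettings`, `derive_split_retry_settings`, and `call_with_retries` are hypothetical names, and the local `TransportError` class stands in for `httpx.TransportError` so the example has no dependencies.

```python
import time
from dataclasses import dataclass, replace


class TransportError(Exception):
    """Stand-in for httpx.TransportError; keeps the sketch dependency-free."""


@dataclass(frozen=True)
class RetrySettings:  # hypothetical type, not the SDK's real retry config
    max_attempts: int = 3
    backoff_seconds: float = 0.0
    retry_connection_errors: bool = False


def derive_split_retry_settings(base: RetrySettings) -> RetrySettings:
    """Copy the caller's settings but always retry transport errors for chunks."""
    return replace(base, retry_connection_errors=True)


def call_with_retries(func, settings: RetrySettings):
    """Call func, retrying on TransportError only when the settings allow it."""
    attempt = 1
    while True:
        try:
            return func()
        except TransportError:
            if not settings.retry_connection_errors or attempt >= settings.max_attempts:
                raise
            time.sleep(settings.backoff_seconds * attempt)
            attempt += 1


sdk_settings = RetrySettings(retry_connection_errors=False)
chunk_settings = derive_split_retry_settings(sdk_settings)

calls = {"count": 0}


def flaky_chunk_call() -> str:
    """Fail twice with a transport error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransportError("connection reset")
    return "ok"


result = call_with_retries(flaky_chunk_call, chunk_settings)
```

Deriving a copy leaves the SDK-level settings untouched, so disabling connection retries for ordinary calls does not also disable them for chunk calls.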

Reviewed by Cursor Bugbot for commit e65ce5b. Bugbot is set up for automated code reviews on this repo.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

result.index,
_create_transport_error_response(result.inner),
)
)
Unreachable ChunkExecutionError branch in allow-failed path

Low Severity

In run_tasks, when allow_failed=True, the isinstance(result, ChunkExecutionError) check is unreachable dead code. ChunkExecutionError is only raised by _order_keeper, which wraps coroutines — but _order_keeper is exclusively used in the allow_failed=False path (via asyncio.create_task(_order_keeper(...))). In the allow_failed=True path, armed_coroutines are gathered directly without _order_keeper, so exceptions will always be raw BaseException subclasses, never ChunkExecutionError. The subsequent elif isinstance(result, BaseException) branch handles all cases correctly, making the ChunkExecutionError branch dead code that adds unnecessary complexity.
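The reviewer's point can be reproduced in isolation. The sketch below is based only on the description in this comment, so the real `run_tasks`, `_order_keeper`, and `ChunkExecutionError` may differ in detail. It shows that gathering coroutines directly with `return_exceptions=True` yields the raw exception, while only the wrapped path yields a `ChunkExecutionError`.

```python
import asyncio


class ChunkExecutionError(Exception):
    """Simplified stand-in: wraps the original exception with its chunk index."""

    def __init__(self, index: int, inner: BaseException):
        super().__init__(f"chunk {index} failed: {inner!r}")
        self.index = index
        self.inner = inner


async def _order_keeper(index: int, coro):
    """Wrapper that tags results and failures with the chunk index."""
    try:
        return index, await coro
    except Exception as exc:
        raise ChunkExecutionError(index, exc) from exc


async def _failing_chunk():
    raise ConnectionError("boom")


async def gather_unwrapped():
    # Mirrors the described allow_failed=True path: no wrapper around the coroutine.
    return await asyncio.gather(_failing_chunk(), return_exceptions=True)


async def gather_wrapped():
    # Mirrors the described allow_failed=False path: the wrapper is applied.
    return await asyncio.gather(
        _order_keeper(0, _failing_chunk()), return_exceptions=True
    )


unwrapped = asyncio.run(gather_unwrapped())
wrapped = asyncio.run(gather_wrapped())
```

In this sketch `unwrapped[0]` is a `ConnectionError` and never a `ChunkExecutionError`, so an `isinstance(result, ChunkExecutionError)` check on the unwrapped results could not match.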

@PastelStorm PastelStorm merged commit 39b8263 into main Apr 4, 2026
20 checks passed
@PastelStorm PastelStorm deleted the evoss/sdk-correctness-and-resilience branch April 4, 2026 11:36
2 participants