feat: add REDUCE_PAGE_SIZE response action for dynamic page-size reduction #1056
Draft
Daryna Ishchenko (darynaishchenko) wants to merge 3 commits into
…ction

Add a new ResponseAction.REDUCE_PAGE_SIZE that halves the page size on server errors (e.g. 502/504) and resets after a successful fetch.

Components:
- ResponseAction.REDUCE_PAGE_SIZE enum value
- PageSizeReductionRequiredException raised by HttpClient
- reduce_page_size()/reset_page_size() on all PaginationStrategy impls
- SimpleRetriever catches the exception and delegates to the strategy
- Declarative schema and Pydantic model updates

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
👋 Greetings, Airbyte Team Member! Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

    # Run the CLI from this branch:
    uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1782130672-reduce-page-size#egg=airbyte-python-cdk[dev]' --help

    # Update a connector to use the CDK from this branch ref:
    cd airbyte-integrations/connectors/source-example
    poe use-cdk-branch devin/1782130672-reduce-page-size
Strategy-level tests:
- reduce_page_size() halves correctly for all strategies
- reduce_page_size() floors at 1
- reset_page_size() restores the configured default
- reduce_page_size() is a no-op when page_size is None
- Multiple successive reductions accumulate

Retriever-level integration tests:
- PageSizeReductionRequiredException retries with same page token
- Page size is actually halved during the retry (observed via side effect)
- Page size resets to default after successful fetch

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
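The strategy-level behaviour listed above (halve, floor at 1, reset, no-op when unset) can be illustrated with a minimal sketch. `HalvingPageSize` and `configured_page_size` are hypothetical names used only for illustration; this is not the CDK's `PaginationStrategy` code, and integer floor division is assumed as the halving rule.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HalvingPageSize:
    """Hypothetical stand-in for a pagination strategy; not the CDK class."""

    configured_page_size: Optional[int]
    page_size: Optional[int] = field(init=False)

    def __post_init__(self) -> None:
        # Start from the configured default.
        self.page_size = self.configured_page_size

    def reduce_page_size(self) -> None:
        # No-op when no page size is configured.
        if self.page_size is None:
            return
        # Halve with floor division (an assumption), never going below 1.
        self.page_size = max(1, self.page_size // 2)

    def reset_page_size(self) -> None:
        # Restore the configured default, e.g. after a successful fetch.
        self.page_size = self.configured_page_size
```

Successive calls accumulate (100, 50, 25, ...) until the floor of 1 is reached, and a single `reset_page_size()` call returns to the configured value.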
Assert that request_params['page_size'] is 100 on the original request, 100 on the failed request, and 50 on the retry after reduction.

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
Summary
Adds a new ResponseAction.REDUCE_PAGE_SIZE that enables declarative connectors to dynamically halve the page size when the API returns server errors (e.g. 502/504), then reset to the configured default after a successful fetch. This addresses the need from airbytehq/airbyte-internal-issues#16492 for source-github's GraphQL streams that currently implement ad-hoc page-size reduction via direct stream.page_size mutation.

Signal flow:
Key difference from source-github's current approach: the CDK version actually retries the failed request with the reduced page size (by re-entering _fetch_next_page(), which rebuilds the request), whereas source-github's current implementation retries the same PreparedRequest with the original size baked in and only benefits from the reduction on subsequent pages.

Follows the existing RESET_PAGINATION/PaginationResetRequiredException pattern. All three concrete strategies (CursorPaginationStrategy, OffsetIncrement, PageIncrement) implement the new reduce_page_size()/reset_page_size() methods.

Declarative YAML usage:
Test coverage: Strategy-level tests for all 3 implementations (halving, floor-at-1, reset, no-op-on-None) plus retriever-level integration tests verifying the retry uses the same page token with the reduced size and that the page size resets after success.
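The retry behaviour these retriever-level tests check can be sketched in plain Python. Everything here is a hypothetical stand-in: `PageSizeReductionRequired`, `read_all`, and `flaky_fetch` are illustrative names, not the CDK's `SimpleRetriever` or `HttpClient`, and re-raising once the size reaches 1 is a choice made only to keep the sketch from looping forever.

```python
from typing import Callable, List, Optional, Tuple

# A fetch function takes (page_token, page_size) and returns (records, next_token).
Fetch = Callable[[Optional[int], int], Tuple[List[int], Optional[int]]]


class PageSizeReductionRequired(Exception):
    """Hypothetical stand-in for PageSizeReductionRequiredException."""


def read_all(fetch: Fetch, default_size: int) -> Tuple[List[int], List[int]]:
    """Return (records, page sizes sent), halving the size on failure."""
    records: List[int] = []
    sizes_sent: List[int] = []
    token: Optional[int] = None
    size = default_size
    while True:
        sizes_sent.append(size)
        try:
            page, next_token = fetch(token, size)
        except PageSizeReductionRequired:
            if size == 1:
                raise  # sketch-only choice: cannot shrink further
            size = max(1, size // 2)
            continue  # same token, request rebuilt with the smaller size
        records.extend(page)
        size = default_size  # reset after a successful fetch
        if next_token is None:
            return records, sizes_sent
        token = next_token


def flaky_fetch(token: Optional[int], size: int) -> Tuple[List[int], Optional[int]]:
    """Fake offset-paginated API that rejects large requests for its second page."""
    offset = token or 0
    total = 250
    if offset == 100 and size > 50:
        raise PageSizeReductionRequired()
    end = min(offset + size, total)
    return list(range(offset, end)), (end if end < total else None)


records, sizes_sent = read_all(flaky_fetch, default_size=100)
```

With this fake API the sizes sent are 100 (first page), 100 (failed second page), 50 (retry of the same page), then 100 again once the retry has succeeded, and no records are skipped or duplicated.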
Requested by Daryna Ishchenko (@darynaishchenko).
Link to Devin session: https://app.devin.ai/sessions/d00a1a4ed316400cab27fd1fb87681fb