Skip to content

fix: JsonItemsParser should yield floats, not Decimals#1052

Merged
Anatolii Yatsuk (tolik0) merged 2 commits into
mainfrom
tolik0/cdk/jsonitems-use-float
Jun 12, 2026
Merged

fix: JsonItemsParser should yield floats, not Decimals#1052
Anatolii Yatsuk (tolik0) merged 2 commits into
mainfrom
tolik0/cdk/jsonitems-use-float

Conversation

@tolik0

@tolik0 Anatolii Yatsuk (tolik0) commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

What

Follow-up to #1049. JsonItemsParser calls ijson.items(...) without use_float=True, so non-integer JSON numbers are parsed as decimal.Decimal. The CDK can't serialize Decimal downstream — orjson raises TypeError: Type is not JSON serializable: decimal.Decimal — so any JsonItemsDecoder stream with decimal fields fails at serialization.

Discovered while migrating source-amazon-seller-partner's report streams: Brand Analytics (clickShare, conversionShare), Sales & Traffic, and Vendor reports all carry decimals and broke on read. The original GzipJsonDecoder used json.loads (floats), so this is a regression.

How

Pass use_float=True to ijson.items, so non-integer numbers come back as float (integers stay int), matching the json.loads/orjson behavior of JsonParser/JsonLineParser. Adds a regression test asserting floats and orjson-serializability.

Validation

Verified on a real 3.45 GB Amazon Search Terms report: clickShare/conversionShare now float (were Decimal), and orjson.dumps succeeds. 42 decoder unit tests pass.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Corrected JSON numeric parsing so non-integer numbers are produced as standard floating-point values, resolving downstream serialization errors and improving compatibility with JSON serialization libraries.
  • Tests
    • Added unit coverage to verify numeric types and serialization compatibility.

ijson.items() parses non-integer JSON numbers as decimal.Decimal by default, which the
CDK cannot serialize downstream (orjson raises 'Decimal is not JSON serializable'). This
broke any JsonItemsDecoder stream with decimal fields (e.g. Amazon Brand Analytics
clickShare/conversionShare, Sales & Traffic, Vendor reports).

Pass use_float=True so non-integer numbers are parsed as float, matching the
json.loads/orjson behavior of the other JSON parsers. Adds a regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 12, 2026 15:15
@github-actions

Copy link
Copy Markdown

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@tolik0/cdk/jsonitems-use-float#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch tolik0/cdk/jsonitems-use-float

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment
📚 Show Repo Guidance

Helpful Resources

📝 Edit this welcome message.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a serialization regression in the CDK’s streaming JSON path parser (JsonItemsParser) where non-integer JSON numbers were being parsed as decimal.Decimal (breaking downstream orjson serialization). It aligns JsonItemsParser numeric parsing with json.loads/orjson by enabling float parsing in ijson.

Changes:

  • Pass use_float=True to ijson.items(...) so non-integer numbers are parsed as float instead of Decimal.
  • Add a regression unit test asserting float/int types and orjson-serializability for decoded records.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py Ensures JsonItemsParser yields floats (not Decimal) for non-integer JSON numbers via use_float=True.
unit_tests/sources/declarative/decoders/test_composite_decoder.py Adds regression coverage to confirm float parsing and orjson serialization compatibility.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread unit_tests/sources/declarative/decoders/test_composite_decoder.py Outdated
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a66de275-c827-4f4d-9985-cdce6824f331

📥 Commits

Reviewing files that changed from the base of the PR and between 45aca33 and fba766b.

📒 Files selected for processing (1)
  • unit_tests/sources/declarative/decoders/test_composite_decoder.py

📝 Walkthrough

Walkthrough

This PR modifies JSON number parsing in the declarative source framework. The JsonItemsParser now explicitly uses use_float=True when calling ijson.items, ensuring non-integer JSON numbers are produced as float values rather than Decimal. A corresponding unit test validates the numeric type output and confirms records remain serializable by orjson.

Changes

JSON numeric type handling

Layer / File(s) Summary
Float-based number parsing in JsonItemsParser
airbyte_cdk/sources/declarative/decoders/composite_raw_decoder.py, unit_tests/sources/declarative/decoders/test_composite_decoder.py
JsonItemsParser.parse now calls ijson.items(..., use_float=True) to parse non-integer JSON numbers as float instead of Decimal, aligning with json.loads/orjson behavior. A new test verifies fractional numbers yield float, integer-valued fields yield int, and the resulting record is orjson-serializable.

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: fixing JsonItemsParser to yield floats instead of Decimals, which directly addresses the bug described in the PR objectives.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch tolik0/cdk/jsonitems-use-float

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown

PyTest Results (Fast)

4 118 tests  +1   4 107 ✅ +2   8m 33s ⏱️ +57s
    1 suites ±0      11 💤  - 1 
    1 files   ±0       0 ❌ ±0 

Results for commit fba766b. ± Comparison against base commit 8ee6423.

♻️ This comment has been updated with latest results.

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown

PyTest Results (Full)

4 121 tests  +1   4 109 ✅ +1   11m 53s ⏱️ -16s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit fba766b. ± Comparison against base commit 8ee6423.

♻️ This comment has been updated with latest results.

@tolik0

Anatolii Yatsuk (tolik0) commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@tolik0 Anatolii Yatsuk (tolik0) merged commit 626c753 into main Jun 12, 2026
29 checks passed
@tolik0 Anatolii Yatsuk (tolik0) deleted the tolik0/cdk/jsonitems-use-float branch June 12, 2026 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants