-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Implement error collection for deserialization (#1196) #5364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 3.x
Are you sure you want to change the base?
Implement error collection for deserialization (#1196) #5364
Conversation
|
|
||
| // Create per-call attributes with the fresh bucket | ||
| ContextAttributes perCallAttrs = _config.getAttributes() | ||
| .withPerCallAttribute(tools.jackson.databind.deser.CollectingProblemHandler.class, bucket); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, avoid using fully qualified class names - use imports so the code can be kept concise
|
-1 on this PR itself due to the sheer amount of line changes. Ai code assistants should really get trained on splitting tasks to safer(or at least feeling so) bits of progression |
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Outdated
Show resolved
Hide resolved
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Outdated
Show resolved
Hide resolved
- Add CollectedProblem: immutable value object for error details - Add DeferredBindingException: aggregate exception for multiple errors - Add CollectingProblemHandler: stateless handler collecting recoverable errors - Add ObjectReader.collectErrors() and readValueCollecting() methods - Add comprehensive test suite (27 tests) covering all scenarios Features: - Opt-in per-call error collection (no global config) - Thread-safe with per-call bucket isolation - RFC 6901 compliant JSON Pointer paths - DoS protection with configurable limit (default 100) - Primitive vs reference type default value policy - Suppressed exception support for hard failures Tests verify: - Per-call bucket isolation (concurrent + successive) - JSON Pointer escaping (tilde, slash, combined) - Limit reached behavior - Unknown property handling - Default value policy - Message formatting - Edge cases 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Address review feedback from specs/issue-1196-collecting-deserialization-errors-claude-review-gpt-5.md: API Surface & Delegation: - Fix ObjectReader.readValueCollecting() to use public API `this.with(perCallAttrs).readValue(p)` instead of protected `_new(...)` factory - Maintains consistency with Jackson's builder pattern and public surface Limit Resolution: - Read max-problem cap from per-call reader config instead of base _config - Properly honors per-call attribute overrides - Affects both normal completion and hard failure paths Javadoc Enhancements: - Add comprehensive class-level Javadoc to DeferredBindingException with usage examples - Enhance CollectedProblem Javadoc explaining all fields, truncation, and immutability - Expand CollectingProblemHandler Javadoc detailing design, recoverable errors, and DoS protection - Improve ObjectReader.readValueCollecting() Javadoc noting behavior without collectErrors() and parser filtering differences Testing: - All 27 CollectingErrorsTest tests pass - Full suite: 4,662 tests pass, 0 failures, 0 errors - No regressions introduced 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add helper method overloads for all input types (File, InputStream, Reader) to eliminate inline catchThrowableOfType duplication - Extract buildInvalidOrderJson() helper to avoid duplicated JSON building logic across 4 tests - Use StandardCharsets.UTF_8 instead of checked exception "UTF-8" string - Improve executor shutdown safety with shutdownNow() fallback and proper InterruptedException handling - Enhance concurrent test to verify exact unique error values, catching any bucket-sharing regressions - Use Files.deleteIfExists() for robust temp file cleanup - Streamline hard failure test to focus on suppressed exception mechanics - Replace hand-rolled try/catch blocks with expectDeferredBinding() helper in 4 additional tests All 31 tests pass. Code is more maintainable with consistent patterns and reduced duplication. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add release notes entry in VERSION for 3.1.0 - Add tutorial section in README.md matching existing tone - Explain problem: fix-retry cycle vs collecting all errors at once - Show basic usage with code examples - Document DoS protection with 100 error default limit - Clarify best-effort nature and what can/cannot be collected - Document JSON Pointer paths (RFC 6901) for error locations - Explain thread-safe per-call error buckets - List practical use cases: API validation, batch processing, tooling Addresses 9-year-old feature request from Oliver Drotbohm covering key concerns from discussion: DoS protection, thread safety, recoverable errors, and error reporting. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…ception handling - Replace deprecated fail() calls with AssertJ assertions - Fix deprecated catchThrowableOfType() parameter order (Class, lambda) - Remove unnecessary throws Exception declarations from 27 test methods - Keep throws Exception only for tests using try-with-resources All 31 tests pass successfully. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Restore src/test/java/module-info.java that was accidentally removed in commit b7fbb39. This file is required for the test module system configuration. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…umentation This commit addresses all technical feedback from reviewers @pjfanning, @cowtowncoder, and @JooHyukKim on PR FasterXML#5364. API Changes: - Rename collectErrors() → problemCollectingReader() - Rename readValueCollecting() → readValueCollectingProblems() - Rationale: "Error" conflicts with Java's Error class (OutOfMemoryError). "Problem" aligns with existing DeserializationProblemHandler terminology. Code Quality: - Add proper imports for CollectingProblemHandler, CollectedProblem, DeferredBindingException - Replace 6 fully qualified class names with short names - Add missing @throws javadoc tags Remove Code Duplication: - Refactor buildJsonPointer() to use jackson-core's JsonPointer API (appendProperty/appendIndex methods) - Delete custom escapeJsonPointerSegment() method (~8 lines) - Leverage tested jackson-core RFC 6901 escaping implementation Enhanced Documentation: - Add architecture rationale to CollectingProblemHandler explaining why context attributes are used (thread-safety, call isolation, immutability, config vs state separation) - Add exception handling strategy to readValueCollectingProblems() explaining why only DatabindException is caught (not all JacksonExceptions - streaming errors should fail fast) - Add handler replacement warning to problemCollectingReader() - Update DeferredBindingException javadoc with new API names Files changed: 5 - README.md (tutorial examples) - ObjectReader.java (main API) - CollectingProblemHandler.java (implementation) - DeferredBindingException.java (exception javadoc) - CollectingErrorsTest.java (test updates) All tests pass. No behavioral changes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
200557a to
f0c4342
Compare
|
Thanks for the feedback.
If it's some consolation, the actual logic change in ObjectReader and CollectingProblemHandler is only like 1/3 of the total changes. Others are from domain classes, documentation and tests. As this is not a small change (atleast from my POV) this seemed reasonable for me.
Reg Coding Assistants and splitting tasks, this is indeed true. Once their context window starts filling up their answers are not that reliable. Let me explain my process how I overcome this (something I also do in my job).
If you see any code blocks that could be improved/simplified or if you have any feedback to my development process (mentioned above), I would be happy to learn and improve on it in future. |
|
@cowtowncoder Just wanted to check in as the PR hasn't been reviewed yet. |
|
@sri-adarsh-kumar I haven't had time to re-review, but not waiting for anything from your side. Biggest question is that of how API should work, functionality be exposed. |
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Outdated
Show resolved
Hide resolved
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Outdated
Show resolved
Hide resolved
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Outdated
Show resolved
Hide resolved
| return true; // Problem handled | ||
| } | ||
|
|
||
| return false; // Limit reached, let default handling throw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we not instead throw "full" DeferredBindingException? To avoid losing 100 (etc) collected problems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how it should be. This is also verified under CollectingErrorsTest#defaultLimit() test
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Show resolved
Hide resolved
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Outdated
Show resolved
Hide resolved
| List<CollectedProblem> bucket = new ArrayList<>(); | ||
|
|
||
| // Create per-call attributes with the fresh bucket | ||
| ContextAttributes perCallAttrs = _config.getAttributes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this, let's make use of DeserializationContext.setAttribute() -- DeserializationContext is per-call, mutable.
And to have access to the DC, need to refactor to have something like
protected <T> T readValueCollectingProblems(JsonParser p, DeserializationContextExt ctxt)
that all public readValueCollectingProblems() call with parser, context created.
Finally, we should NOT call readValue(parser) because that would just create yet another DeserializationContext. This means it'll be necessary to call one _bind() or _bindAndClose().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored readValueCollectingProblems so every overload shares a single DeserializationContextExt, stores the bucket via ctxt.setAttribute(...), and calls _bind/_collectingBindAndClose directly without creating new ObjectReader instances.
…i-adarsh-kumar/jackson-databind into issue-1196-error-collection-attempt-1
src/main/java/tools/jackson/databind/deser/CollectingProblemHandler.java
Outdated
Show resolved
Hide resolved
|
Ok. Phew! Had another pass and now I understand this much better. I like the fact this one of remaining "Most Wanted" issues. There are a few things to address so added quite a few notes. But we are getting there. :) (and once again, I am EXCITED about solving this long-standing HIGHLY VOTED FOR feature request!) |
…factor context handling, leverage existing utilities Addresses reviewer feedback by moving max-problem configuration into CollectingProblemHandler constructor, refactoring ObjectReader to use single DeserializationContextExt per call with direct _bind() invocation (eliminating temporary ObjectReader allocation), replacing manual JSON Pointer construction with TokenStreamContext.pathAsPointer(), using ClassUtil.getClassDescription() for consistent class name formatting, removing defensive exception catching, and fixing array null handling to allow null values in arrays of reference types. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…i-adarsh-kumar/jackson-databind into issue-1196-error-collection-attempt-1
When the problem collection limit is reached during deserialization, throw DeferredBindingException as the primary exception with the original DatabindException as suppressed, ensuring API contract consistency and correct exception handling for callers. Also inline canReturnNullFor method to simplify code by removing unnecessary abstraction. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
Correct README to clarify that when problems are collected, the DeferredBindingException contains only the problems, not the partial result object. The partial result is discarded when problems exist. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
…ndAndClose parity Introduced two internal helpers (_collectingBind and _collectingBindAndClose) to match the parser lifecycle semantics of existing _bind/_bindAndClose methods. All convenience overloads (String, byte[], File, InputStream, Reader) now properly close parsers via _collectingBindAndClose, establishing parity with regular readValue methods. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
PR: Implement error collection for deserialization (#1196)
AI Disclosure
This is an AI-generated solution. The requirements were specified by a human developer based on a 9-year-old feature request from @odrotbohm, and the implementation was fully built by Claude Code (an AI coding agent).
Overview
Adds opt-in error collection capability to Jackson deserialization, allowing applications to collect multiple deserialization errors instead of failing fast on the first error. This addresses issue #1196 requested by Oliver Drotbohm in 2016.
Target branch:
3.x(for Jackson 3.1.0)Motivation
Currently, Jackson fails fast on the first deserialization error, requiring a fix-retry cycle to discover all problems in a payload. This is inefficient for:
Key Features
1. Opt-in Per-Call Design
collectErrors()onObjectReader2. DoS Protection
3. Best-Effort Collection
4. JSON Pointer Paths (RFC 6901)
/items/0/price~→~0,/→~15. Thread Safety
ObjectReaderacross threadsAPI Usage
Basic Usage
Custom Error Limit
Multiple Input Types
Implementation Details
New Classes
1.
CollectedProblemImmutable value object representing a single deserialization error:
path: JSON Pointer to problematic field (e.g.,/items/0/price)message: Human-readable error messagerawValue: The invalid input value that caused the errortargetType: Expected Java type2.
DeferredBindingExceptionAggregate exception thrown when errors are collected:
problems: List of all collected errorspartialResult: The partially deserialized object (with default values for failed fields)limitReached: Flag indicating if error limit was hitDatabindExceptionfor compatibility3.
CollectingProblemHandlerStateless problem handler that collects errors during deserialization:
DeserializationProblemHandlerinterfaceDeserializationContext4.
ObjectReaderExtensionsNew methods:
collectErrors(): Enable collection with default limit (100)collectErrors(int maxProblems): Enable collection with custom limitreadValueCollecting(...): Read with error collection (5 overloads for different input types)Design Decisions
readValueCollecting()call gets a fresh error list stored inDeserializationContextattributesTesting
Comprehensive test suite with 31 tests covering:
Core Functionality
Edge Cases
Backward Compatibility
DatabindExceptionPerformance Considerations
Use Cases
API Validation
Batch Processing
Limitations
Best-effort: Not all errors can be collected
Default values: Failed fields get defaults
Error limit: Default 100 errors
Future Enhancements
Potential future improvements (not in this PR):
Generated by: Claude Code (Anthropic)
Specified by: @sri-adarsh-kumar
Original feature request: @odrotbohm (2016)