Conversation
As opposed to a recursive loop. We do this by keeping a stack of frames (very similar to how the stack of values was already stored). Each frame represents the state of a container. Since there are only 2 in JSON, it doesn't have to get too complex.
Each frame in the iterative parser now holds an enum describing its "phase", in order to support suspending parsing.
Yep. That's why I toyed with this idea. Your implementation is extremely close to what I had in mind. Looks pretty good at first glance, but I'll need to find some block of time to review this carefully. Thanks a lot! |
|
Great! Could you share benchmark results with/without this? We can use https://github.com/ruby/json/tree/master/benchmark . |
There was a problem hiding this comment.
Pull request overview
This PR refactors the C extension JSON parser from a recursive-descent implementation to an iterative state machine driven by an explicit “frame stack” of container parse states. This is foundational work toward streaming-style parsing and also avoids C call-stack exhaustion when parsing very deeply nested JSON with max_nesting disabled.
Changes:
- Introduces a new
json_frame_stack(with spill-to-heap and Ruby TypedData lifecycle management) to track container parsing state without recursion. - Replaces the recursive
json_parse_anyimplementation with an iterative loop using per-frame “phases” (VALUE/KEY/COLON/COMMA/DONE) to drive parsing. - Wires the frame stack into
JSON_ParserStateand initializes/cleans it up incParser_parse.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // resume purely from the frame stack. A JSON_FRAME_ROOT frame sits at the | ||
| // bottom of the stack, so the stack is never empty mid-parse and the document | ||
| // itself is just another frame whose value, once parsed, leaves its phase DONE. | ||
| static VALUE json_parse_any(JSON_ParserState *state, JSON_ParserConfig *config) |
|
At least on my machine, there is a 5-15% regression on most benchmarks. But we might be able to reclaim some of that. |
|
By introducing just a couple "computed gotos," the one benchmark that got hit the most can be made 15% faster: 3f15d4d And it's now on par with master: And that's just a quick hack, I couldn't add a computed goto for the main issue in that benchmark which is I think if we split the This also makes |
JSON_PHASE_ARRAY_COMMA and JSON_PHASE_OBJECT_COMMA Allows to remove one conditional.
Saves having to go through the dispatch loop again.
Take less arguments so it's easier to read.
|
After the last change, most benchmarks are now on par if not a little bit faster. A few are still a little bit slower but I'll see what I can do. Also There's always a bit of variance, so 1.04x might not be very significative: |
|
I just realized I made a mistake when benchmarking the initial version, the regression was 5-15% across the board. I updated my previous comment. |
Switch the recursive descent to an iterative algorithm by keeping a stack of "frames". Each frame represents parsing a container (the points at which you would recurse). Within frames there are "phases"; each phase maps to effectively a token that its looking for next.
This does not get all of the way to streaming parsing (that would require partial-token support), but it gets most of the way there, and is all required pre-work.
On my machine there's no noticeable slow-down. (There's also no noticeable speed-up.) The only really noticeable side-effect is that if you pass
max_nesting: false, this will not crash anymore from running out of stack space.