7: Convert AST nodes to frozen dataclasses (70% faster decode, 40% faster parsing) #256
Introduce benchmarks using a large (~117KB) GraphQL query to measure parse and pickle serialization performance. These provide a baseline for comparing serialization approaches in subsequent commits.

Baseline performance (measured on a MacBook Pro M4 Max):
- Parse: 81ms
- Pickle encode: 24ms
- Pickle decode: 42ms
- Roundtrip: 71ms
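For readers who want to reproduce this kind of measurement, here is a minimal sketch of timing pickle encode, decode, and roundtrip with the standard library only. It is not the benchmark suite from this PR: `best_of` and `make_tree` are hypothetical helpers, and the nested dict is only a stand-in for a parsed document.

```python
import pickle
import time
from typing import Any, Callable


def best_of(fn: Callable[[], Any], repeats: int = 5) -> float:
    """Return the fastest wall-clock time of `fn` in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000)
    return min(timings)


def make_tree(depth: int, width: int) -> dict:
    """Build a hypothetical stand-in for a parsed document (not a real AST)."""
    if depth == 0:
        return {"kind": "name", "value": "field"}
    children = [make_tree(depth - 1, width) for _ in range(width)]
    return {"kind": "field", "selections": children}


tree = make_tree(depth=6, width=4)
encoded = pickle.dumps(tree)

encode_ms = best_of(lambda: pickle.dumps(tree))
decode_ms = best_of(lambda: pickle.loads(encoded))
roundtrip_ms = best_of(lambda: pickle.loads(pickle.dumps(tree)))
print(f"encode {encode_ms:.2f}ms, decode {decode_ms:.2f}ms, roundtrip {roundtrip_ms:.2f}ms")
```

Taking the minimum over several repeats reduces noise from other processes; absolute numbers will differ from the figures quoted above because the input and hardware differ.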
Prepares the AST for immutability by using tuples instead of lists for collection fields. This aligns with the JavaScript GraphQL library, which uses readonly arrays, and enables future frozen data structures.
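A small illustration of what the switch to tuples means for calling code. The `arguments` value here is a made-up example, not a field taken from the library.

```python
# A collection field held as a tuple rather than a list (illustrative data).
arguments: tuple[str, ...] = ("first", "after")

# In-place mutation is rejected by the tuple type itself.
try:
    arguments[0] = "last"  # type: ignore[index]
except TypeError as error:
    print(f"rejected: {error}")

# An "edit" builds a new tuple and leaves the original untouched.
extended = arguments + ("orderBy",)
replaced = ("last",) + arguments[1:]

# Tuples of hashable items are hashable, so they can serve as dict keys.
cache = {arguments: "cached result"}
```

Code that previously called `.append()` on such a field would need to build a new tuple instead, as in `extended` above.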
Python 3.9 reaches end-of-life October 2025. Python 3.10 adoption is now mainstream - major frameworks (strawberry, Django 5.0, FastAPI) require it.

This enables modern Python features:
- Dataclasses with `kw_only`
- Union types with `|` syntax (PEP 604)
- isinstance() with union types directly
- match statements for pattern matching
Modifies the AST visitor to use copy-on-write semantics when applying edits. Instead of mutating nodes in place, the visitor now creates new node instances with the edited values. This prepares for frozen AST nodes while maintaining backwards compatibility. The visitor accumulates edits and applies them by constructing new nodes, enabling the transition to immutable data structures.
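The copy-on-write idea can be sketched as follows. This is not the visitor implementation from this PR: `Field` and `rename` are hypothetical, and the sketch only shows how an edit can produce new nodes along the changed path while sharing untouched subtrees.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Field:
    name: str
    selections: tuple["Field", ...] = ()


def rename(node: Field, old: str, new: str) -> Field:
    """Return a tree with fields named `old` renamed, copying only changed paths."""
    children = tuple(rename(child, old, new) for child in node.selections)
    changes = {}
    if any(a is not b for a, b in zip(children, node.selections)):
        changes["selections"] = children
    if node.name == old:
        changes["name"] = new
    # No edits at or below this node: hand back the original object.
    return replace(node, **changes) if changes else node


tree = Field("query", (Field("user", (Field("id"),)), Field("posts")))
edited = rename(tree, "id", "uuid")

print(edited.selections[0].selections[0].name)
print(edited.selections[1] is tree.selections[1])
```

The original `tree` is left unchanged, and the `posts` subtree is the same object in both trees, so an edit costs work proportional to the changed path rather than the whole document.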
- Update test_visitor.py to properly type-annotate the visitor class attribute and add assertion before using selection_set
- Update test_schema_parser.py to use more precise types that match GraphQL spec:
  - NonNullTypeNode's inner type can only be NamedTypeNode or ListTypeNode
  - Schema definitions use ConstDirectiveNode, not DirectiveNode
  - Default values use ConstValueNode, not ValueNode
  - OperationTypeDefinition's type_ must be NamedTypeNode
- Handle token.value being str | None by using `or ""` fallback
- Fix parse_variable_definition to not use `and` for side effects
- Use properly typed variable in parse_nullability_assertion

These fixes prepare for stricter type checking in frozen dataclasses.
…r parsing)

Refactor all of the GraphQL AST Nodes to use Python dataclasses to provide better type safety, immutability guarantees, and cleaner code while maintaining backwards compatibility with existing APIs.

Benchmark comparison (837f604 base vs dataclasses):

| Benchmark                       | Base   | Dataclass  | Change          |
|---------------------------------|--------|------------|-----------------|
| test_parse_large_query          | 33,108 | 18,689     | 44% faster      |
| test_parse_kitchen_sink         | 577    | 361        | 37% faster      |
| test_pickle_large_query_decode  | 18,520 | 5,549      | 70% faster (3x) |
| test_pickle_large_query_encode  | 9,038  | 4,117      | 54% faster (2x) |
| test_pickle_large_query_round   | 28,048 | 10,206     | 64% faster (3x) |
| test_many_repeated_fields       | 15,918 | 14,909     | 6% faster       |
| test_execute_basic_sync         | 310    | 292        | 6% faster       |
| test_execute_basic_async        | 354    | 338        | 5% faster       |
CodSpeed Performance Report

Merging this PR will degrade performance by 23.43%.
[Figures: baseline performance (before PR #250) and performance (after PR #256); mean benchmark time in ms]
Thanks @corydolphin. These are in fact impressive gains in parsing and deserialization of large queries. CodSpeed complains about a degradation in the "validate introspection query" benchmark. I think this is just because of the high variability in this test. Can you double-check that it's not caused by this PR? Generally, I think we should split some of the benchmarks, and I have created issue #257 as a reminder. I will merge now and fix the linting error later.