repr: fast-path timestamp/date/time parsing #35161
Draft
def- wants to merge 3 commits into MaterializeInc:main from
Add direct byte-level ISO 8601 parsers that bypass the general-purpose
ParsedDateTime tokenizer+pattern-matcher for standard format inputs.
The fast path eliminates VecDeque heap allocation, character-by-character
tokenization, token pattern matching, and 264-byte ParsedDateTime struct
initialization, replacing it all with 7 byte comparisons + 6 two-digit
parses + chrono construction (~25ns total vs ~350ns).
Benchmark results (per-value):
- Timestamp: 319ns → 14ns (22x faster)
- Timestamptz: 413ns → 23ns (18x faster)
- Date: 191ns → 10ns (19x faster)
- Time: 152ns → 8ns (20x faster)
Batch 10k values:
- Timestamps: 3.55ms → 198µs (18x faster)
- Timestamptz: 4.42ms → 257µs (17x faster)
- Dates: 2.06ms → 122µs (17x faster)
- Times: 1.89ms → 126µs (15x faster)
Non-standard formats fall back to the general parser with <1ns overhead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
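For readers who want a feel for the fast path, here is a rough, std-only sketch of the idea described above: check the length and separator bytes, parse fixed-position digit pairs, and return `None` so the caller can fall back to the general parser. The names `RawTimestamp`, `two_digits`, and `fast_parse_timestamp` are hypothetical and are not taken from the PR. The sketch also stops at raw fields rather than constructing chrono values, and it skips fractional seconds and time zones.

```rust
/// Raw fields of a timestamp in the fixed layout `YYYY-MM-DD HH:MM:SS`.
/// Hypothetical type for this sketch only.
#[derive(Debug, PartialEq, Eq)]
pub struct RawTimestamp {
    pub year: u16,
    pub month: u8,
    pub day: u8,
    pub hour: u8,
    pub minute: u8,
    pub second: u8,
}

/// Parses two ASCII digits; returns `None` if either byte is not `0..=9`.
fn two_digits(hi: u8, lo: u8) -> Option<u8> {
    // wrapping_sub maps non-digit bytes to values >= 10, so one compare suffices.
    let h = hi.wrapping_sub(b'0');
    let l = lo.wrapping_sub(b'0');
    if h < 10 && l < 10 {
        Some(h * 10 + l)
    } else {
        None
    }
}

/// Returns `None` for anything that is not exactly the fixed layout, which
/// is the signal to fall back to the slower general-purpose parser.
pub fn fast_parse_timestamp(s: &str) -> Option<RawTimestamp> {
    let b = s.as_bytes();
    if b.len() != 19 {
        return None;
    }
    // Separator bytes at fixed offsets; both ' ' and 'T' are accepted here.
    let seps_ok = b[4] == b'-'
        && b[7] == b'-'
        && (b[10] == b' ' || b[10] == b'T')
        && b[13] == b':'
        && b[16] == b':';
    if !seps_ok {
        return None;
    }
    let year = u16::from(two_digits(b[0], b[1])?) * 100 + u16::from(two_digits(b[2], b[3])?);
    let month = two_digits(b[5], b[6])?;
    let day = two_digits(b[8], b[9])?;
    let hour = two_digits(b[11], b[12])?;
    let minute = two_digits(b[14], b[15])?;
    let second = two_digits(b[17], b[18])?;
    // Coarse range checks only; calendar validity (e.g. Feb 30) is left to
    // whatever date constructor consumes these fields.
    let in_range = (1..=12).contains(&month)
        && (1..=31).contains(&day)
        && hour < 24
        && minute < 60
        && second < 60;
    if !in_range {
        return None;
    }
    Some(RawTimestamp { year, month, day, hour, minute, second })
}
```

Because the function never allocates and bails out at the first mismatch, an input such as `March 15, 2024` is rejected by the length check alone before any digit parsing happens.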
…utput
Replace chrono's `ts.format("%m-%d %H:%M:%S")` with direct field extraction
and stack-buffer byte writing in all four date/time formatting functions:
format_timestamp, format_timestamptz, format_date, and format_time.
The old code parsed the format string on every call, creating a DelayedFormat
struct and iterating through format directives. The new code extracts fields
directly (month(), day(), hour(), etc.) and writes them via simple arithmetic
into a stack-allocated byte buffer, emitting the result with a single
write_str() call.
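As an illustration of the stack-buffer approach described above, a minimal sketch might look like the following. It assumes the fields have already been extracted as plain integers; `put2` and `write_timestamp` are hypothetical names and not the PR's actual functions, which operate on chrono types.

```rust
use std::fmt::{self, Write};

/// Writes `v` (assumed to be below 100) as two ASCII digits at `buf[at..at + 2]`.
fn put2(buf: &mut [u8], at: usize, v: u32) {
    buf[at] = b'0' + (v / 10) as u8;
    buf[at + 1] = b'0' + (v % 10) as u8;
}

/// Formats `YYYY-MM-DD HH:MM:SS` into a stack buffer and emits it with one
/// `write_str` call. Out-of-range fields are reported as an error in this
/// sketch; a real implementation would need a path for wider years.
pub fn write_timestamp<W: Write>(
    out: &mut W,
    year: u32,
    month: u32,
    day: u32,
    hour: u32,
    minute: u32,
    second: u32,
) -> fmt::Result {
    if year > 9999 || [month, day, hour, minute, second].iter().any(|&v| v > 99) {
        return Err(fmt::Error);
    }
    // The template already holds the separators; only digits are overwritten.
    let mut buf = *b"0000-00-00 00:00:00";
    put2(&mut buf, 0, year / 100);
    put2(&mut buf, 2, year % 100);
    put2(&mut buf, 5, month);
    put2(&mut buf, 8, day);
    put2(&mut buf, 11, hour);
    put2(&mut buf, 14, minute);
    put2(&mut buf, 17, second);
    // The buffer contains only ASCII, so this conversion cannot fail in practice.
    let text = std::str::from_utf8(&buf).map_err(|_| fmt::Error)?;
    out.write_str(text)
}
```

Starting from a template that already contains the separators keeps the hot path to a handful of divisions and byte stores, with no format-string interpretation at runtime.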
Also optimizes format_nanos_to_micros to build ".NNNNNN" in a stack buffer
and trim trailing zeros, avoiding write! with dynamic width.
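The fractional-seconds trick can be sketched the same way. The function below is a hypothetical stand-in named `write_micros_fraction`, not the PR's `format_nanos_to_micros`. It assumes nanoseconds are truncated to microseconds, that nothing is written when the fraction is zero, and that values of one second or more are rejected.

```rust
use std::fmt::{self, Write};

/// Writes `.NNNNNN` with trailing zeros trimmed, or nothing if the
/// microsecond part is zero. Truncation (not rounding) is an assumption.
pub fn write_micros_fraction<W: Write>(out: &mut W, nanos: u32) -> fmt::Result {
    let mut micros = nanos / 1_000;
    if micros == 0 {
        return Ok(());
    }
    if micros > 999_999 {
        // A full second or more does not fit in six digits.
        return Err(fmt::Error);
    }
    let mut buf = *b".000000";
    // Fill the six digit slots from the right.
    for i in (1..=6).rev() {
        buf[i] = b'0' + (micros % 10) as u8;
        micros /= 10;
    }
    // Trim trailing zeros; at least one digit is non-zero here, so the loop
    // always stops before reaching the leading '.'.
    let mut end = buf.len();
    while buf[end - 1] == b'0' {
        end -= 1;
    }
    let text = std::str::from_utf8(&buf[..end]).map_err(|_| fmt::Error)?;
    out.write_str(text)
}
```

Under these assumptions, 500,000,000 ns is written as `.5` and 1,000 ns as `.000001`.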
Benchmarks (criterion, cargo bench -p mz-repr --bench timestamp_format):
- Per-value: 10-22x faster (e.g. timestamp: 160ns → 8.5ns)
- Batch 10k timestamps: 1,680µs → 115µs (14.6x faster)
- Batch 10k timestamptz: 1,790µs → 141µs (12.7x faster)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Taken and cleaned up from #35076