
pgwire: Various optimizations #35168

Draft
def- wants to merge 9 commits into MaterializeInc:main from def-:pr-pgwire-opt

Conversation


@def- def- commented Feb 23, 2026

Taken and cleaned up from #35076

NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
ParallelIngestion                   | wallclock       |           1.901 |           2.763 |   s    |    10%     |      no       | better: 31.2% faster

@github-actions

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

@def- force-pushed the pr-pgwire-opt branch 11 times, most recently from be9acf3 to e175bc9 on February 23, 2026 at 14:55
def- and others added 9 commits February 23, 2026 16:42
…llocation

Encode pgwire DataRow messages directly from RowRef datums to BytesMut,
eliminating the intermediate Vec<Option<Value>> that values_from_row()
allocated per row. For string/bytes columns, this also eliminates the
per-value clone (to_owned/to_vec) since we now write directly from the
borrowed Datum slice.

Adds encode_data_row_direct() which handles the full DataRow wire format
(type byte, message length, field count, per-field length + text encoding)
in a single pass. A new BackendMessage::PreEncoded variant lets the
protocol layer send pre-encoded wire bytes through the codec without
re-framing.

Benchmark: 1.02-1.24x faster per row, 1.06-1.30x in batch (10k rows).
String-heavy rows see the largest improvement (1.30x) due to eliminated
string cloning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
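The DataRow layout that the commit describes (type byte, message length, field count, per-field length + text) can be sketched in one pass like this. This is an illustrative, dependency-free sketch, not Materialize's encode_data_row_direct(): it uses Vec&lt;u8&gt; in place of BytesMut, and the function name is hypothetical.

```rust
// Hypothetical sketch of the DataRow wire layout written in a single pass.
fn encode_data_row(fields: &[Option<&str>]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.push(b'D'); // DataRow type byte
    let len_pos = buf.len();
    buf.extend_from_slice(&[0; 4]); // placeholder for message length
    buf.extend_from_slice(&(fields.len() as i16).to_be_bytes()); // field count
    for field in fields {
        match field {
            None => buf.extend_from_slice(&(-1i32).to_be_bytes()), // NULL field
            Some(text) => {
                buf.extend_from_slice(&(text.len() as i32).to_be_bytes());
                buf.extend_from_slice(text.as_bytes()); // borrowed, no clone
            }
        }
    }
    // The message length counts everything after the type byte, itself included.
    let msg_len = (buf.len() - len_pos) as i32;
    buf[len_pos..len_pos + 4].copy_from_slice(&msg_len.to_be_bytes());
    buf
}

fn main() {
    let row = encode_data_row(&[Some("42"), None]);
    // 'D' + 4-byte len + 2-byte count + (4 + 2) + 4 = 17 bytes
    assert_eq!(row.len(), 17);
    assert_eq!(row[0], b'D');
}
```

Writing string fields straight from the borrowed slice is what removes the per-value to_owned() the commit message mentions.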
…ction

Adds encode_datum_binary_direct() that encodes common datum types directly
into BytesMut for binary pgwire format, skipping the intermediate Value enum
construction. This eliminates heap allocations for string/bytes columns and
avoids ToSql trait dispatch for simple types (bool, ints, floats, timestamps,
dates, times, intervals, uuids).

Benchmark results (5-column rows):
- Integers: 137ns → 124ns (1.10x)
- Strings: 164ns → 115ns (1.43x)
- Mixed: 142ns → 115ns (1.23x)
- Timestamps: 185ns → 175ns (1.06x)
- Batch strings 10k: 1.74ms → 1.15ms (1.52x)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
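A minimal sketch of the binary-format idea, assuming hypothetical helper names (not the actual encode_datum_binary_direct): each field is a 4-byte big-endian length followed by the payload, written without constructing a Value or going through ToSql dispatch.

```rust
// Illustrative pgwire binary-format field writers; names are assumptions.
fn put_binary_int4(buf: &mut Vec<u8>, v: i32) {
    buf.extend_from_slice(&4i32.to_be_bytes()); // payload length
    buf.extend_from_slice(&v.to_be_bytes());    // big-endian int4
}

fn put_binary_bool(buf: &mut Vec<u8>, v: bool) {
    buf.extend_from_slice(&1i32.to_be_bytes()); // payload length
    buf.push(v as u8);                          // 1 or 0
}

fn put_binary_text(buf: &mut Vec<u8>, s: &str) {
    // Written straight from the borrowed slice: no intermediate String.
    buf.extend_from_slice(&(s.len() as i32).to_be_bytes());
    buf.extend_from_slice(s.as_bytes());
}

fn main() {
    let mut buf = Vec::new();
    put_binary_int4(&mut buf, 7);
    put_binary_bool(&mut buf, true);
    assert_eq!(buf, [0, 0, 0, 4, 0, 0, 0, 7, 0, 0, 0, 1, 1]);
}
```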
…ue>> allocation

Replace values_from_row() in encode_copy_row_text() and encode_copy_row_csv()
with direct datum iteration using encode_datum_text_direct(). This eliminates
the per-row Vec<Option<Value>> allocation and avoids cloning string/bytes data.

Also adds an escape fast-path for COPY text: when the encoded value contains no
special characters (common for numbers, dates, timestamps), copy the entire
buffer with extend_from_slice instead of per-byte scanning.

Fixes a JSONB encoding bug in encode_datum_text_direct() where the JSONB match
arm was positioned after Datum::String/True/False/Numeric arms, causing JSONB
values to be formatted as plain scalars instead of JSON.

Benchmark: 1.64-2.17x faster COPY text encoding (integers 1.64x, strings 2.17x,
mixed 1.89x in batch).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
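The escape fast-path described above can be sketched as follows, under assumed helper names: if the encoded value contains no byte that COPY text must escape, copy it wholesale with extend_from_slice; otherwise fall back to per-byte escaping.

```rust
// Sketch of a COPY text escape fast-path; not the actual Materialize code.
fn push_copy_text_value(out: &mut Vec<u8>, value: &[u8]) {
    let needs_escape = value
        .iter()
        .any(|&b| matches!(b, b'\\' | b'\t' | b'\n' | b'\r'));
    if !needs_escape {
        // Common case for numbers, dates, timestamps: one bulk copy.
        out.extend_from_slice(value);
        return;
    }
    // Slow path: escape special bytes one at a time.
    for &b in value {
        match b {
            b'\\' => out.extend_from_slice(b"\\\\"),
            b'\t' => out.extend_from_slice(b"\\t"),
            b'\n' => out.extend_from_slice(b"\\n"),
            b'\r' => out.extend_from_slice(b"\\r"),
            _ => out.push(b),
        }
    }
}

fn main() {
    let mut out = Vec::new();
    push_copy_text_value(&mut out, b"12345"); // fast path
    push_copy_text_value(&mut out, b"a\tb");  // slow path
    assert_eq!(&out, b"12345a\\tb");
}
```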
Replace byte-by-byte scanning in CopyTextFormatParser::consume_raw_value()
with memchr::memchr3() which uses SSE2/AVX2 SIMD to find delimiter bytes
at 16-32 bytes per CPU cycle instead of 1 byte per 5-6 condition checks.

The old code checked is_eof(), is_end_of_copy_marker(), is_end_of_line(),
is_column_delimiter(), and peek() for EVERY byte - that's 4-6 bounds checks
and comparisons per normal character. The new code finds the next special
byte (\t, \n, \\) in a single vectorized pass, then handles it.

Benchmark: 3.5x faster for long strings (~80 chars), 1.4x for medium
strings (~17 chars), 1.1x for escaped data. Short integer values show
no improvement since per-value framework overhead dominates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
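The vectorized scan relies on memchr::memchr3 from the memchr crate. This dependency-free sketch shows the same control flow, with a scalar position() search standing in for the SIMD call; the function names are illustrative, not the parser's real API.

```rust
// Scalar stand-in for memchr::memchr3(b'\t', b'\n', b'\\', &buf[start..]).
fn next_special(buf: &[u8], start: usize) -> Option<usize> {
    buf[start..]
        .iter()
        .position(|&b| matches!(b, b'\t' | b'\n' | b'\\'))
        .map(|i| start + i)
}

/// Return the raw value starting at `pos` and the index of the special
/// byte (delimiter, newline, or escape) that ended it, in one scan
/// instead of per-byte is_eof()/is_end_of_line()/... checks.
fn consume_raw_value(buf: &[u8], pos: usize) -> (&[u8], usize) {
    match next_special(buf, pos) {
        Some(end) => (&buf[pos..end], end),
        None => (&buf[pos..], buf.len()),
    }
}

fn main() {
    let line = b"hello world\tnext";
    let (value, end) = consume_raw_value(line, 0);
    assert_eq!(value, b"hello world");
    assert_eq!(line[end], b'\t');
}
```

Swapping next_special for the real memchr3 call is what turns this into the 16-32 bytes-per-cycle SIMD scan the commit describes.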
… 2.1-4.6x speedup

Two optimizations for the COPY FROM text ingestion path:

1. parse_bool: Replace s.trim().to_lowercase() (heap-allocates a String
   per call) with length-dispatched ASCII case-insensitive byte matching
   via eq_ignore_ascii_case. 1.2-1.4x faster per call.

2. COPY FROM text decode: Replace Value::decode_text → into_datum →
   Vec<Datum> → Row::pack pipeline with direct decode_text_into_row into
   RowPacker. Eliminates per-row Vec<Datum> allocation, per-field Value
   construction, and per-text-column String allocation (s.to_owned()).
   2.1x faster for int cols, 4.6x faster for text cols in batch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
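Point 1 above can be sketched as a length-dispatched, allocation-free parse_bool. This is a hedged reconstruction, not the actual Materialize function; the accepted spellings assumed here follow Postgres (t/f, true/false, yes/no, on/off, 1/0).

```rust
// Allocation-free boolean parse: dispatch on trimmed length, then compare
// bytes case-insensitively instead of building a lowercased String.
fn parse_bool(s: &str) -> Option<bool> {
    let s = s.trim();
    match s.len() {
        1 => match s.as_bytes()[0] {
            b't' | b'T' | b'y' | b'Y' | b'1' => Some(true),
            b'f' | b'F' | b'n' | b'N' | b'0' => Some(false),
            _ => None,
        },
        2 => {
            if s.eq_ignore_ascii_case("on") {
                Some(true)
            } else if s.eq_ignore_ascii_case("no") {
                Some(false)
            } else {
                None
            }
        }
        3 => s
            .eq_ignore_ascii_case("yes")
            .then_some(true)
            .or_else(|| s.eq_ignore_ascii_case("off").then_some(false)),
        4 => s.eq_ignore_ascii_case("true").then_some(true),
        5 => s.eq_ignore_ascii_case("false").then_some(false),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse_bool(" TRUE "), Some(true));
    assert_eq!(parse_bool("off"), Some(false));
    assert_eq!(parse_bool("maybe"), None);
}
```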
…llocation

Encode pgwire DataRow messages directly from RowRef datums to BytesMut,
eliminating the intermediate Vec<Option<Value>> that values_from_row()
allocated per row. For string/bytes columns, this also eliminates the
per-value clone (to_owned/to_vec) since we now write directly from the
borrowed Datum slice.

Adds encode_data_row_direct() which handles the full DataRow wire format
(type byte, message length, field count, per-field length + text encoding)
in a single pass. A new BackendMessage::PreEncoded variant lets the
protocol layer send pre-encoded wire bytes through the codec without
re-framing.

Benchmark: 1.1-1.6x faster per row, 1.2-1.4x in batch (10k rows).
String-heavy rows see the largest improvement (1.6x) due to eliminated
string cloning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace `Decimal::to_standard_notation_string()` (which heap-allocates both
a Vec<u8> for coefficient digits and a String for the result) with
`write_numeric_standard_notation()` that writes directly to any fmt::Write
with zero heap allocations.

The new function extracts digits from the internal `coefficient_units()`
representation (a &[u16] slice, no allocation) into a stack-allocated
[u8; 39] buffer, builds the complete formatted output in a stack-allocated
[u8; 80] buffer, and writes it with a single write_str() call.

Updated the two main hot paths:
- `format_numeric()` in strconv.rs (pgwire text encoding)
- `Datum::Numeric` Display impl in scalar.rs

Benchmarks (criterion):
- Batch (10k values to shared buffer): 306µs → 118µs (2.6x faster)
- Per-value write-only: 1.8x-6.0x faster depending on value
- Per-value to new String: 1.0x-2.1x faster

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
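A greatly simplified illustration of the stack-buffer approach: format a value into a fixed byte array and emit it with one write_str(), rather than allocating an intermediate String. The real function walks dec's coefficient_units() and handles exponents and scale; this sketch only formats an i64 and uses an assumed function name.

```rust
use std::fmt::{self, Write};

// Zero-allocation formatting into a stack buffer, flushed with one write_str.
fn write_i64_standard<W: Write>(out: &mut W, mut v: i64) -> fmt::Result {
    let mut buf = [0u8; 20]; // enough for i64::MIN with sign
    let mut i = buf.len();
    let neg = v < 0;
    // Emit digits least-significant first, filling the buffer from the end.
    loop {
        i -= 1;
        buf[i] = b'0' + (v % 10).unsigned_abs() as u8;
        v /= 10;
        if v == 0 {
            break;
        }
    }
    if neg {
        i -= 1;
        buf[i] = b'-';
    }
    // The filled suffix is pure ASCII, so from_utf8 cannot fail.
    out.write_str(std::str::from_utf8(&buf[i..]).unwrap())
}

fn main() {
    let mut s = String::new();
    write_i64_standard(&mut s, -1234).unwrap();
    assert_eq!(s, "-1234");
}
```

Because the sink is any fmt::Write, the same function serves both a Display impl and batch encoding into a shared buffer, which is the shape of the two hot paths listed above.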
Replace `to_standard_notation_string()` (which heap-allocates Vec<u8> +
String per value) with `write_numeric_standard_notation()` that writes
directly to fmt::Write with zero heap allocations.

Uses `coefficient_units()` (&[u16] slice, no alloc) to extract digits
into a stack [u8;82] buffer, builds complete output in a stack [u8;200]
buffer, and writes with single write_str() call.

Benchmark: 2.0-4.3x faster per value, 2.3x faster in batch (10k values).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
