perf: O(1) sort and equality for range arrays #780
Draft
He-Pin wants to merge 2 commits into databricks:master
He-Pin
commented
Apr 15, 2026
       */
      private[sjsonnet] def rangeEquals(other: Arr): Boolean = {
        isRange && other.isRange && _length == other._length &&
        _reversed == other._reversed && _rangeFrom == other._rangeFrom
Contributor
Author
This would be better done with pattern matching after #778.
Motivation: std.sort on a range array (e.g. std.range(1, 1000)) was materializing the entire range into an Eval[] array and performing an O(n log n) sort on already-sorted data. Similarly, comparing two range arrays with std.assertEqual did an O(n) element-by-element comparison even when both ranges describe the same integer sequence.

Modification:
- Val.Arr.asSortedIfKnown: returns the range as-is for forward ranges, or the O(1) reversed equivalent for reversed ranges. Returns null for non-range arrays to fall through to full sort.
- Val.Arr.rangeEquals: O(1) structural equality for two ranges by comparing (rangeFrom, length, reversed) fields directly.
- SetModule.sortArr: checks asSortedIfKnown before materializing.
- Evaluator.equal: adds reference equality short-circuit (x eq y) and rangeEquals fast path before element-by-element comparison.

Result: array_sorts benchmark improved from 8.3ms to 7.7ms (master to PR). No regressions on realistic_2 or other benchmarks.
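To make the two O(1) operations in this commit message concrete, here is a minimal sketch of a range described by three fields. It is not sjsonnet's `Val.Arr`: the class `RangeSketch` and the method names `reverse`, `sortedAscending`, and `toVector` are hypothetical, and the sketch assumes that `from` is the first element in iteration order. The PR's `asSortedIfKnown` additionally returns null for non-range arrays, which this sketch leaves out because it only models ranges.

```scala
// Hypothetical stand-in for a lazily described integer range; not sjsonnet's Val.Arr.
// Assumption: `from` is the first element in iteration order, and `reversed`
// flips the step from +1 to -1.
final class RangeSketch(val from: Int, val length: Int, val reversed: Boolean) {
  private def step: Int = if (reversed) -1 else 1

  /** Element at index i, computed on demand instead of being stored. */
  def apply(i: Int): Int = from + step * i

  /** Materializes the range. Only the slow paths need this. */
  def toVector: Vector[Int] = Vector.tabulate(length)(apply)

  /** O(1): the same elements in the opposite order. */
  def reverse: RangeSketch =
    if (length == 0) this
    else new RangeSketch(apply(length - 1), length, !reversed)

  /** O(1): an ascending view of the same elements, with no copy and no comparison sort. */
  def sortedAscending: RangeSketch = if (reversed) reverse else this

  /** O(1): true only when both descriptions match field for field. */
  def rangeEquals(other: RangeSketch): Boolean =
    from == other.from && length == other.length && reversed == other.reversed
}
```

Under these assumptions, sorting a forward range hands back the same object, and sorting a reversed range yields a forward range whose three fields match the original, so both the reference check and the field comparison can succeed without touching any element.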
After rebasing onto master, the isRange field was not defined in the Arr class. Use pattern matching on the RangeArr subclass instead.
🤖 Generated with [Qoder](https://qoder.com)
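The pattern-matching approach this commit describes could look roughly like the sketch below. The hierarchy (`ArrSketch`, `PlainArr`, a simplified `RangeArr` holding `Int` elements) and the `ArrEquality.equal` helper are assumptions made for illustration; the real sjsonnet classes and `Evaluator.equal` are not reproduced here. The sketch treats a field match as a positive-only shortcut: when the fields differ, it falls through to the element-by-element comparison rather than reporting inequality.

```scala
// Hypothetical, simplified array hierarchy; the real sjsonnet classes differ.
sealed trait ArrSketch {
  def length: Int
  def apply(i: Int): Int
}

/** Eagerly stored elements. */
final class PlainArr(elems: Array[Int]) extends ArrSketch {
  def length: Int = elems.length
  def apply(i: Int): Int = elems(i)
}

/** Lazily described range; the step is -1 when `reversed` is set (an assumption). */
final class RangeArr(val from: Int, val length: Int, val reversed: Boolean) extends ArrSketch {
  def apply(i: Int): Int = if (reversed) from - i else from + i
}

object ArrEquality {
  def equal(x: ArrSketch, y: ArrSketch): Boolean =
    if (x eq y) true // same object: nothing to compare
    else (x, y) match {
      // Identical descriptions are equal in O(1). A mismatch proves nothing
      // (two empty ranges can differ in `from`), so it falls through.
      case (a: RangeArr, b: RangeArr)
          if a.from == b.from && a.length == b.length && a.reversed == b.reversed =>
        true
      case _ =>
        // Slow path: O(n) element-by-element comparison.
        x.length == y.length && (0 until x.length).forall(i => x(i) == y(i))
    }
}
```

Matching on the subclass keeps the range fields out of the base type, so a caller holding a plain array never sees range internals and always takes the slow path.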
21c14e7 to 8fb5228
Summary
O(1) fast paths for sorting and comparing range arrays.

std.sort(std.range(1, N)) previously materialized the full range into an array and performed an O(n log n) sort on already-sorted data. It now returns immediately. std.assertEqual on two equal ranges similarly drops from an O(n) element comparison to O(1).

Changes
Sort fast path for range arrays: sortArr checks Val.Arr.asSortedIfKnown(pos) before materializing. Forward ranges return as-is (already sorted ascending); reversed ranges return the O(1) forward equivalent via the existing reversed() method. This only applies when no keyF is provided; otherwise it falls through to the full sort.

Range equality fast path: Evaluator.equal for two Val.Arr adds two short-circuits before element-by-element comparison:
- Reference equality (x eq y): covers assertEqual(r, sort(r)), where sort returns the same object.
- Structural range equality (rangeEquals): covers assertEqual(range(1,N), sort(reverse(range(1,N)))), where both sides are ranges with the same (rangeFrom, length, reversed).

Both optimizations use semantic methods on Val.Arr (asSortedIfKnown, rangeEquals) that encapsulate range internals; callers never access _isRange or _rangeFrom directly.

Benchmark Results: Scala Native vs jrsonnet (Rust, from source)
Machine: Apple Silicon, macOS. Tool: hyperfine --warmup 5 --min-runs 30 -N.

Reliable benchmarks (>20ms runtime, startup overhead not dominant)

Sub-20ms benchmarks (startup noise significant)

Algorithmic improvement
- std.sort(std.range(1, N))
- std.sort(std.reverse(std.range(1, N)))
- assertEqual(range_a, range_b), same sequence
- assertEqual(x, x), same object

Test plan
- ./mill 'sjsonnet.jvm[3.3.7]'.test - all 141 test suites pass
- ./mill __.reformat - scalafmt passes
- ./mill 'sjsonnet.native[3.3.7]'.nativeLink - Native binary builds