Skip to content

perf: cache sorted visible key names on Val.Obj#784

Merged
stephenamar-db merged 1 commit intodatabricks:masterfrom
He-Pin:perf/sorted-keys-cache
Apr 24, 2026
Merged

perf: cache sorted visible key names on Val.Obj#784
stephenamar-db merged 1 commit intodatabricks:masterfrom
He-Pin:perf/sorted-keys-cache

Conversation

@He-Pin
Copy link
Copy Markdown
Contributor

@He-Pin He-Pin commented Apr 18, 2026

Summary

  • Cache the result of visibleKeyNames.sorted(Util.CodepointStringOrdering) on Val.Obj to avoid re-sorting on every materialization
  • Non-inline objects (>8 fields, super chain, excludedKeys) previously called .sorted() every time they were materialized via std.manifestJson or the default output path
  • The cached field uses @volatile for safe publication (benign race: worst case is a redundant sort)

Changes

  • Val.scala: Add _sortedVisibleKeyNames field and sortedVisibleKeyNames accessor method
  • Materializer.scala: Use obj.sortedVisibleKeyNames instead of obj.visibleKeyNames.sorted(...) in both visitor-based and fused materializer paths
  • ByteRenderer.scala: Use obj.sortedVisibleKeyNames in fused materializer generic object path

Test plan

  • ./mill 'sjsonnet.jvm[3.3.7]'.test — all tests pass
  • ./mill 'sjsonnet.jvm[3.3.7]'.compile — compiles clean
  • Objects with canDirectIterate already use _sortedInlineOrder cache; this extends caching to the generic (non-inline) path

Comment thread sjsonnet/src/sjsonnet/Val.scala Outdated
Motivation:
Non-inline objects (>8 fields, super chains, excludedKeys) call
.sorted(CodepointStringOrdering) on visibleKeyNames at every
materialization. This re-sorts and allocates a new array each time.

Modification:
Add _sortedVisibleKeyNames cache on Val.Obj with lazy initialization.
Replace all 5 call sites (Materializer recursive/stackless, Renderer,
ByteRenderer, Val.Obj.foreachElement) with the cached accessor.

Result:
Objects with stable schemas sort their key names once and reuse the
result on subsequent materializations.
@He-Pin He-Pin force-pushed the perf/sorted-keys-cache branch from d732290 to 5ccab84 Compare April 21, 2026 19:02
@He-Pin He-Pin requested a review from stephenamar-db April 21, 2026 19:02
@stephenamar-db stephenamar-db merged commit fd1b3b1 into databricks:master Apr 24, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants