I was benchmarking base64 performance on large payloads and noticed sjsonnet is significantly slower than jrsonnet — about 6x on a ~4.5MB string with a couple of encode/decode roundtrips.
Dug into it a bit. The bottleneck isn't the base64 codec itself — it's the UTF-16 ↔ UTF-8 conversion that happens on every call. Since Java/Scala strings are UTF-16 internally, every std.base64(str) has to do str.getBytes("UTF-8") to get bytes for the encoder, and every std.base64Decode has to do new String(bytes, "UTF-8") to produce the result. That's two full copies of the data per operation, going through the charset encoder/decoder.
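To make the two copies concrete, here is a minimal Java sketch of that conversion path (not sjsonnet's actual code, just the same JDK calls the post describes): one full pass over the data in `getBytes` on encode, and another in the `String` constructor on decode.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Roundtrip {
    public static void main(String[] args) {
        // ~4.5MB payload, same shape as the repro below.
        String s = "The quick brown fox jumps over the lazy dog. ".repeat(100_000);

        // Encode path: UTF-16 String -> UTF-8 byte[] is full copy #1,
        // then the base64 codec runs over the bytes.
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        String encoded = Base64.getEncoder().encodeToString(utf8);

        // Decode path: base64 -> UTF-8 byte[], then byte[] -> UTF-16 String
        // is full copy #2, going through the charset decoder.
        byte[] decodedBytes = Base64.getDecoder().decode(encoded);
        String decoded = new String(decodedBytes, StandardCharsets.UTF_8);

        System.out.println(decoded.equals(s)); // roundtrip intact
        System.out.println(utf8.length);       // 4,500,000 bytes
    }
}
```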
jrsonnet doesn't have this problem because its strings are UTF-8 natively (custom IStr type backed by [u8]), so base64 can work directly on the string bytes with zero conversion.
For small payloads (a few KB) this doesn't really matter — interpreter overhead dominates. But once you get into the hundreds-of-KB or MB range, the conversion cost adds up fast.
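A rough way to see the scaling claim is to time the charset conversion separately from the codec on the same payload. This is only a crude sketch (no JMH, so JIT warmup and GC noise apply); the point is that `getBytes` is an O(n) pass whose cost grows with the payload, independent of the base64 work:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ConversionCost {
    public static void main(String[] args) {
        String s = "The quick brown fox jumps over the lazy dog. ".repeat(100_000);

        // Time the UTF-16 -> UTF-8 conversion alone, averaged over a few runs.
        byte[] utf8 = null;
        long t0 = System.nanoTime();
        for (int i = 0; i < 10; i++) utf8 = s.getBytes(StandardCharsets.UTF_8);
        long convUs = (System.nanoTime() - t0) / 10 / 1_000;

        // Time the base64 encode alone, on bytes that already exist.
        byte[] enc = null;
        long t1 = System.nanoTime();
        for (int i = 0; i < 10; i++) enc = Base64.getEncoder().encode(utf8);
        long b64Us = (System.nanoTime() - t1) / 10 / 1_000;

        System.out.printf("utf8 conversion: %d us, base64 encode: %d us%n",
                convUs, b64Us);
        System.out.println(enc.length); // 4,500,000 * 4/3 = 6,000,000
    }
}
```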
Repro (requires hyperfine + both tools installed):
// base64_ultra.jsonnet
local s1 = std.repeat("The quick brown fox jumps over the lazy dog. ", 100000);
local e1 = std.base64(s1);
local d1 = std.base64Decode(e1);
local e2 = std.base64(d1);
local d2 = std.base64Decode(e2);
{
  input_len: std.length(s1),
  encoded_len: std.length(e1),
  roundtrip_ok: d2 == s1,
}
hyperfine --warmup 2 \
'sjsonnet base64_ultra.jsonnet' \
'jrsonnet base64_ultra.jsonnet'
On my M4 Max (Scala Native build):
- sjsonnet: ~88ms
- jrsonnet: ~14ms