VNT Part 4: ValuesAsInModelAccumulator and to_samples #1182
base: mhauru/vnt-concretized-slices
Codecov Report

```
@@                Coverage Diff                              @@
##      mhauru/vnt-concretized-slices    #1182       +/-    ##
================================================================
- Coverage       80.16%    36.76%    -43.41%
================================================================
  Files              42        41         -1
  Lines            4356      4464       +108
================================================================
- Hits             3492      1641      -1851
- Misses            864      2823      +1959
================================================================
```

DynamicPPL.jl documentation for PR #1182 is available at:
mhauru left a comment
In the process of using VNT for VAIMAcc, I also had to implement values, length, empty, and isempty for VNT, so they are all bundled in this PR.
This would be ready for review if not for an annoying issue with LKJCholesky, which probably requires some special treatment that I haven't figured out yet. In particular, I think the problem stems from an interaction with MCMCChains, which splits a Cholesky variable into the component elements of its .L field. I'll need to come back to this. Maybe FlexiChains will come in time to save me from having to solve this?
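A quick sketch of what the new methods are meant to do (hedged: `VarNamedTuple()` and `empty` appear in this PR's diff, but the import path and the behaviour on an empty container are assumptions for illustration):

```julia
using DynamicPPL: VarNamedTuple

vnt = VarNamedTuple()          # empty container, as in Base.empty(::VarNamedTuple)
@assert isempty(vnt)           # new isempty for VNT
@assert length(vnt) == 0       # new length for VNT (assumed 0 when empty)
@assert isempty(values(vnt))   # new values for VNT
@assert isempty(empty(vnt))    # empty returns a fresh empty VNT
```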
Benchmarks
Code:
```julia
module VAIMBench

using DynamicPPL, Distributions, Chairmarks
using StableRNGs: StableRNG

include("benchmarks/src/Models.jl")
using .Models: Models

function run()
    rng = StableRNG(23)
    smorgasbord_instance = Models.smorgasbord(randn(rng, 100), randn(rng, 100))
    loop_univariate1k, multivariate1k = begin
        data_1k = randn(rng, 1_000)
        loop = Models.loop_univariate(length(data_1k)) | (; o=data_1k)
        multi = Models.multivariate(length(data_1k)) | (; o=data_1k)
        loop, multi
    end
    loop_univariate10k, multivariate10k = begin
        data_10k = randn(rng, 10_000)
        loop = Models.loop_univariate(length(data_10k)) | (; o=data_10k)
        multi = Models.multivariate(length(data_10k)) | (; o=data_10k)
        loop, multi
    end
    lda_instance = begin
        w = [1, 2, 3, 2, 1, 1]
        d = [1, 1, 1, 2, 2, 2]
        Models.lda(2, d, w)
    end
    models = [
        ("simple_assume_observe", Models.simple_assume_observe(randn(rng))),
        ("smorgasbord", smorgasbord_instance),
        ("loop_univariate1k", loop_univariate1k),
        ("multivariate1k", multivariate1k),
        ("loop_univariate10k", loop_univariate10k),
        ("multivariate10k", multivariate10k),
        ("dynamic", Models.dynamic()),
        ("parent", Models.parent(randn(rng))),
        # ("lda", lda_instance),
    ]

    # Print the time difference of result `r` relative to the reference `ref`,
    # choosing units by the magnitude of the difference so that negative
    # differences also get sensible units.
    function print_diff(r, ref)
        diff = r.time - ref.time
        absdiff = abs(diff)
        units = if absdiff < 1e-6
            "ns"
        elseif absdiff < 1e-3
            "µs"
        else
            "ms"
        end
        diff = if units == "ns"
            round(diff / 1e-9; digits=1)
        elseif units == "µs"
            round(diff / 1e-6; digits=1)
        else
            round(diff / 1e-3; digits=1)
        end
        sign = diff < 0 ? "" : "+"
        return println(" ($(sign)$(diff) $units)")
    end

    for (name, m) in models
        println()
        println(name)
        vi = VarInfo(m)
        ranges = DynamicPPL.get_ranges_and_linked(vi)
        if !(ranges isa Tuple)
            ranges = (ranges,)
        end
        x = vi[:]
        strategy = InitFromParams(DynamicPPL.VectorWithRanges{false}(ranges..., x), nothing)

        # 1) Log-probability accumulators only.
        print("Without VAIMAcc: ")
        oavi = OnlyAccsVarInfo(
            (DynamicPPL.LogPriorAccumulator(), DynamicPPL.LogLikelihoodAccumulator()),
        )
        wo = @b DynamicPPL.init!!($m, $oavi, $strategy)
        display(wo)

        # 2) Log-probability accumulators plus the VAIM accumulator.
        print("With VAIMAcc: ")
        oavi = OnlyAccsVarInfo(
            (
                DynamicPPL.LogPriorAccumulator(),
                DynamicPPL.LogLikelihoodAccumulator(),
                DynamicPPL.ValuesAsInModelAccumulator(false),
            ),
        )
        w = @b DynamicPPL.init!!($m, $oavi, $strategy)
        show(stdout, MIME"text/plain"(), w)
        print_diff(w, wo)

        # 3) The VAIM accumulator only.
        print("Only VAIMAcc: ")
        oavi = OnlyAccsVarInfo((DynamicPPL.ValuesAsInModelAccumulator(false),))
        o = @b DynamicPPL.init!!($m, $oavi, $strategy)
        show(stdout, MIME"text/plain"(), o)
        print_diff(o, wo)
    end
end

run()

end
```

This evaluates each model from our benchmark suite 1) with logprob accumulators only, 2) with logprob and VAIM accumulators, and 3) with the VAIM accumulator only. For 2) and 3) I also print the time difference compared to 1). The evaluations are done using the fancy new machinery that FastLDF brought, i.e. VectorWithRanges.
Results on the current release:
```
simple_assume_observe
Without VAIMAcc: 12.153 ns
With VAIMAcc: 188.554 ns (9 allocs: 384 bytes) (+176.4 ns)
Only VAIMAcc: 186.769 ns (9 allocs: 384 bytes) (+174.6 ns)

smorgasbord
Without VAIMAcc: 5.688 μs (12 allocs: 6.156 KiB)
With VAIMAcc: 77.333 μs (563 allocs: 22.328 KiB) (+71.6 µs)
Only VAIMAcc: 64.000 μs (560 allocs: 21.391 KiB) (+58.3 µs)

loop_univariate1k
Without VAIMAcc: 21.000 μs (8 allocs: 16.172 KiB)
With VAIMAcc: 634.916 μs (7269 allocs: 193.406 KiB) (+613.9 µs)
Only VAIMAcc: 626.167 μs (7269 allocs: 193.406 KiB) (+605.2 µs)

multivariate1k
Without VAIMAcc: 11.250 μs (24 allocs: 80.500 KiB)
With VAIMAcc: 12.541 μs (49 allocs: 89.766 KiB) (+1.3 µs)
Only VAIMAcc: 3.042 μs (36 allocs: 73.359 KiB) (-8208.0 ns)

loop_univariate10k
Without VAIMAcc: 280.500 μs (102 allocs: 194.375 KiB)
With VAIMAcc: 10.346 ms (72680 allocs: 1.999 MiB) (+10.1 ms)
Only VAIMAcc: 10.119 ms (72680 allocs: 1.999 MiB) (+9.8 ms)

multivariate10k
Without VAIMAcc: 110.167 μs (24 allocs: 896.500 KiB)
With VAIMAcc: 111.125 μs (49 allocs: 993.766 KiB) (+958.0 ns)
Only VAIMAcc: 23.167 μs (36 allocs: 801.359 KiB) (-87000.0 ns)

dynamic
Without VAIMAcc: 1.195 μs (14 allocs: 880 bytes)
With VAIMAcc: 2.713 μs (46 allocs: 2.938 KiB) (+1.5 µs)
Only VAIMAcc: 1.901 μs (40 allocs: 2.609 KiB) (+705.5 ns)

parent
Without VAIMAcc: 15.874 ns
With VAIMAcc: 219.672 ns (9 allocs: 384 bytes) (+203.8 ns)
Only VAIMAcc: 205.300 ns (9 allocs: 384 bytes) (+189.4 ns)

Main.VAIMBench
```
Results on this PR:
```
simple_assume_observe
Without VAIMAcc: 10.905 ns
With VAIMAcc: 12.288 ns (+1.4 ns)
Only VAIMAcc: 3.226 ns (-7.7 ns)

smorgasbord
Without VAIMAcc: 5.525 μs (12 allocs: 6.156 KiB)
With VAIMAcc: 8.070 μs (236 allocs: 31.297 KiB) (+2.5 µs)
Only VAIMAcc: 3.381 μs (233 allocs: 27.188 KiB) (-2144.0 ns)

loop_univariate1k
Without VAIMAcc: 9.625 μs (6 allocs: 16.125 KiB)
With VAIMAcc: 248.041 μs (9015 allocs: 372.641 KiB) (+238.4 µs)
Only VAIMAcc: 11.791 μs (1023 allocs: 60.406 KiB) (+2.2 µs)

multivariate1k
Without VAIMAcc: 11.500 μs (24 allocs: 80.500 KiB)
With VAIMAcc: 12.667 μs (41 allocs: 89.453 KiB) (+1.2 µs)
Only VAIMAcc: 3.292 μs (28 allocs: 72.969 KiB) (-8208.0 ns)

loop_univariate10k
Without VAIMAcc: 95.958 μs (6 allocs: 192.125 KiB)
With VAIMAcc: 2.514 ms (90023 allocs: 3.733 MiB) (+2.4 ms)
Only VAIMAcc: 110.959 μs (10031 allocs: 697.781 KiB) (+15.0 µs)

multivariate10k
Without VAIMAcc: 109.292 μs (24 allocs: 896.500 KiB)
With VAIMAcc: 110.584 μs (41 allocs: 993.453 KiB) (+1.3 µs)
Only VAIMAcc: 23.083 μs (28 allocs: 800.969 KiB) (-86209.0 ns)

dynamic
Without VAIMAcc: 1.159 μs (14 allocs: 880 bytes)
With VAIMAcc: 2.109 μs (35 allocs: 2.469 KiB) (+949.6 ns)
Only VAIMAcc: 1.393 μs (29 allocs: 2.141 KiB) (+233.1 ns)

parent
Without VAIMAcc: 10.901 ns
With VAIMAcc: 12.397 ns (+1.5 ns)
Only VAIMAcc: 3.203 ns (-7.7 ns)

Main.VAIMBench
```
The TL;DR is that this improves performance a lot, like 10-100x, for

- small models, and
- models with IndexLenses.

For small models this makes using a VAIMAcc go from being the dominant cost of evaluation to being negligible. For big models with heavy likelihood computations this does nothing, since it only affects overheads.
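To make the "models with IndexLenses" case concrete, this is the kind of model meant (an illustrative sketch along the lines of the loop_univariate benchmark model, not the benchmark suite's actual code):

```julia
using DynamicPPL, Distributions

@model function loop_style(n)
    x = Vector{Float64}(undef, n)
    for i in 1:n
        # Each x[i] is a VarName whose optic is an IndexLens, so the model
        # hits the VNT index-handling path n times per evaluation.
        x[i] ~ Normal()
    end
end
```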
```julia
Base.empty(::VarNamedTuple) = VarNamedTuple()

"""
    empty!!(vnt::VarNamedTuple)
```
I implemented empty!! for VNT with the idea that we could use it in VAIMAcc to save allocations, but I haven't actually started using it yet. Could come back to this at some point as an optimisation.
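The intended pattern, sketched on a bare VNT (only empty!! itself is in this PR; the usual BangBang !! convention of "mutate when possible, otherwise return a new container" is assumed here):

```julia
using DynamicPPL
using DynamicPPL: VarNamedTuple

vnt = VarNamedTuple()
# ... fill vnt during one model evaluation ...
# Before the next evaluation, clear it while keeping any reusable storage:
vnt = DynamicPPL.empty!!(vnt)  # assumed BangBang semantics: may mutate or reallocate
@assert isempty(vnt)
```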
```julia
function to_dict(::Type{T}, vnt::VarNamedTuple) where {T<:AbstractDict{<:VarName}}
    pairs = splat(Pair).(zip(keys(vnt), values(vnt)))
    return T(pairs...)
end
to_dict(vnt::VarNamedTuple) = to_dict(Dict{VarName,Any}, vnt)
```
At one point I thought I would need this function, so I made it, but in the end I didn't use it. I'm inclined to keep it though; I think it'll have a use at some point.
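Usage would look something like this (a sketch against the definition above; the OrderedDict target type and the empty starting VNT are just for illustration):

```julia
using DynamicPPL
using OrderedCollections: OrderedDict

vnt = DynamicPPL.VarNamedTuple()  # imagine this was populated during evaluation
d = DynamicPPL.to_dict(vnt)       # defaults to Dict{VarName,Any}
# Any AbstractDict keyed by VarName should work as the target type:
od = DynamicPPL.to_dict(OrderedDict{VarName,Any}, vnt)
```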
```julia
expected_length = sum(prod ∘ DynamicPPL.varnamesize, keys(vi))
@test length(ps.params) == expected_length
```
Our notion of length has changed: Both [@varname(x[1]), @varname(x[2])] and [@varname(x[1:2])] have length 2.
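Illustratively (the return values below are assumptions, inferred from how the test above applies varnamesize to concretised VarNames):

```julia
using DynamicPPL: @varname, varnamesize

prod(varnamesize(@varname(x[1])))    # 1 element: a single scalar
prod(varnamesize(@varname(x[1:2])))  # 2 elements: a slice covering two scalars
# So [@varname(x[1]), @varname(x[2])] and [@varname(x[1:2])] both sum to 2.
```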
This PR starts using VarNamedTuple for VAIMAcc and to_samples. It also adds new features and fixes to VNT that were needed or useful in the process.