Export srs_diff_est() by vinniott · Pull Request #340 · stan-dev/loo

vinniott · 2026-03-22T11:27:41Z

Fixes #333

Note:

There is no @example yet because I have not yet had time to get fully familiar with the whole package.
I updated NEWS.md as suggested in CONTRIBUTING.md, but I am not sure whether I did that correctly.


codecov-commenter · 2026-03-22T14:46:37Z

Codecov Report

❌ Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.72%. Comparing base (7eafeb8) to head (1cdb60f).
⚠️ Report is 47 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| R/example_log_lik_objects.R | 0.00% | 1 Missing ⚠️ |
| R/print.R | 91.66% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #340      +/-   ##
==========================================
- Coverage   92.78%   92.72%   -0.06%     
==========================================
  Files          31       31              
  Lines        2992     2984       -8     
==========================================
- Hits         2776     2767       -9     
- Misses        216      217       +1


jgabry · 2026-03-23T17:16:37Z

Thank you @vinniott.

> There is no @example yet because I have not yet had time to get fully familiar with the whole package.

@avehtari or @MansMeg, is there any specific example you'd like to use for this in the documentation?

avehtari · 2026-03-23T18:13:27Z

The example should be based on a flexible enough model so that elpd(log_lik_matrix) and loo(log_lik_matrix) differ by more than 1. The current example_loglik_matrix() has too few observations. The example used in the tests for subsampling is a one-parameter model. Should we have a real model, or store another example loglik matrix?

After we have useful loglik matrix, the example code would be something like

# Use posterior predictive density as the fast but biased method for all observations
lpd <- elpd(log_lik_matrix)
sum(lpd$pointwise[,"elpd"])

# Use PSIS-LOO for a subsample of 50 randomly selected observations
N <- ncol(log_lik_matrix)
idx <- sample(1:N, 50)
elpd_loo_sub <- loo(log_lik_matrix[,idx])
# Scale the subsample sum up to all N observations
N / 50 * sum(elpd_loo_sub$pointwise[,"elpd_loo"])

# Use difference estimator to combine fast result and subsampled accurate result
loo:::srs_diff_est(lpd$pointwise[,"elpd"], elpd_loo_sub$pointwise[,"elpd_loo"], idx)

# Comparison to using PSIS-LOO for all observations
loo(log_lik_matrix)

This matches what someone was asking
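
For readers who want to see what the difference estimator computes without a fitted model, here is a minimal base-R sketch on simulated pointwise values. The function name `diff_est_sketch` and the simulated data are hypothetical. The formulas are the textbook difference estimator under simple random sampling without replacement (Cochran, 1977). This is not the package implementation, and `srs_diff_est()` may return additional quantities.

```r
# Hypothetical helper, not the loo implementation: difference estimator of a
# population total under simple random sampling without replacement.
diff_est_sketch <- function(y_approx, y, y_idx) {
  N <- length(y_approx)          # number of observations in the full data
  m <- length(y)                 # subsample size
  e <- y - y_approx[y_idx]       # correction terms on the subsample
  y_hat <- sum(y_approx) + N * mean(e)
  # Variance of the estimate, with finite population correction (1 - m / N)
  v_y_hat <- if (m > 1) N^2 * (1 - m / N) * var(e) / m else NA_real_
  list(y_hat = y_hat, v_y_hat = v_y_hat, m = m, N = N)
}

# Simulated stand-ins for pointwise elpd values (assumed, for illustration only)
set.seed(1)
N <- 1000
elpd_loo_full <- rnorm(N, mean = -1.2, sd = 0.5)        # "accurate" values
lpd_fast <- elpd_loo_full + abs(rnorm(N, 0.02, 0.01))   # optimistic approximation

idx <- sample.int(N, 50)
est <- diff_est_sketch(lpd_fast, elpd_loo_full[idx], idx)

# Compare the fast total, the naively scaled subsample, the difference
# estimate, and the full "accurate" total
c(fast = sum(lpd_fast),
  naive_scaled = N / 50 * sum(elpd_loo_full[idx]),
  difference = est$y_hat,
  full = sum(elpd_loo_full))
```

Because the correction terms usually vary much less than the pointwise values themselves, the difference estimate typically lands closer to the full total than the naively scaled subsample sum.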

jgabry · 2026-03-23T20:50:06Z

> Should we have a real model, or store another example loglik matrix?

Either is fine by me. Also if we're only using it for this example, we could also just generate an example loglik matrix in the example code instead of storing it.

avehtari · 2026-03-24T18:21:21Z

I think the interesting examples can be slow to run. I'll test subsampling with a few interesting real models this week.

avehtari · 2026-03-27T10:38:50Z

This would be a good example, with data from https://archive.ics.uci.edu/ml/datasets/wine+quality

library(dplyr)
library(brms)
options(brms.backend = "cmdstanr")
options(mc.cores = 4)
library(loo)

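# NOTE: root() is not defined in this snippet; it is assumed to be a helper
# that resolves file paths relative to the project root
wine <- read.delim(root("winequality-red", "winequality-red.csv"), sep = ";") |>
  distinct()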

wine_scaled <- as.data.frame(scale(wine))

fitos <- brm(ordered(quality) ~ .,
            family = cumulative("logit"),
            prior = prior(R2D2(mean_R2 = 1/3, prec_R2 = 3)),
            data = wine_scaled,
            seed = 1,
            silent = 2,
            refresh = 0)

log_lik_matrix <- log_lik(fitos)

N <- nrow(wine_scaled)
Nsub <- 100

# posterior log-score
lpd <- elpd(log_lik_matrix)
sum(lpd$pointwise[,"elpd"])

# Use PSIS-LOO for subsample of Nsub randomly selected observations
set.seed(1)
idx <- sample(1:N, Nsub)
elpd_loo_sub <- loo(log_lik_matrix[,idx])
sum(elpd_loo_sub$pointwise[,"elpd_loo"]) / Nsub * N

# Use difference estimator to combine fast result and subsampled accurate result
loo:::srs_diff_est(lpd$pointwise[,"elpd"], elpd_loo_sub$pointwise[,"elpd_loo"], idx)

# Comparison to using PSIS-LOO for all observations
loo(log_lik_matrix)

- p_loo is about 17 here, so the posterior log-score clearly differs from the LOO estimate
- N is 1359, so a subsample of 100 is still only a small part of all observations
- There are no high Pareto-k values to complicate things
- Subsampling with Nsub = 100 gets close to the full result

As compiling and sampling the brms model takes some time, I would store only the log_lik_matrix but show the code for how it is generated. The rest of the code is fast.


vinniott · 2026-04-01T19:07:58Z

Interesting. Do I understand correctly that the log_lik_matrix.rda would be stored in loo/data?
I will try to continue implementing this example towards the end of next week.


avehtari · 2026-04-26T09:49:26Z

Instead of generating log_lik on the fly, we could use the stored wine log_lik which was added in #352. This would make the example run much faster. We need to check whether the wine log_lik was added only for touchstone and whether it is too big for CRAN, @jgabry, @VisruthSK


github-actions · 2026-04-26T11:02:30Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 1cdb60f is merged into master:

✔️loo_function: 1.78s -> 1.78s [-0.54%, +0.29%]
✔️loo_matrix: 1.29s -> 1.29s [-0.45%, +0.5%]
Further explanation regarding interpretation and methodology can be found in the documentation.

vinniott · 2026-04-26T12:44:41Z

The tests are failing so I will give a short summary of what I did.
First, I fetched all upstream updates and added them to this export-srs-diff-est feature branch of the PR.

Second, I renamed generate-example_loglik-array.R to generate-example_loglik-objects.R and also
example_log_lik_array.R to example_log_lik_objects.R.
Here, I added the .example_wine_loglik_matrix to sysdata.rda as you suggested above @jgabry

Third, I added the example by @avehtari to srs_diff_est().

Fourth, I did devtools::document(), devtools::install(), and when I then ran the example of ?srs_diff_est()
it worked on my machine (e.g., screenshot from my RStudio):

So I hope that I am not too far from getting the tests to pass. Do you have an idea why they might fail?

> Instead of generating log_lik on the fly, we could use the stored wine log_lik which was added in #352.

Of course, that is also an option! I just wanted to give the sysdata.rda implementation a try, as I was close to finishing it when you pointed this out. The tests currently do not pass, though.

VisruthSK · 2026-04-27T15:10:43Z

> Instead of generating log_lik on the fly, we could use the stored wine log_lik which was added in #352. This would make the example run much faster. We need to check whether the wine log_lik was added only for touchstone and whether it is too big for CRAN, @jgabry, @VisruthSK

Yes to both. Clocks in at about 40MB, and is currently only in the touchstone directory.
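
The roughly 40MB figure is consistent with a back-of-the-envelope calculation. This sketch assumes the brms defaults of 4 chains with 1000 post-warmup draws each (the `brm()` call earlier in the thread does not override them) and 8 bytes per double:

```r
# Assumed: 4 chains x 1000 post-warmup draws (brms defaults), N = 1359 observations
n_draws <- 4 * 1000
n_obs <- 1359
bytes <- n_draws * n_obs * 8   # one double per draw and observation
bytes / 1e6                    # size in MB (decimal), before any compression
```

Under those assumptions the matrix holds about 43.5 MB of doubles in memory; the size of a compressed `.rda` file on disk may differ.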

VisruthSK · 2026-04-27T15:27:07Z

Hi @vinniott! Thanks for working on this.

> So I hope that I am not too far from getting the tests to pass. Do you have an idea why they might fail?

I think we should figure out how we're storing/using the wine data before we fix the tests.

Apologies if you already know this, but if you click on the failed runs, you can scroll down until you see the R CMD check output. Locally, you can run devtools::check(), which takes a long time but approximates the tests being run here. Here's a link to the main issue in this PR right now: I don't think the wine data is available to the package, so the example fails when trying to find it. You can see that again further down, L216, where R CMD check complains about not finding a binding for .example_wine_loglik_matrix.

> Fourth, I did devtools::document(), devtools::install(), and when I then ran the example of ?srs_diff_est() it worked on my machine

I think the problem here might be that you ran some code locally before that to make .example_wine_loglik_matrix populate in your local environment. There's a way to restart R sessions in RStudio I think, and in the future if you run into a bug like this where things are okay locally but failing for someone else/on a runner, it might help if you try running the tests in a fresh R session.

Hope that slightly clears up why the tests are failing, and how to (hopefully) reproduce those failures locally.
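
One quick way to tell whether an object such as `.example_wine_loglik_matrix` really ships with the installed package, rather than lingering in the workspace, is to look it up in the package namespace and in the global environment separately. A minimal sketch in base R follows. The helper name `where_defined` is hypothetical, and `stats` and `sd` are used only so the snippet runs without loo installed:

```r
# Hypothetical helper: report where a name is bound, without inheriting from
# enclosing environments.
where_defined <- function(name, pkg) {
  c(
    namespace = exists(name, envir = asNamespace(pkg), inherits = FALSE),
    global = exists(name, envir = globalenv(), inherits = FALSE)
  )
}

# A name that is bound in the stats namespace
where_defined("sd", "stats")
# A name that is not bound in the stats namespace
where_defined(".example_wine_loglik_matrix", "stats")
```

With the branch installed, `where_defined(".example_wine_loglik_matrix", "loo")` reporting `namespace = FALSE` would suggest the object is missing from the package's internal data rather than pointing to a problem in the example itself.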

avehtari and others added 15 commits October 17, 2025 19:54

use posterior::gpdfit

9f9e9ad

use posterior::qgeneralized_pareto()g

05c69e1

delete gpd functions

aae6c17

remove gpdfit doc and export

099398b

set up documentation structure

84ee41f

srs_diff_est.Rd matches .R documentation

7d2c817

Merge branch 'master' into export-srs-diff-est

a914fc6

synced with upstream/master

added documentation

816bcf8

as proposed by @avehtari in issue stan-dev#333

added @Seealso at loo_subsample()

e596847

added reference Cochran (1977)

25fddcf

removed oudated @return duplicate

fe5a45e

corrected .R formulas to render in .Rd

dd72938

removed example placeholder

287c039

updated .Rd to match .R

d4dda71

Update NEWS.md

5d476b5

vinniott mentioned this pull request Mar 22, 2026

export srs_diff_est #333

Open

Florence Bockting and others added 7 commits March 27, 2026 13:16

feat: update print.loo to support kfold pareto-k diagnostics

e0a8bcb

tests: add unittest for updated print method and test data

0efb8b1

refactor: create new kfold.print method instead of changing print.loo

5145c66

tests: use expect_snapshot to check table output

9bd9ff0

tests: update test data and snapshot for kfold-print tests

97c2f90

Merge pull request stan-dev#344 from stan-dev/dependabot/github_actio…

23f2117

…ns/codecov/codecov-action-6 Bump codecov/codecov-action from 5 to 6

VisruthSK and others added 9 commits April 16, 2026 09:25

Update script.R

e7f64e9

Merge pull request stan-dev#357 from stan-dev/even-more-touchstone-iters

57bd12a

Further bump up touchstone iterations

Bumped touchstone reqs

d77578b

Latest R and LTS ubuntu

Update script.R

70a04ed

Merge pull request stan-dev#358 from stan-dev/testing-touchstone-comment

df56db4

Bumped touchstone reqs

Merge pull request stan-dev#359 from stan-dev/master

8d83fc2

Update posterior gpdfit branch to get newer touchstone functionality

Merge pull request stan-dev#305 from stan-dev/use-posterior-gpdfit

85db294

use posterior::gpdfit and posterior::qgeneralized_pareto()

added @examples placeholder

3d589f8

added generation of wine example

886d92f

vinniott added 15 commits April 26, 2026 11:51

rename example-generate_loglik_objects.R

ec6a3eb

set up documentation structure

7771f1f

srs_diff_est.Rd matches .R documentation

1f26a6f

added documentation

1eb7ac1

as proposed by @avehtari in issue stan-dev#333

added @Seealso at loo_subsample()

9a6fe92

added reference Cochran (1977)

24332a0

removed oudated @return duplicate

41b0ab0

corrected .R formulas to render in .Rd

57b0615

removed example placeholder

79e838f

updated .Rd to match .R

71b45cc

Update NEWS.md

bb8fc8b

added @examples placeholder

6ce1d15

added generation of wine example

19bb48d

rename example-generate_loglik_objects.R

607f938

Merge branch 'export-srs-diff-est' of https://github.com/vinniott/loo …

9dc166f

…into export-srs-diff-est

general loglik example file with wine export for srs_diff_est example

1cdb60f

vinniott wants to merge 76 commits into stan-dev:master from vinniott:export-srs-diff-est


5 participants
