Skip to content

Export srs_diff_est()#340

Open
vinniott wants to merge 76 commits intostan-dev:masterfrom
vinniott:export-srs-diff-est
Open

Export srs_diff_est()#340
vinniott wants to merge 76 commits intostan-dev:masterfrom
vinniott:export-srs-diff-est

Conversation

@vinniott
Copy link
Copy Markdown
Contributor

Fixes #333

Note:

  • There is no @example yet because I did not have time yet to get fully familiar with the whole package.
  • I updated NEWS.md as suggested in CONTRIBUTING.md but I am not sure whether I did that correctly.

@vinniott vinniott mentioned this pull request Mar 22, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 22, 2026

Codecov Report

❌ Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.72%. Comparing base (7eafeb8) to head (1cdb60f).
⚠️ Report is 47 commits behind head on master.

Files with missing lines Patch % Lines
R/example_log_lik_objects.R 0.00% 1 Missing ⚠️
R/print.R 91.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #340      +/-   ##
==========================================
- Coverage   92.78%   92.72%   -0.06%     
==========================================
  Files          31       31              
  Lines        2992     2984       -8     
==========================================
- Hits         2776     2767       -9     
- Misses        216      217       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jgabry
Copy link
Copy Markdown
Member

jgabry commented Mar 23, 2026

Thank you @vinniott.

There is no @example yet because I did not have time yet to get fully familiar with the whole package.

@avehtari or @MansMeg, is there any specific example you'd like to use for this in the documentation?

@avehtari
Copy link
Copy Markdown
Member

The example should be based on flexible enough model so that elpd(log_lik_matrix) and loo(log_lik_matrix) differ more than by 1. The current example_loglik_matrix() has too few observations. The example used in tests for subsampling is one parameter model. Should we have a real model, or store another example loglik matrix?

After we have useful loglik matrix, the example code would be something like

# Use posterior predictive density as the fast but biased method for all observations
lpd <- elpd(log_lik_matrix)
sum(lpd$pointwise[,"elpd"])

# Use PSIS-LOO for subsample of 50 randomly selected observations
idx <- sample(1:N, 50)
elpd_loo_sub <- loo(log_lik_matrix[,idx])
20 * sum(elpd_loo_sub$pointwise[,"elpd_loo"])

# Use difference estimator to combine fast result and subsampled accurate result
loo:::srs_diff_est(lpd$pointwise[,"elpd"], elpd_loo_sub$pointwise[,"elpd_loo"], idx)

# Comparison to using PSIS-LOO for all observations
loo(log_lik_matrix)

This matches what someone was asking

@jgabry
Copy link
Copy Markdown
Member

jgabry commented Mar 23, 2026

Should we have a real model, or store another example loglik matrix?

Either is fine by me. Also if we're only using it for this example, we could also just generate an example loglik matrix in the example code instead of storing it.

@avehtari
Copy link
Copy Markdown
Member

I think the interesting examples can be slow to run. I'll test subsampling with few interesting real models this week

@avehtari
Copy link
Copy Markdown
Member

avehtari commented Mar 27, 2026

Thus would be a good example with data from https://archive.ics.uci.edu/ml/datasets/wine+quality

library(dplyr)
library(brms)
options(brms.backend = "cmdstanr")
options(mc.cores = 4)
library(loo)

wine <- read.delim(root("winequality-red", "winequality-red.csv"), sep = ";") |>
  distinct()

wine_scaled <- as.data.frame(scale(wine))

fitos <- brm(ordered(quality) ~ .,
            family = cumulative("logit"),
            prior = prior(R2D2(mean_R2 = 1/3, prec_R2 = 3)),
            data = wine_scaled,
            seed = 1,
            silent = 2,
            refresh = 0)

log_lik_matrix <- log_lik(fitos)

N <- nrow(wine_scaled)
Nsub <- 100

# posterior log-score
lpd <- elpd(log_lik_matrix)
sum(lpd$pointwise[,"elpd"])

# Use PSIS-LOO for subsample of Nsub randomly selected observations
set.seed(1)
idx <- sample(1:N, Nsub)
elpd_loo_sub <- loo(log_lik_matrix[,idx])
sum(elpd_loo_sub$pointwise[,"elpd_loo"]) / Nsub * N

# Use difference estimator to combine fast result and subsampled accurate result
loo:::srs_diff_est(lpd$pointwise[,"elpd"], elpd_loo_sub$pointwise[,"elpd_loo"], idx)

# Comparison to using PSIS-LOO for all observations
loo(log_lik_matrix)
  • p_loo is here about 17 and thus posterior log-score is clearly different
  • N is 1359, so that a subsample of 100 is still only small part of all observations
  • No high Pareto-k values to complicate things
  • Subsampling with Nsub gets close to the full result

As compiling and sampling the brms model takes some time, I would store only the log_lik_matrix but show the code for how it is generated. The rest of code is fast

Florence Bockting and others added 7 commits March 27, 2026 13:16
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 5 to 6.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v5...v6)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
…ns/codecov/codecov-action-6

Bump codecov/codecov-action from 5 to 6
@vinniott
Copy link
Copy Markdown
Contributor Author

vinniott commented Apr 1, 2026

Interesting, do I understand it correctly that the log_lik_matrix.rda would be stored in loo/data ?
I will try to continue trying to implement this example towards the end of next week.

@avehtari
Copy link
Copy Markdown
Member

Instead of generating log_lik on the fly, we could use the stored wine log_lik which was added in #352. This would make the example to run much faster. Need to check whether wine log_lik was added only for touchstone and was it too big for CRAN, @jgabry , @VisruthSK

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 26, 2026

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 1cdb60f is merged into master:

  • ✔️loo_function: 1.78s -> 1.78s [-0.54%, +0.29%]
  • ✔️loo_matrix: 1.29s -> 1.29s [-0.45%, +0.5%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@vinniott
Copy link
Copy Markdown
Contributor Author

The tests are failing so I will give a short summary of what I did.
First, I fetched all upstream updates and added them to this export-srs-diff-est feature branch of the PR.

Second, I renamed generate-example_loglik-array.R to generate-example_loglik-objects.R and also
example_log_lik_array.R to example_log_lik_objects.R.
Here, I added the .example_wine_loglik_matrix to sysdata.rda as you suggested above @jgabry

Third, I added the example by @avehtari to srs_diff_est().

Fourth, I did devtools::document(), devtools::install(), and when I then ran the example of ?srs_diff_est()
it worked on my machine (e.g., screenshot from my RStudio):
Screenshot 2026-04-26 at 14 38 40

So I hope that I shouldnt be too far from getting the tests to pass. Do you have an idea why they might fail?

Instead of generating log_lik on the fly, we could use the stored wine log_lik which was added in #352.

Of course also an option! I just wanted give the sysdata.rda implementation a try as I was close to finishing it when you pointed this out. Though, the tests currently do not pass, of course.

@VisruthSK
Copy link
Copy Markdown
Member

Instead of generating log_lik on the fly, we could use the stored wine log_lik which was added in #352. This would make the example to run much faster. Need to check whether wine log_lik was added only for touchstone and was it too big for CRAN, @jgabry , @VisruthSK

Yes to both. Clocks in at about 40MB, and is currently only in the touchstone directory.

@VisruthSK
Copy link
Copy Markdown
Member

Hi @vinniott! Thanks for working on this.

So I hope that I shouldnt be too far from getting the tests to pass. Do you have an idea why they might fail?

I think we should figure out how we're storing/using the wine data before we fix the tests.

Apologize if you already know this, but if you click on the failed runs, you'll be able to scroll down till you see R CMD check output. Locally, you can run devtools::check() which will take a long time, but will approximate the same tests that are being run here. Here's a link to the main issue in this PR right now--I don't think the wine data is available for the package, so the example is failing when trying to find it. You can see that again further down, L216, when R CMD check is complaining about not finding a binding for .example_wine_loglik_matrix.

Fourth, I did devtools::document(), devtools::install(), and when I then ran the example of ?srs_diff_est() it worked on my machine

I think the problem here might be that you ran some code locally before that to make .example_wine_loglik_matrix populate in your local environment. There's a way to restart R sessions in RStudio I think, and in the future if you run into a bug like this where things are okay locally but failing for someone else/on a runner, it might help if you try running the tests in a fresh R session.

Hope that slightly clears up why the tests are failing, and how to (hopefully) reproduce those failures locally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

export srs_diff_est

5 participants