Conversation
There was a problem hiding this comment.
-
I think this is a good idea. It's possible that this doesn't actually break any packages that depend on loo. I can check by running reverse dependency checks once the errors in the code are fixed (see individual comments).
-
The tests need to be updated, but we can wait to do that until we check that it doesn't break any reverse dependencies
-
We should put some information about this in main
loo_comparedocumentation, not only the print method -
I actually think it would be good to print these probabilities by default. What was the reason you turned it off by default? (Actually right now the printing will error by default, but once fixed it will be off by default). I don't think the printing should affect backwards compatibility (or am I missing something?). It would be adding more columns could potentially affect something, right? Or was there a different reason you turned off printing the probabilities by default?
Co-authored-by: Jonah Gabry <jgabry@gmail.com>
|
Ok, let's change them to be printed by default. I had the column name as p_worse so that it will also tell direction. Then I considered to switching to pnorm as it tells that it's based on the normal approximation, and then did get betweeb |
|
Setting |
|
I would love to get rid of |
| diag_pnorm[khat_diff > 0.5] <- paste0("khat_diff > 0.5") | ||
| } | ||
| rownames(comp) <- rnms | ||
| comp <- cbind(data.frame(elpd_diff = elpd_diff, se_diff = se_diff, |
There was a problem hiding this comment.
This also changes the returned object to a data frame when it used to be a matrix. That could potentially cause reverse dependency issues, but we'll see when I run the checks. This is a necessary change, though, since we're mixing numeric columns with text columns for the diagnostic.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #300 +/- ##
==========================================
- Coverage 92.75% 92.70% -0.06%
==========================================
Files 31 31
Lines 2983 3029 +46
==========================================
+ Hits 2767 2808 +41
- Misses 216 221 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I fixed the tests so this is passing now. I will run reverse dependency checks over the weekend hopefully. I just put one question for you @avehtari in the review comments about how to label to diagnostic. |
|
Starting a comment to keep track of reverse dependency issues. The reverse dependency checks are still running (they will take a while). I will update here as I discover new issues.
|
|
|
We discussed this PR in our group meeting and I think at this point we should drop |
|
I have updated a case study using this PR https://users.aalto.fi/~ave/casestudies/LOO_uncertainty/loo_uncertainty_diag.html There are now more explanations on how to interpret the diagnostics |
|
After a discussion with @avehtari, we agreed on the following next steps for this PR:
|
|
I added additional sections in the |
avehtari
left a comment
There was a problem hiding this comment.
Excellent loo-glossary text. Just one comment about Bengio and Grandvalet reference
|
@florence-bockting @avehtari Thank you in advance for doing the new round of reverse dependency checks. |
|
Re-running reverse dependency checks. In the following a list of packages where issues occur:
@avehtari: I have now worked out fixes for all affected libraries which are currently still local. When I understand the process correctly, then we can now merge this PR into master and I will make the PRs in the respective packages public. Or some other thoughts on this? |
|
@VisruthSK The touchstone build process fails in this repo. Can you please have a look at this? Thank you. |
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if b54fa3e is merged into master:
|
Description
The corresponding issue #299
In this PR we are making changes to the output structure returned by
loo_compare(). The main changes are a transition from a matrix to a data frame and the addition of several new columns. Here is an example of the updated output structure:More context on the motivation behind these changes can be found in the updated case study "Uncertainty in Bayesian LOO-CV Model Comparison".
TODOs