Add geometric distribution (lpmf, cdf, lcdf, lccdf, rng)#3299
Add geometric distribution (lpmf, cdf, lcdf, lccdf, rng)#3299yasumorishima wants to merge 13 commits intostan-dev:developfrom
Conversation
|
Hi @yasumorishima ! Is there a reason we can't just have direct calls to the negative binomial function for all of these? So these would just be calls to neg_binomial that fix |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
f822a16 to
84319ae
Compare
|
Hi @SteveBronder, thanks for the review! Good point — I've refactored all five functions (lpmf, cdf, lcdf, lccdf, rng) The only special case is θ = 1, where β diverges. Each function handles this I also added autodiff test HPPs for lpmf, lcdf, and lccdf — all 1,412 tests Let me know if you'd like any further changes! |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
Replace neg_binomial_* delegation with a full autodiff implementation per PR stan-dev#3299 review. All four functions compute analytic partials on Eigen arrays directly, with explicit boundary handling for theta=0, theta=1, n=0, n=INT_MAX. Tests: 180/180 autodiff + 8/8 prim pass.
|
Thanks for the detailed review @SteveBronder! Went with option 2 — all four functions now use Boundary handling: short-circuit Tests: 180/180 autodiff distribution tests (var/vv/ffv) + 8/8 prim unit tests pass locally. Happy to iterate. |
Address CodeRabbit review on PR stan-dev#3299: with check_bounded(theta, 0.0, 1.0) (inclusive), passing theta = 0 produced -inf logp but the autodiff path still evaluated inv(theta_arr), poisoning the partial with +inf. Replace the inclusive bound with check_positive_finite + check_less_or_ equal in lpmf/cdf/lcdf/lccdf to match the documented domain (0, 1] and the existing geometric_rng pattern. With the tightened bound the theta == 0 short-circuits in cdf/lcdf become unreachable; remove them. Add an EXPECT_THROW unit test covering theta = 0 across all four distribution functions to lock in the new behavior.
|
Addressed @coderabbitai's review on In commit 31dcbd6:
Local results: prim unit test 9/9 PASS (incl. the new throw test), all 12 generated autodiff distribution test binaries PASS. |
|
Round 2 of CodeRabbit findings + an alignment bug Opus surfaced during self-review (commit
Tests (
prim 11/11 PASS, autodiff distribution lccdf v/vv/ffv 90/180/90 PASS. |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
Implements the geometric distribution as requested in stan-dev#3098. The geometric distribution models the number of failures before the first success, with PMF: P(n|theta) = theta * (1-theta)^n where n in {0, 1, 2, ...} and theta in (0, 1]. This uses the number-of-failures parameterization, consistent with the geometric distribution being a special case of the negative binomial distribution with r=1 (neg_binomial(1, theta/(1-theta))). Verified: geometric_lpmf(n, theta) == neg_binomial_lpmf(n, 1, theta/(1-theta)) for all tested values. Includes: - geometric_lpmf: log probability mass function with autodiff - geometric_cdf: cumulative distribution function with autodiff - geometric_lcdf: log CDF with autodiff - geometric_lccdf: log complementary CDF with autodiff - geometric_rng: random number generation via inverse CDF method - Unit tests for all functions including distribution checks - CDF test fixture for gradient verification Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Guard gradient computations against theta=0 division by zero in geometric_lpmf, geometric_cdf, and geometric_lcdf - Fix include ordering in prob.hpp (gaussian before geometric) - Fix signed/unsigned comparison in geometric_rng loop index Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Refactor lpmf, cdf, lcdf, lccdf, and rng to delegate to the corresponding neg_binomial functions with alpha=1 and beta=theta/(1-theta), as suggested in review. Handle theta=1 (where beta diverges) with early returns. Add autodiff test HPPs for lpmf, lcdf, and lccdf.
Replace neg_binomial_* delegation with a full autodiff implementation per PR stan-dev#3299 review. All four functions compute analytic partials on Eigen arrays directly, with explicit boundary handling for theta=0, theta=1, n=0, n=INT_MAX. Tests: 180/180 autodiff + 8/8 prim pass.
Address CodeRabbit review on PR stan-dev#3299: with check_bounded(theta, 0.0, 1.0) (inclusive), passing theta = 0 produced -inf logp but the autodiff path still evaluated inv(theta_arr), poisoning the partial with +inf. Replace the inclusive bound with check_positive_finite + check_less_or_ equal in lpmf/cdf/lcdf/lccdf to match the documented domain (0, 1] and the existing geometric_rng pattern. With the tightened bound the theta == 0 short-circuits in cdf/lcdf become unreachable; remove them. Add an EXPECT_THROW unit test covering theta = 0 across all four distribution functions to lock in the new behavior.
geometric_lccdf: - Drop the any(n_arr < 0) early return that silently dropped contributions from positive n_i in mixed-sign vectors. P(N > n_i) = 1 for n_i < 0 is the multiplicative identity (log 0), not an absorbing element, so the term has to remain in the sum. Per-element select(n < 0, 0, ...) keeps both value and partials paths correct. - Replace the any(theta == 1.0) blanket guard with a scalar_seq_view per-element loop, so theta=1 only forces -inf when paired with n_i >= 0 at the same index. Previously theta=[0.5, 1.0], n=[2, -1] returned -inf instead of 3*log(0.5). geometric_lpmf: - Add the is_stan_scalar_v<T_prob> guard around partials<0> assignment so a scalar theta paired with a vector n collapses to a scalar partial via sum(...), matching cdf/lcdf/lccdf. Tests: lccdf_vectorized_mixed_sign and lccdf_vectorized_theta_one_alignment lock in both contract fixes.
97cb3ef to
fa62f3b
Compare
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
| // log_q = log((1 - theta)^(n + 1)) = (n + 1) * log1m(theta) | ||
| // log P_i = log(1 - q_i) = log1m_exp(log_q_i) | ||
| // For theta = 1: log_q = -inf, log1m_exp(-inf) = log(1 - 0) = 0 | ||
| // (correct: P = 1 when success is certain). |
There was a problem hiding this comment.
For all the files here, it is nice to have docs for the gradients, but we normally put these as latex in the doxygen doc
Summary
Implements the geometric distribution as requested in #3098.
geometric_lpmf(n, θ) == neg_binomial_lpmf(n, 1, θ/(1−θ))— verified numericallyFiles added/modified
stan/math/prim/prob/geometric_lpmf.hpp— log PMF with autodiff gradientsstan/math/prim/prob/geometric_cdf.hpp— CDF with autodiff gradientsstan/math/prim/prob/geometric_lcdf.hpp— log CDF with autodiff gradientsstan/math/prim/prob/geometric_lccdf.hpp— log CCDF with autodiff gradientsstan/math/prim/prob/geometric_rng.hpp— RNG via inverse CDF methodstan/math/prim/prob.hpp— include additions (alphabetical order)test/unit/math/prim/prob/geometric_test.cpp— unit tests (8 tests)test/prob/geometric/geometric_cdf_test.hpp— CDF gradient test fixtureTests
errorCheck— validates parameter constraints via test rigdistributionCheck— RNG samples match PMF (chi-squared test)error_check— domain errors for invalid θlpmf_values— hand-calculated PMF values including θ=1 edge casecdf_values— hand-calculated CDF valuescdf_below_support— returns 0 / -inf for n < 0lccdf_values— hand-calculated CCDF valuesrng_theta_one— θ=1 always returns 0Notes
Closes #3098
🤖 Generated with Claude Code