PyBNF's prediction-centering convention — whether the deterministic model prediction is interpreted as a noise distribution's mean or median (CONTEXT.md "Prediction Centering"; ADR-0011 location axis, ADR-0024 native surface) — is currently inconsistent across families and ambiguous by default. There is a genuine tension to settle as policy before more capability lands (#419):
PEtab v2 hardcodes the median for every noise model. The exporter-first / importer dogfood (ADR-0023/0025/0026) needs a clear, defensible median story.
Backward compatibility pulls toward the legacy interpretations (the original chi_sq is mean-on-linear; lognormal_var was median-on-log; etc.).
The current state defaults some families to mean and others to median — a confusing mess that we should resolve deliberately rather than per-PR.
This issue is for discussion of the go-forward convention (and how to keep legacy configs working). It deliberately does not prescribe the answer; it gathers the evidence and the options. #419 (implement mean and median for every family) is the capability; this issue is the policy that decides the defaults and the config surface — #419's default/surface choices should gate on the outcome here.
Current state (the inconsistency, precisely)
Legacy objfuncs (each hardcodes its centering):
| objfunc | noise model | centering |
| --- | --- | --- |
| `chi_sq`, `chi_sq_dynamic` | `Gaussian()` = (LINEAR) | mean |
| `lognormal` | `Gaussian(LOG10, MEDIAN)` | median |
| `laplace` | `Laplace()` = (LINEAR) | median |
| `neg_bin`, `neg_bin_dynamic` | `NegBinomial()` | mean (prediction is the mean) |
Native noise_model family tokens (_NOISE_FAMILIES) and class defaults:
| token / class | default centering |
| --- | --- |
| `normal` / `gaussian`, `Gaussian.__init__` | mean |
| `lognormal` (Gaussian on LOG10) | median |
| `laplace`, `Laplace.__init__` | median |
| `neg_bin`, `NegBinomial` | mean |
Global noise_location key (ADR-0024): optional, default unset → falls through to each family's class default (i.e. inherits the inconsistency).
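The fall-through described above can be sketched in a few lines. This is only an illustration of the current behavior as tabulated in this issue: the dict and the function name `effective_location` are hypothetical, not PyBNF's actual code, and it assumes that a set `noise_location` applies to every family.

```python
from typing import Optional

# Per-family class defaults, copied from the table above.
# Hypothetical structure for illustration; not PyBNF's real data model.
FAMILY_DEFAULT_LOCATION = {
    "normal": "mean",
    "gaussian": "mean",
    "lognormal": "median",
    "laplace": "median",
    "neg_bin": "mean",
}


def effective_location(family: str, noise_location: Optional[str] = None) -> str:
    """Return the centering in effect for `family`.

    `noise_location` stands in for the optional global key; None means unset.
    Assumption: when set, the global key wins over the family default.
    """
    if noise_location is not None:
        if noise_location not in ("mean", "median"):
            raise ValueError(f"unknown noise_location: {noise_location!r}")
        return noise_location
    # Unset: fall through to the family's class default,
    # which is where the inconsistency is inherited.
    return FAMILY_DEFAULT_LOCATION[family]


if __name__ == "__main__":
    for fam in FAMILY_DEFAULT_LOCATION:
        print(fam, "->", effective_location(fam))
```

With the key unset, two configs that differ only in family silently get different centerings, which is the ambiguity this issue is about.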
The code comments (location.py, _NOISE_LOCATIONS in objective.py) state median is the default "consistent with PEtab v2", but Gaussian/chi_sq actually default to mean — docs and code disagree on what "the default" is.
On a linear scale the choice is invisible (mean = median for these symmetric families), so the inconsistency is latent today and only surfaces on log scales (lognormal, a future log-Laplace) and for neg_bin (asymmetric count family).
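To make the size of the log-scale effect concrete: if log10(y) is Gaussian with location mu and standard deviation sigma, the median of y is 10^mu, while the mean is 10^mu · exp((sigma · ln 10)^2 / 2). The sketch below just evaluates these standard lognormal identities; the function names are made up for illustration and nothing here is PyBNF API.

```python
import math


def log10_normal_median(mu: float) -> float:
    """Median of y when log10(y) ~ Normal(mu, sigma^2); independent of sigma."""
    return 10.0 ** mu


def log10_normal_mean(mu: float, sigma: float) -> float:
    """Mean of y when log10(y) ~ Normal(mu, sigma^2).

    ln(y) ~ Normal(mu * ln10, (sigma * ln10)^2), so
    E[y] = exp(mu * ln10 + (sigma * ln10)^2 / 2).
    """
    return 10.0 ** mu * math.exp((sigma * math.log(10.0)) ** 2 / 2.0)


if __name__ == "__main__":
    mu = 1.0  # prediction of 10 on the linear scale
    for sigma in (0.0, 0.1, 0.2, 0.5):
        ratio = log10_normal_mean(mu, sigma) / log10_normal_median(mu)
        print(f"sigma={sigma}: mean/median = {ratio:.4f}")
```

The ratio depends only on sigma, so reading the same prediction as a mean rather than a median rescales it by a sigma-dependent factor. With sigma = 0.2 in log10 units that factor is already about 1.11.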
Why it matters
PEtab v2 interop: a round-trip export/import (the dogfood goal) must agree with PEtab's median convention, or silently shift the likelihood.
Legacy reproducibility: users porting old .conf files expect their fits unchanged; flipping a default changes results on log/count models.
Clarity: a single, documented convention (plus an explicit escape hatch) replaces "which family am I, and what does it happen to default to?"
Options to weigh (for discussion)
A. Median-everywhere default (PEtab v2-aligned) + explicit location = mean opt-in. Cleanest forward story; changes legacy behavior wherever mean ≠ median (log-scale Gaussian if anyone used a mean-centered log model; neg_bin).
B. Freeze the current per-family defaults, require nothing, just document. Zero behavior change; preserves the inconsistency permanently.
C. A config-level convention switch (e.g. centering_convention = petab | legacy, or piggyback on a broader petab_compat mode) that selects the default family-by-family: new PEtab v2 configs get median-everywhere, legacy configs are byte-identical to today. This is the "easily tell new-vs-old behavior" idea — it isolates the breaking change behind an explicit opt-in and lets a config self-declare its era.
D. No implicit default where it matters: make location mandatory-explicit for any (family × scale) where mean ≠ median (all log scales, neg_bin), so ambiguity can never resolve silently; keep the implicit default only where it's a provable no-op (linear symmetric families).
(These are not mutually exclusive — e.g. C + D: a convention switch and an explicit-required rule for the genuinely ambiguous cases.)
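For discussion purposes, here is one way the C + D combination could look as a resolver. Everything in it is hypothetical: the function `resolve_location`, the `require_explicit` flag, and the scale strings are invented for this sketch. Only the `petab | legacy` values and the per-family legacy defaults come from this issue.

```python
from typing import Optional

# Legacy defaults per (family, scale), mirroring the tables in this issue.
LEGACY_DEFAULTS = {
    ("gaussian", "linear"): "mean",
    ("gaussian", "log10"): "median",  # the `lognormal` token
    ("laplace", "linear"): "median",
    ("neg_bin", "linear"): "mean",
}


def is_ambiguous(family: str, scale: str) -> bool:
    """True where mean != median: any log scale, or the count family."""
    return scale != "linear" or family == "neg_bin"


def resolve_location(
    family: str,
    scale: str,
    explicit: Optional[str] = None,
    convention: str = "legacy",      # option C: "petab" | "legacy"
    require_explicit: bool = False,  # option D
) -> str:
    if explicit is not None:
        if explicit not in ("mean", "median"):
            raise ValueError(f"unknown location: {explicit!r}")
        return explicit
    if require_explicit and is_ambiguous(family, scale):
        raise ValueError(
            f"location must be set explicitly for {family} on {scale} scale"
        )
    if convention == "petab":
        return "median"  # median everywhere
    if convention == "legacy":
        return LEGACY_DEFAULTS[(family, scale)]
    raise ValueError(f"unknown convention: {convention!r}")


if __name__ == "__main__":
    print(resolve_location("neg_bin", "linear"))                      # legacy default
    print(resolve_location("neg_bin", "linear", convention="petab"))  # PEtab default
```

Under this shape, a legacy config that never mentions the convention resolves exactly as it does today, and the explicit-required rule only fires for the cases where the choice changes the likelihood.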
Backward-compat analysis (what actually changes)
Linear-scale Gaussian/Laplace (chi_sq, laplace, sos-adjacent): nothing changes under any option (symmetric).
Log-scale Gaussian (lognormal): already median; stays median under A/C. Only a (currently non-existent) mean-centered log model would move.
neg_bin: defaults to mean today; option A would flip it to median (a real change to the count likelihood). This is the concrete decision flagged in #419.
A .conf era switch (C) makes all of the above opt-in, so no existing file changes unless it declares PEtab-v2 mode.
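To illustrate why neg_bin is the case where centering matters even on a linear scale, the sketch below compares the mean and median of a negative binomial distribution. It assumes a mean/dispersion parameterization (mean m, dispersion r, success probability p = r / (r + m)); that parameterization and the function names are assumptions made for this example, not a statement about how NegBinomial is implemented.

```python
import math


def nb_pmf(k: int, mean: float, r: float) -> float:
    """Negative binomial pmf with the given mean (> 0) and dispersion r (> 0)."""
    p = r / (r + mean)
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def nb_median(mean: float, r: float) -> int:
    """Smallest k whose cumulative probability reaches 0.5."""
    cdf = 0.0
    k = 0
    while True:
        cdf += nb_pmf(k, mean, r)
        if cdf >= 0.5:
            return k
        k += 1


if __name__ == "__main__":
    mean, r = 10.0, 2.0
    print("mean:", mean, "median:", nb_median(mean, r))
```

For a right-skewed count distribution like this one the median sits below the mean, so reinterpreting the same model prediction as the median instead of the mean changes the likelihood.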
Acceptance / outcome
A written decision (ADR) that fixes: (1) the go-forward default per (family × scale), (2) the backward-compat mechanism (and whether .conf should carry an explicit era/convention marker), (3) the doc/code reconciliation so "the default" means one thing. #419 then implements the capability under that convention.
Relevant ADRs: 0011 (location axis), 0024 (native location surface + global noise_location + the "median default" intent), 0021 (per-observable noise), 0023/0025/0026 (PEtab v2 interop). Related: #419 (capability), and the per-observable noise work.