EmpiricalSubstitutionModel: normalize empirical frequencies #75

Merged: alexeid merged 1 commit into master from empirical-frequencies-normalize on May 13, 2026

@alexeid alexeid commented May 7, 2026

Summary

  • Empirical amino-acid models (FLU, LG, WAG, etc.) report stationary frequencies rounded to a fixed number of decimal places, so the values typically sum to ~0.9999999 instead of exactly 1.0.
  • Simplex.isValid() uses a 1e-10 tolerance; the rounding error is ~1e-7, so every empirical model fails initialisation when its frequencies are wrapped into a SimplexParam.
  • Fix: normalize before constructing the SimplexParam. No-op when the values already sum to within 1e-10 of 1.0.
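
The fix described in the bullets above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code merged in this PR: the class `FrequencyNormalizer`, the methods `normalize` and `sum`, and the constant `TOLERANCE` are hypothetical names, and the only detail taken from the description is the 1e-10 tolerance and the no-op behaviour for values that are already valid.

```java
import java.util.Arrays;

public class FrequencyNormalizer {
    // Assumed to mirror the 1e-10 tolerance that the description attributes to Simplex.isValid().
    static final double TOLERANCE = 1e-10;

    /** Sums the entries of a vector. */
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /**
     * Returns a copy of freqs rescaled to sum to 1.0.
     * If the input already sums to within TOLERANCE of 1.0, the copy is returned unchanged.
     */
    static double[] normalize(double[] freqs) {
        double total = sum(freqs);
        if (!(total > 0.0) || Double.isInfinite(total)) {
            throw new IllegalArgumentException("Frequencies must have a positive, finite sum");
        }
        double[] result = freqs.clone();
        if (Math.abs(total - 1.0) <= TOLERANCE) {
            return result; // already a valid simplex: no-op
        }
        for (int i = 0; i < result.length; i++) {
            result[i] /= total;
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative values only: rounded frequencies that sum to 0.9999999.
        double[] rounded = {0.2499999, 0.25, 0.25, 0.25};
        double[] exact = {0.25, 0.25, 0.25, 0.25};

        System.out.println("rounded sum before: " + sum(rounded));
        System.out.println("rounded sum after:  " + sum(normalize(rounded)));
        System.out.println("exact unchanged:    " + Arrays.equals(exact, normalize(exact)));
    }
}
```

Returning a copy rather than rescaling in place keeps the published constants untouched, which seems the safer choice if the same array backs more than one model instance.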

This surfaced while migrating obama, where all 16 OBAMA_<aa> empirical models fail initialisation for this reason.

Test plan

  • Run an XML that loads one of the OBAMA empirical models (e.g. examples/testOBAMA.xml on the obama beast3-migration branch) and confirm it parses past EmpiricalSubstitutionModel.getEmpericalFrequencieValues().
  • Confirm exact-simplex frequencies (e.g. [0.25, 0.25, 0.25, 0.25]) are unchanged.

Empirical amino-acid models published in the literature (e.g. FLU,
LG, WAG) report stationary frequencies rounded to a fixed number of
decimal places, so the values do not sum to exactly 1.0 (typical
delta ~1e-7). The Simplex isValid() check uses a 1e-10 tolerance,
which rejects them as invalid.

Normalize the values before constructing the SimplexParam so the
empirical frequencies pass Simplex validation. The normalization is
a no-op when the values already sum to within 1e-10 of 1.0.

Surfaced while migrating obama where every OBAMA_<aa> empirical model
hits this on initialisation.
@alexeid alexeid merged commit 3af021b into master May 13, 2026
1 check passed