Add KEEP model with frequency-aware regularization and MIMIC-IV ablation study #982

Open
jasonccfok wants to merge 1 commit into sunlabuiuc:master from jasonccfok:feature/keep-model-ablation

Contributor: Fok Chun Chung (ccfok2@illinois.edu)

Contribution Type: Model (Reproducibility + Extension)

Paper:
Ahmed Elhussein, Paul Meddeb, Abigail Newbury, Jeanne Mirone,
Martin Stoll, Gamze Gursoy.
"KEEP: Integrating Medical Ontologies with Clinical Data for Robust Code Embeddings."
Proceedings of Machine Learning Research (PMLR), vol. 287, pp. 1–19, 2025.
https://arxiv.org/abs/2510.05049

Description:
This PR implements KEEP (Knowledge-Preserving and Empirically Refined
Embedding Process) as described in the paper cited above.

KEEP integrates ontology-derived embeddings with empirical
co-occurrence learning to produce robust medical code embeddings
without task-specific end-to-end training.

This implementation includes:

  • Lightweight co-occurrence-based embedding pretraining
  • Ontology-inspired regularization
  • A frequency-aware regularization extension
    (λ_i = λ / sqrt(freq_i + 1)) to improve rare-code robustness
  • A supervised readmission prediction head
  • Full compatibility with the PyHealth Trainer API
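
The frequency-aware weighting can be sketched as follows. This is a minimal, dependency-free illustration, not the code in pyhealth/models/keep.py: only the formula λ_i = λ / sqrt(freq_i + 1) comes from the description above. The function names, the dictionary layout, and the squared-distance penalty toward ontology-derived vectors are assumptions made for the example.

```python
import math


def frequency_aware_lambdas(base_lambda, code_freqs):
    """Per-code strength lambda_i = base_lambda / sqrt(freq_i + 1).

    `code_freqs` maps each medical code to its training-set count
    (hypothetical layout chosen for this sketch).
    """
    return {
        code: base_lambda / math.sqrt(freq + 1)
        for code, freq in code_freqs.items()
    }


def regularization_penalty(embeddings, ontology_embeddings, lambdas):
    """Sum over codes of lambda_i * ||e_i - o_i||^2.

    ASSUMPTION: the penalty is a squared L2 distance between the learned
    vector e_i and an ontology-derived vector o_i; the PR text does not
    spell out the exact form.
    """
    total = 0.0
    for code, lam in lambdas.items():
        learned = embeddings[code]
        prior = ontology_embeddings[code]
        total += lam * sum((e - o) ** 2 for e, o in zip(learned, prior))
    return total


# Toy usage with made-up codes and counts.
freqs = {"rare_code": 0, "mid_code": 3, "common_code": 99}
lambdas = frequency_aware_lambdas(1.0, freqs)
```

With this formula a code that never appears (freq = 0) keeps the full λ, while a code seen 99 times gets λ/10. Under the assumed penalty, rare codes therefore stay closer to their ontology-derived vectors and frequent codes are freer to follow the co-occurrence signal.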

An ablation study is provided in:
examples/mimic4_readmission_keep.py

The ablation evaluates:

  • No regularization (λ=0)
  • Standard KEEP regularization
  • Frequency-aware regularization (proposed extension)
  • Two embedding dimensions (64, 128)
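
One way to enumerate these settings is a small configuration grid. The sketch below assumes the three regularization settings are fully crossed with the two embedding dimensions, which the description does not state explicitly. The mode names, the dictionary keys, and the default base_lambda are hypothetical and may differ from examples/mimic4_readmission_keep.py.

```python
from itertools import product

# Hypothetical labels for the three regularization settings.
REG_MODES = ("none", "standard", "frequency_aware")
EMBEDDING_DIMS = (64, 128)


def build_ablation_grid(base_lambda=0.1):
    """Return one config dict per (regularization mode, embedding dim) pair.

    ASSUMPTION: base_lambda=0.1 is a placeholder value, not taken from the PR.
    """
    configs = []
    for mode, dim in product(REG_MODES, EMBEDDING_DIMS):
        configs.append(
            {
                "reg_mode": mode,
                "embedding_dim": dim,
                # "none" corresponds to the lambda = 0 baseline.
                "lambda": 0.0 if mode == "none" else base_lambda,
            }
        )
    return configs


for cfg in build_ablation_grid():
    print(cfg)
```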

Metrics reported:
AUROC, AUPRC, F1, Accuracy
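
For reference, the four metrics can be computed from binary labels and predicted probabilities as in the dependency-free sketch below. It is illustrative only and is not the evaluation code used in this PR. The function names and the 0.5 decision threshold are assumptions, AUPRC is approximated by average precision, and tied scores are not handled specially in that approximation.

```python
def accuracy_and_f1(y_true, y_prob, threshold=0.5):
    """Accuracy and F1 after thresholding probabilities (threshold assumed)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def auroc(y_true, y_prob):
    """Probability that a random positive outranks a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """Mean of precision@rank taken at the rank of each positive example."""
    order = sorted(range(len(y_true)), key=lambda i: y_prob[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```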

Comprehensive unit tests using synthetic data are included
to verify:

  • Model instantiation
  • Forward pass correctness
  • Output shape validity
  • Gradient computation

Files to Review:

  • pyhealth/models/keep.py
  • tests/core/test_keep.py
  • examples/mimic4_readmission_keep.py
  • docs/api/models/pyhealth.models.keep.rst
