
feat(models): Add L1 regularization support to LogisticRegression#960

Open
vtewari2 wants to merge 1 commit into sunlabuiuc:master from
vtewari2:pr/uiuccs598dlh/logistic-regression/l1-regularization

Conversation

@vtewari2

Summary

Adds an optional l1_lambda parameter to LogisticRegression that appends
a sparsity-inducing L1 penalty on the final linear layer's weights to the
training loss:

loss = BCE(logits, y_true) + l1_lambda * ‖fc.weight‖₁

The default is l1_lambda=0.0, making this fully backward-compatible:
existing code that instantiates LogisticRegression without the parameter
is unaffected.

Motivation

The current LogisticRegression model supports unregularised logistic
regression only. Many clinical prediction tasks — particularly those with
high-dimensional, sparse binary feature spaces — benefit from L1
regularisation to produce interpretable, sparse weight vectors. This is the
standard approach in clinical ML literature (e.g. LASSO logistic regression)
and was specifically used in:

Boag et al. "Racial Disparities and Mistrust in End-of-Life Care."
MLHC 2018. arXiv:1808.03827

Parameter equivalence to scikit-learn

For users migrating from scikit-learn's LogisticRegression(penalty='l1', C=C):

l1_lambda = 1 / (C × n_train)

Example: C=0.1 on a dataset of 38,000 training samples → l1_lambda ≈ 2.6e-4
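The conversion above can be expressed as a small helper. This is illustrative only; sklearn_c_to_l1_lambda is not part of this PR:

```python
def sklearn_c_to_l1_lambda(C: float, n_train: int) -> float:
    """Map sklearn's inverse regularization strength C to l1_lambda.

    sklearn's L1 objective is ||w||_1 + C * sum_i BCE_i; dividing by
    C * n_train gives mean BCE + l1_lambda * ||w||_1 with
    l1_lambda = 1 / (C * n_train).
    """
    return 1.0 / (C * n_train)

print(sklearn_c_to_l1_lambda(0.1, 38_000))  # ≈ 2.63e-4
```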

Changes

  • pyhealth/models/logistic_regression.py
    • Add l1_lambda: float = 0.0 to __init__
    • In forward(): add l1_lambda * self.fc.weight.abs().sum() to loss
      when l1_lambda > 0
    • Updated docstring with usage example and sklearn equivalence
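The mechanics of the change can be sketched with a toy module. TinyLogReg below is an illustrative stand-in, not the actual PyHealth class; only the penalty logic mirrors the PR:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLogReg(nn.Module):
    """Toy stand-in for the modified LogisticRegression (sketch only)."""

    def __init__(self, in_dim: int, l1_lambda: float = 0.0):
        super().__init__()
        self.fc = nn.Linear(in_dim, 1)
        self.l1_lambda = l1_lambda

    def forward(self, x, y_true):
        logits = self.fc(x).squeeze(-1)
        loss = F.binary_cross_entropy_with_logits(logits, y_true)
        if self.l1_lambda > 0:
            # Penalize the final layer's weights only (not the bias),
            # matching the change described above.
            loss = loss + self.l1_lambda * self.fc.weight.abs().sum()
        return loss

torch.manual_seed(0)
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,)).float()
base_loss = TinyLogReg(4)(x, y)  # l1_lambda=0 → plain BCE
```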

Usage

from pyhealth.models import LogisticRegression

model = LogisticRegression(
    dataset=sample_dataset,
    embedding_dim=128,
    l1_lambda=2.6e-4,   # equivalent to sklearn C=0.1 with n_train=38,000
)

Testing

- Existing LogisticRegression tests pass unchanged (l1_lambda defaults to 0)
- With l1_lambda > 0, the loss increases by the penalty term and gradients
flow correctly through fc.weight (verified via loss.backward())
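The gradient check can be reproduced in a few lines of PyTorch. This is an illustrative re-creation, not the repository's test code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
fc = nn.Linear(4, 1)
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,)).float()
l1_lambda = 1e-2

def loss_fn(with_l1: bool) -> torch.Tensor:
    loss = F.binary_cross_entropy_with_logits(fc(x).squeeze(-1), y)
    if with_l1:
        loss = loss + l1_lambda * fc.weight.abs().sum()
    return loss

grad_plain = torch.autograd.grad(loss_fn(False), fc.weight)[0]
grad_l1 = torch.autograd.grad(loss_fn(True), fc.weight)[0]
# The penalty adds l1_lambda * sign(w) to the weight gradient
# (no weight is exactly zero at random init).
assert torch.allclose(grad_l1 - grad_plain, l1_lambda * fc.weight.sign())
```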

Related PRs

This is PR 1 of 3 in a series implementing the Boag et al. 2018 mistrust
pipeline in PyHealth.
PRs #2 and #3 depend on this change.

Add optional l1_lambda parameter (default 0.0, fully backward-compatible)
that appends a sparsity-inducing L1 penalty on the final linear layer's
weights to the BCE loss during forward():

    loss = BCE(logits, y_true) + l1_lambda * ||fc.weight||_1

This is equivalent to scikit-learn LogisticRegression(penalty='l1', C=C)
with l1_lambda = 1 / (C * n_train), and reproduces the regularisation used
in Boag et al. 2018 "Racial Disparities and Mistrust in End-of-Life Care"
(MLHC 2018, arXiv:1808.03827) to train interpersonal-feature mistrust
classifiers on MIMIC-III.

Co-Authored-By: Varun Tewari <vtewari2@illinois.edu>
