feat(tasks): add medical mistrust tasks for MIMIC-III (Boag et al. 2018) by vtewari2 · Pull Request #961 · sunlabuiuc/PyHealth

vtewari2 · 2026-04-11T15:10:19Z

Description:

Summary

Adds two binary classification tasks and a helper function that implement
the computational mistrust proxies from:

Boag et al. "Racial Disparities and Mistrust in End-of-Life Care."
MLHC 2018. arXiv:1808.03827

Both tasks extract interpersonal interaction features from CHARTEVENTS
(structured, ~168 binary features covering agitation scales, restraints,
education readiness, family communication, pain assessments, etc.) and derive
binary labels from free-text NOTEEVENTS, one admission at a time.

Background

The paper identifies medical mistrust — a historically grounded institutional
skepticism prevalent in minority communities — as a primary driver of racial
disparities in aggressive end-of-life care. It quantifies mistrust through
three algorithmic proxies; this PR implements the two supervised classifiers:

Proxy	Label source	Signal
Noncompliance	`"noncompliant"` substring in any note	Active refusal of care
Autopsy consent	Consent/decline keywords in notes	Post-mortem distrust of care quality

Both use the same interpersonal_features input — a deduplicated sequence
of normalised CHARTEVENTS feature-key strings — compatible with
LogisticRegression (and any other PyHealth sequence model).

New: `pyhealth/tasks/mistrust_mimic3.py`

`build_interpersonal_itemids(d_items_path)`

Helper that reads D_ITEMS.csv.gz and returns {itemid: label} for all
CHARTEVENTS items whose label matches the ~40 interpersonal keywords from
the paper's trust.ipynb. Produces ~168 matched ITEMIDs on MIMIC-III v1.4.

from pyhealth.tasks import build_interpersonal_itemids                                                 

itemid_to_label = build_interpersonal_itemids("/path/to/D_ITEMS.csv.gz")                               
# {720: 'Ventilator Mode', 228096: 'Riker-SAS Scale', ...}  ~168 entries             
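
The keyword matching described above can be sketched roughly as follows. This is a minimal illustration, not the helper's actual implementation: `match_interpersonal_itemids` and `EXAMPLE_KEYWORDS` are hypothetical names, the keyword list is a made-up subset (the real ~40 keywords come from trust.ipynb), and case-insensitive substring matching is an assumption. Only the D_ITEMS column names (`ITEMID`, `LABEL`, `LINKSTO`) are taken from the MIMIC-III schema.

```python
import csv
import gzip

# Hypothetical subset of keywords; the real ~40-keyword list lives in trust.ipynb.
EXAMPLE_KEYWORDS = ["restraint", "riker-sas", "education", "family", "pain"]


def match_interpersonal_itemids(d_items_path, keywords=EXAMPLE_KEYWORDS):
    """Return {itemid: label} for CHARTEVENTS items whose label contains a keyword."""
    matched = {}
    with gzip.open(d_items_path, mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            label = (row.get("LABEL") or "").strip()
            # Keep only items that are recorded in CHARTEVENTS.
            if (row.get("LINKSTO") or "").lower() != "chartevents":
                continue
            # Assumption: case-insensitive substring match against the label.
            if any(keyword in label.lower() for keyword in keywords):
                matched[int(row["ITEMID"])] = label
    return matched
```

Restricting on `LINKSTO` keeps items from other event tables out of the map even when their labels happen to contain a keyword.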

`MistrustNoncomplianceMIMIC3`

input_schema  = {"interpersonal_features": "sequence"}                                                 
output_schema = {"noncompliance": "binary"}       

- Label 1 if any NOTEEVENTS note for the admission contains the substring
"noncompliant", else 0. Base rate ≈ 0.88% in MIMIC-III v1.4.
- Every admission with at least one interpersonal CHARTEVENTS feature receives
a label (default 0 / trusting), mirroring the original paper's labelling strategy.
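
The labelling rule in the two bullets above might look like the sketch below. The function names are hypothetical, and lower-casing the note text before matching is an assumption; the PR only states that the label comes from a "noncompliant" substring.

```python
def noncompliance_label(note_texts):
    """Return 1 if any note contains 'noncompliant', else 0."""
    # Assumption: matching is case-insensitive.
    return int(any("noncompliant" in text.lower() for text in note_texts))


def make_noncompliance_sample(features, note_texts):
    """Build one sample per admission, or None when there are no features."""
    if not features:
        # Admissions without any interpersonal feature receive no sample.
        return None
    return {
        "interpersonal_features": features,
        "noncompliance": noncompliance_label(note_texts),
    }
```

A plain substring test of this kind would not match the hyphenated spelling "non-compliant".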

`MistrustAutopsyMIMIC3`

input_schema  = {"interpersonal_features": "sequence"}                                                 
output_schema = {"autopsy_consent": "binary"}     

- Label 1 (consent / mistrustful) if notes contain consent/agree/request                               
near "autopsy"; 0 (decline / trusting) for decline/refuse/denied.                                      
- Admissions where both signals appear are excluded as ambiguous.                                      
- Only admissions with an explicit autopsy signal receive a label (~1,009                              
in MIMIC-III v1.4; Black patients consent at ~39% vs ~26% for White).    
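
One way to read the three rules above is sketched here. It is illustrative only: `autopsy_label` is a hypothetical name, the regular expressions are a guess at the keyword families, and "near" is interpreted as "on the same line as the word autopsy", which is an assumption rather than the PR's documented behaviour.

```python
import re

# Assumed keyword families, taken from the words listed in the description.
CONSENT_RE = re.compile(r"\b(consent|agree|request)\w*", re.IGNORECASE)
DECLINE_RE = re.compile(r"\b(declin|refus|denied)\w*", re.IGNORECASE)


def autopsy_label(note_texts):
    """Return 1 (consent), 0 (decline), or None (ambiguous or no signal)."""
    consent = decline = False
    for text in note_texts:
        for line in text.splitlines():
            # Assumption: "near" means on the same line as "autopsy".
            if "autopsy" not in line.lower():
                continue
            consent = consent or bool(CONSENT_RE.search(line))
            decline = decline or bool(DECLINE_RE.search(line))
    if consent == decline:
        # Both signals (ambiguous) or neither (no explicit autopsy signal).
        return None
    return 1 if consent else 0
```

Returning `None` for both the ambiguous and the no-signal case lets a caller drop those admissions with a single check. This sketch does not handle negation, so a line such as "did not consent to autopsy" would still count as consent.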

Feature normalisation

Both tasks apply the full normalisation pipeline from trust.ipynb cell 7:                              

Label pattern	Normalised to
reason for restraint	6 buckets: none / threat of harm / confusion-delirium / presence of violence / treatment interference / risk for falls
restraint location	none / 4 point restraint / some restraint
restraint device	sitter / limb / (raw)
bath	partial / self / refused / shave / hair / none / done
behavior, behavioral state	skipped
pain management/type/cause/location	skipped
pain level*, education topic*, safety measures*, side rails*, status and comfort*, *informed*	kept as-is
all others	"label||value"

Feature keys have the form "category||normalised_value" and are learned                                
into a vocabulary automatically by PyHealth's tokeniser during set_task().                            
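
A partial sketch of how a few of these rules could be applied is given below. It covers only the skip rules, the restraint-location coarsening, and the default `label||value` form; the remaining rows are omitted. The function names, the lower-casing, and the substring-based matching are all assumptions made for illustration, since the exact conditions live in trust.ipynb.

```python
# Assumed label substrings for the "skipped" rows of the table.
SKIPPED_LABEL_PARTS = (
    "behavior",
    "pain management",
    "pain type",
    "pain cause",
    "pain location",
)


def normalise_feature(label, value):
    """Map one (label, value) chart event to a feature key, or None if skipped."""
    label = label.strip().lower()
    value = value.strip().lower()
    if any(part in label for part in SKIPPED_LABEL_PARTS):
        return None
    if "restraint location" in label:
        # Coarsen to the three buckets listed in the table.
        if value == "none":
            bucket = "none"
        elif "4 point" in value:
            bucket = "4 point restraint"
        else:
            bucket = "some restraint"
        return f"restraint location||{bucket}"
    # Default rule: "label||value".
    return f"{label}||{value}"


def interpersonal_features(events):
    """Return the deduplicated, order-preserving sequence of feature keys."""
    keys = (normalise_feature(label, value) for label, value in events)
    return list(dict.fromkeys(key for key in keys if key is not None))
```

Using `dict.fromkeys` removes duplicates while keeping first-seen order, which matches the "deduplicated sequence" input described earlier.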

Updated: pyhealth/tasks/__init__.py               

Exports MistrustNoncomplianceMIMIC3, MistrustAutopsyMIMIC3, and                                        
build_interpersonal_itemids.                      

Usage

from pyhealth.datasets import MIMIC3Dataset       
from pyhealth.tasks import (                      
    MistrustNoncomplianceMIMIC3,                  
    MistrustAutopsyMIMIC3,                        
    build_interpersonal_itemids,                  
)                                                 
from pyhealth.models import LogisticRegression    

# Build itemid map from D_ITEMS                   
itemid_to_label = build_interpersonal_itemids("/path/to/D_ITEMS.csv.gz")                               

# Load dataset — requires CHARTEVENTS + NOTEEVENTS
base_dataset = MIMIC3Dataset(                     
    root="/path/to/mimic-iii/1.4",                
    tables=["CHARTEVENTS", "NOTEEVENTS"],         
)                                                 

# Noncompliance task                              
nc_dataset = base_dataset.set_task(               
    MistrustNoncomplianceMIMIC3(itemid_to_label=itemid_to_label)                                       
)                                                 

# Autopsy task                                    
au_dataset = base_dataset.set_task(               
    MistrustAutopsyMIMIC3(itemid_to_label=itemid_to_label)                                             
)                                                 

# Train with L1 regularisation (requires PR #960, the first PR in this series)
model = LogisticRegression(dataset=nc_dataset, l1_lambda=2.6e-4)                                       

Dependencies                                      

- Requires PR #960, the first PR in this series (l1_lambda in
LogisticRegression), for paper-equivalent training. The tasks themselves are
model-agnostic and work with any PyHealth sequence model.
- MIMIC-III v1.4 with PhysioNet credentialed access.                                                   
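
For context on the `l1_lambda` dependency: the companion PR's implementation is not shown here, but if `l1_lambda` is the usual L1 coefficient, the training objective would be the base loss plus `l1_lambda` times the sum of absolute weights. The sketch below illustrates that assumption in plain Python; `l1_penalised_loss` is a hypothetical name, not part of PyHealth.

```python
def l1_penalised_loss(base_loss, weights, l1_lambda=2.6e-4):
    """Add an L1 penalty (l1_lambda * sum of |w|) to a scalar base loss."""
    return base_loss + l1_lambda * sum(abs(w) for w in weights)
```

With `l1_lambda=0` the penalty vanishes and the base loss is returned unchanged, so the tasks can still be trained without the companion PR.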

Related PRs                                       

This is PR 2 of 3 in the Boag et al. 2018 mistrust pipeline series. 
- PR #960 pr/uiuccs598dlh/logistic-regression/l1-regularization ← merge first                           
- PR #962 pr/uiuccs598dlh/paper-pipeline/eol-mistrust-boag-2018 ← merge after this

Add two binary classification tasks and a helper that reproduce the
interpersonal-feature mistrust classifiers from:

Boag et al. "Racial Disparities and Mistrust in End-of-Life Care."
MLHC 2018. arXiv:1808.03827

New file: pyhealth/tasks/mistrust_mimic3.py
- build_interpersonal_itemids(d_items_path): reads D_ITEMS.csv.gz and
returns {itemid: label} for ~168 interpersonal CHARTEVENTS items matched
via keyword list from trust.ipynb.
- MistrustNoncomplianceMIMIC3: predicts "noncompliant" label from NOTEEVENTS
using interpersonal CHARTEVENTS features as a sequence input.
Label 1 = noncompliant (mistrustful), 0 = compliant.
- MistrustAutopsyMIMIC3: predicts autopsy consent from the same features.
Label 1 = consent (mistrustful), 0 = decline (trusting). Admissions with
both consent and decline signals are excluded.
- Full feature normalisation mirroring trust.ipynb cell 7 (restraint
coarsening, bath categories, skip rules for pain mgmt/type/cause).

Updated: pyhealth/tasks/__init__.py
- Export MistrustNoncomplianceMIMIC3, MistrustAutopsyMIMIC3,
build_interpersonal_itemids.

Co-Authored-By: Varun Tewari <vtewari2@illinois.edu>

This was referenced Apr 11, 2026

feat(examples): Add end-to-end reproduction of Boag et al. 2018 mistrust pipeline on MIMIC-III #962

Open

feat: Reproduce Boag et al. 2018 medical mistrust pipeline in PyHealth #964

Open

vtewari2 wants to merge 1 commit into sunlabuiuc:master from vtewari2:pr/uiuccs598dlh/mistrust-tasks/interpersonal-features-mimic3
