Add/dka task #749
Pull request overview
This pull request introduces a new DKA (Diabetic Ketoacidosis) prediction task for the MIMIC-IV dataset. The PR adds two task classes (DKAPredictionMIMIC4 for general population and T1DDKAPredictionMIMIC4 for Type 1 Diabetes patients), comprehensive synthetic test data, example scripts, and documentation updates.
Key Changes:
- Added DKA prediction task implementation with temporal data leakage prevention
- Created synthetic MIMIC-IV demo dataset with realistic medical codes and lab values
- Provided example scripts demonstrating StageNet model training for DKA prediction
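The "temporal data leakage prevention" item can be illustrated with a minimal sketch. It assumes leakage is avoided by dropping every lab event charted at or after a cutoff time (for example, the time of the DKA admission); the `LabEvent` type and `labs_before` helper are hypothetical names for illustration and are not taken from `pyhealth/tasks/dka.py`.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class LabEvent(NamedTuple):
    """Hypothetical stand-in for one labevents row."""
    hadm_id: Optional[str]
    charttime: datetime
    value: float


def labs_before(events: List[LabEvent], cutoff: datetime) -> List[LabEvent]:
    """Keep events charted strictly before the cutoff, in time order."""
    kept = [e for e in events if e.charttime < cutoff]
    return sorted(kept, key=lambda e: e.charttime)


# Assumed cutoff: the moment after which observations would leak the label.
cutoff = datetime(2150, 3, 10, 8, 0)
events = [
    LabEvent("A1", datetime(2150, 3, 9, 22, 0), 7.31),
    LabEvent("A1", datetime(2150, 3, 10, 8, 0), 7.05),  # at the cutoff: excluded
    LabEvent("A1", datetime(2150, 3, 11, 6, 0), 7.38),  # after the cutoff: excluded
    LabEvent(None, datetime(2150, 3, 1, 9, 0), 7.40),   # outpatient lab, no hadm_id
]
usable = labs_before(events, cutoff)
print([e.value for e in usable])
```

The strict `<` comparison is a deliberate choice in this sketch: an observation stamped exactly at the cutoff is treated as potentially leaking the outcome.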
Critical Issues Identified:
The test file tests/core/test_mimic4_dka.py contains multiple critical bugs where it tests DKAPredictionMIMIC4 but uses attributes, methods, and parameters that only exist in T1DDKAPredictionMIMIC4. This suggests the test was written for the wrong class or copied from T1D tests without proper adaptation.
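One way to catch this kind of mismatch early is to check that the class under test actually declares every attribute the test relies on. The sketch below uses two stand-in classes and a made-up attribute (`lookback_days`); none of these names come from the PR, and the real task classes may expose different attributes.

```python
from typing import Iterable, List


class GeneralTask:
    """Stand-in for a general-population task class (hypothetical)."""
    task_name = "general"


class T1DTask:
    """Stand-in for a T1D-specific task class (hypothetical)."""
    task_name = "t1d"
    lookback_days = 90  # made-up attribute that only the T1D variant has


def missing_attributes(cls: type, names: Iterable[str]) -> List[str]:
    """Return the attribute names that `cls` does not define."""
    return [name for name in names if not hasattr(cls, name)]


# A test written for T1DTask but pointed at GeneralTask fails this check
# immediately instead of raising AttributeError halfway through a test.
print(missing_attributes(GeneralTask, ["task_name", "lookback_days"]))
print(missing_attributes(T1DTask, ["task_name", "lookback_days"]))
```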
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| pyhealth/tasks/dka.py | Implements DKAPredictionMIMIC4 and T1DDKAPredictionMIMIC4 classes with ICD code classification, lab feature extraction, and temporal filtering to prevent data leakage |
| pyhealth/tasks/__init__.py | Exports the new DKA prediction task classes |
| pyhealth/datasets/configs/mimic4_ehr.yaml | Adds hadm_id field to labevents table configuration for admission-level filtering |
| tests/core/test_mimic4_dka.py | Test suite for DKA prediction (contains critical bugs - tests wrong class attributes/methods) |
| tests/core/test_mimic4_los.py | Test suite for length of stay prediction task |
| test-resources/core/mimic4demo/hosp/*.csv | Synthetic MIMIC-IV demo data files (patients, admissions, diagnoses, procedures, prescriptions, lab events) |
| examples/clinical_tasks/dka_mimic4.py | Example script for general population DKA prediction using StageNet |
| examples/clinical_tasks/t1dka_mimic4.py | Example script for T1D-specific DKA prediction using StageNet |
| examples/benchmark_perf/benchmark_workers_12.py | Benchmarking script for mortality prediction (has num_workers documentation inconsistencies) |
| docs/api/tasks/pyhealth.tasks.dka.rst | API documentation for DKA task |
| docs/api/tasks.rst | Adds DKA task to the task list |
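The "ICD code classification" mentioned for pyhealth/tasks/dka.py can be sketched as a prefix match on normalized codes. This is an illustration only: the prefix lists are an assumption based on the standard ICD-9-CM (250.1x, diabetes with ketoacidosis) and ICD-10-CM (E10.1x, E11.1x, E13.1x) code ranges, the function names are hypothetical, and the PR's actual code sets may differ.

```python
from typing import Tuple

# Assumed, illustrative prefixes (dots removed, as MIMIC-IV stores codes).
DKA_ICD10_PREFIXES: Tuple[str, ...] = ("E101", "E111", "E131")
DKA_ICD9_PREFIXES: Tuple[str, ...] = ("2501",)


def normalize(code: str) -> str:
    """Strip whitespace and dots and upper-case the code."""
    return code.strip().replace(".", "").upper()


def is_dka_code(code: str, icd_version: int) -> bool:
    """Return True if the code falls in a ketoacidosis range for its ICD version."""
    normalized = normalize(code)
    if icd_version == 10:
        return normalized.startswith(DKA_ICD10_PREFIXES)
    if icd_version == 9:
        return normalized.startswith(DKA_ICD9_PREFIXES)
    return False


for code, version in [("E10.10", 10), ("25011", 9), ("E119", 10), ("25000", 9)]:
    print(code, version, is_dka_code(code, version))
```

Keeping the ICD version as an explicit argument avoids matching an ICD-9 prefix against an ICD-10 code, or the reverse, when both appear in the same diagnoses table.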
This pull request introduces a new DKA (Diabetic Ketoacidosis) prediction task for the MIMIC-IV dataset, including its integration into the codebase, documentation, and example usage. It also adds comprehensive synthetic test resources for MIMIC-IV, supporting the new task and facilitating robust testing and benchmarking.
New DKA Prediction Task:
- Added the DKAPredictionMIMIC4 class for DKA prediction on MIMIC-IV and integrated it into the pyhealth.tasks package.
Examples and Benchmarks:
- Added dka_mimic4.py, which demonstrates how to use StageNet for DKA prediction with MIMIC-IV, including data loading, task application, model training, and evaluation.
- Added benchmark_workers_12.py for evaluating MIMIC-IV mortality prediction performance with various metrics and memory usage tracking.
Test Resources:
- Added synthetic MIMIC-IV demo files (patients.csv, admissions.csv, diagnoses_icd.csv, labevents.csv, d_labitems.csv) to support testing and development of the new task.
These changes collectively enable DKA prediction research on MIMIC-IV within the pyhealth framework and provide the necessary infrastructure for both development and robust evaluation.