Code implementation for the "Real-time prediction of intensive care unit patient acuity and therapy requirements using state-space modelling" paper.
The paper was accepted for publication at Nature Communications (PDF).
APRICOT-Mamba is a deep learning framework designed to continuously predict patient acuity in the ICU using Electronic Health Records (EHR). It extends the APRICOT family by integrating Mamba-based state space models and Transformer architectures, enabling real-time, interpretable predictions of patient stability and transitions.
This repository includes:
- Data preprocessing pipelines for retrospective and prospective ICU cohorts.
- Training and evaluation scripts for APRICOT-Mamba, APRICOT-Transformer, GRU, CatBoost, and Transformer baselines.
- Post-hoc analysis tools for calibration, feature attribution, and prospective validation.
    ├── README.md
    └── main/
        ├── analyses/                 # Post-training analyses (calibration, performance, etc.)
        │   ├── calibration/
        │   ├── confusion_matrix/
        │   ├── integrated_gradients/ # Feature importance analysis
        │   └── ...
        ├── baseline_models/          # Baseline models (CatBoost, GRU, Transformer)
        │   ├── catboost/
        │   ├── gru/
        │   └── transformer/
        ├── datasets/                 # Data loading and description
        │   ├── README.md
        │   ├── eicu/
        │   ├── mimic/
        │   └── uf/
        ├── models/                   # Core model implementations (APRICOT-Mamba, APRICOT-T)
        │   ├── apricotm/             # APRICOT-Mamba model
        │   ├── apricott/             # APRICOT-Transformer model
        │   ├── model_comparison.py
        │   └── variables.py          # Configuration variables
        ├── prospective_cohort/       # Prospective cohort data processing
        ├── retrospective_cohort/     # Retrospective cohort data processing
        ├── sofa_baseline/            # SOFA score baseline calculation
        └── summary/                  # Summary generation scripts
- Python ≥ 3.8
- Package manager: `pip` or `conda`
- Key Python libraries: `pandas`, `numpy`, `scikit-learn`, `h5py`, `torch` (PyTorch), `optuna`, `catboost`, `captum`

Install all dependencies with `pip install -r requirements.txt`.

Note: For GPU support with PyTorch, refer to the official installation guide.
- CPU: Multi-core processor
- RAM: ≥ 16 GB
- GPU: NVIDIA GPU with CUDA support (recommended for training deep learning models)
This project utilizes EHR data from:
- eICU Collaborative Research Database: A multi-center ICU database with high-granularity data for over 200,000 admissions. Access requires credentialed approval.
- MIMIC-IV: A large, freely accessible critical care database comprising de-identified health-related data associated with over 60,000 ICU admissions.
- University of Florida Health (UFH): Internal EHR data from UF Health. Note: this dataset is not publicly available at this time.
Data processing scripts are located in:

- `main/datasets/`
- `main/retrospective_cohort/`
- `main/prospective_cohort/`
The primary data format for training and evaluation is HDF5 (`.h5`). The script `main/retrospective_cohort/5_build_hdf5.py` demonstrates the structure of the final `dataset.h5` file, which includes training, validation, external test, and temporal test sets with features (`X`), static data (`static`), and labels (`y_main`, `y_trans`).
Refer to main/datasets/README.md for detailed information on data sources and initial setup.
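To make the layout above concrete, here is a minimal `h5py` sketch that writes and then inspects a toy `dataset.h5` with the splits and array names described above. The split keys, array shapes, and label widths are illustrative assumptions only; `main/retrospective_cohort/5_build_hdf5.py` remains the authoritative reference.

```python
import numpy as np
import h5py

# Illustrative layout only: split names and shapes below are assumptions,
# not the repository's actual schema.
with h5py.File("dataset.h5", "w") as f:
    for split in ["train", "val", "ext_test", "temporal_test"]:
        grp = f.create_group(split)
        grp.create_dataset("X", data=np.zeros((10, 64, 3)))   # sequential features
        grp.create_dataset("static", data=np.zeros((10, 5)))  # static features
        grp.create_dataset("y_main", data=np.zeros((10, 4)))  # acuity labels
        grp.create_dataset("y_trans", data=np.zeros((10, 4))) # transition labels

# Inspect the resulting hierarchy
with h5py.File("dataset.h5", "r") as f:
    for split in f:
        print(split, {k: f[split][k].shape for k in f[split]})
```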
Process raw EHR data to generate the required dataset.h5 file:
`python main/retrospective_cohort/5_build_hdf5.py`

Note: Adjust paths and parameters as needed in the script.
Navigate to the desired model directory and run the training script:
`cd main/models/apricotm/`
`python 1_train.py`

This script performs hyperparameter optimization using `optuna`, trains the model with PyTorch, and saves:
- Best hyperparameters: `best_params.pkl`
- Model weights: `apricotm_weights.pth`
- Model architecture: `apricotm_architecture.pth`
Training duration is approximately 2 hours on an NVIDIA A100 GPU.
Repeat the process for other models as needed.
Evaluate the trained model on test sets:
`python 2_eval.py`

Evaluation results are saved in the `results` subdirectory within the model's directory.
If prospective data is prepared, apply the trained model:
`python 3_prospective.py`

Perform analyses on model predictions:
`python main/analyses/calibration/1_calibration.py`
`python main/analyses/integrated_gradients/1_integrated_gradients_table.py`

Results are generated under the user-defined home directory (`HOME_DIR`), time window (`time_window`), and model:

`{HOME_DIR}/deepacu/main/{time_window}h_window/model/{model}/results`
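For intuition about what the calibration analysis measures, here is a small, self-contained expected calibration error (ECE) computation: predictions are grouped into probability bins and the gap between mean confidence and observed frequency is averaged, weighted by bin size. The helper function and toy data are illustrative and not taken from the repository's scripts.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Hypothetical helper: size-weighted mean |accuracy - confidence|
    over equal-width probability bins (not the repository's code)."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (y_prob > lo) & (y_prob <= hi)
        if mask.any():
            conf = y_prob[mask].mean()   # mean predicted probability in bin
            acc = y_true[mask].mean()    # observed positive rate in bin
            ece += mask.mean() * abs(acc - conf)
    return ece

# Toy example: four predictions, binary outcomes
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.2, 0.8, 0.9])
print(expected_calibration_error(y_true, y_prob))
```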
APRICOT-Mamba demonstrates high performance in predicting patient acuity, with AUROC scores comparable to state-of-the-art models. Detailed performance metrics, calibration plots, and feature importance analyses are available in the results directories and can be visualized using the provided analysis scripts.
We welcome contributions from the community! To contribute:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Commit your changes with clear messages.
- Submit a pull request detailing your changes.
This project is licensed under the GNU General Public License v3.0.
If you use this work in your research, please cite:
    @article{contreras2025real,
      author  = {Miguel Contreras and Brandon Silva and Benjamin Shickel and Andrea Davidson and Tezcan Ozrazgat-Baslanti and Yuanfang Ren and Ziyuan Guan and Jeremy Balch and Jiaqing Zhang and Sabyasachi Bandyopadhyay and Tyler Loftus and Kia Khezeli and Gloria Lipori and Jessica Sena and Subhash Nerella and Azra Bihorac and Parisa Rashidi},
      title   = {Real-time prediction of intensive care unit patient acuity and therapy requirements using state-space modelling},
      journal = {Nature Communications},
      year    = {2025},
      month   = {July},
      doi     = {10.1038/s41467-025-62121-1},
    }
- Dr. Parisa Rashidi: [email protected]
