Objectives To develop and validate a machine learning (ML) model for real-time monitoring of in-hospital mortality (IHM) risk and identification of major risk factors in heart failure (HF) patients admitted to intensive care units (ICUs).
Methods Data from ICU HF patients were extracted from the multicenter eICU-Collaborative Research Database (eICU-CRD). External validation used MIMIC-IV and a real-world Chinese dataset (CHN-dataset). Daily measurements from MIMIC-IV patients staying ≥ 3 days formed a Daily Measurement (DM) dataset. After preprocessing and feature selection, five ML algorithms were trained and optimized using eICU-CRD data. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and balanced accuracy. The optimal model was benchmarked against APACHE and SOFA scores. SHapley Additive exPlanations (SHAP) were used to interpret feature contributions. A Windows application was developed for clinical deployment.
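The four evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. It is not the study's evaluation code: the function names, the 0.5 decision threshold, and the label coding (1 = died in hospital, 0 = survived) are assumptions made for illustration only.

```python
from typing import Dict, Sequence


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, and balanced accuracy at a given threshold.

    The 0.5 default is an illustrative assumption, not the study's cut-off.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present")
    sensitivity = tp / (tp + fn)  # share of non-survivors flagged as high risk
    specificity = tn / (tn + fp)  # share of survivors flagged as low risk
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    # Toy labels and predicted risks, made up purely to exercise the functions.
    labels = [1, 1, 0, 0, 0, 1]
    risks = [0.9, 0.4, 0.3, 0.6, 0.1, 0.8]
    print(auc_score(labels, risks))
    print(threshold_metrics(labels, risks))
```

Balanced accuracy averages sensitivity and specificity, so it is less distorted than plain accuracy when deaths are much rarer than survivals, which is the usual situation for IHM.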
Results XGBoost emerged as the optimal model (Final-ML model), which incorporated only 17 routinely collected clinical variables: age, non-invasive systolic blood pressure (NI-SBP), heart rate, respiratory rate, Glasgow Coma Scale eye opening score, white blood cell count (WBC), creatinine, bicarbonate, red cell distribution width (RDW), platelet count, glucose, calcium, mean corpuscular hemoglobin concentration (MCHC), sodium, mean corpuscular volume, red blood cell count, and potassium. It achieved high AUCs: 0.876 (95%CI: 0.836-0.915; eICU-CRD test data), 0.932 (95%CI: 0.921-0.942; MIMIC-IV), and 0.879 (95%CI: 0.846-0.912; CHN-dataset). It significantly outperformed APACHE (AUC = 0.740, 95%CI: 0.720-0.761) and SOFA (AUC = 0.717, 95%CI: 0.694-0.740) scores. The model demonstrated strong generalizability across ethnicities, ward types, and genders within MIMIC-IV. Using daily data (DM dataset), predicted IHM risk accurately tracked patient trajectories: risk decreased progressively for survivors and increased for non-survivors throughout the ICU stay. SHAP analysis identified key predictors: NI-SBP, age, heart rate, WBC, glucose, and notably, RDW and MCHC. Time-dependent Cox regression confirmed RDW increase (HR = 3.783, 95%CI: 2.237-6.398) and MCHC decrease (HR = 0.173, 95%CI: 0.040-0.741) as significant independent risk factors for IHM.
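As a reading aid for the hazard ratios above, the sketch below shows how a Cox coefficient and its standard error map to a hazard ratio with a 95% confidence interval, and how to go back from a reported interval to the coefficient. It assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the function names are hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hr_with_ci(beta: float, se: float, z: float = Z_95) -> Tuple[float, float, float]:
    """Hazard ratio and Wald interval from a log-hazard coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_ci(lower: float, upper: float, z: float = Z_95) -> Tuple[float, float]:
    """Back-calculate (beta, se) from a reported interval.

    Assumes the interval is symmetric around beta on the log scale.
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    beta = (log_lo + log_hi) / 2
    se = (log_hi - log_lo) / (2 * z)
    return beta, se


if __name__ == "__main__":
    # Interval reported above for RDW increase: 2.237 to 6.398.
    beta, se = coef_from_ci(2.237, 6.398)
    print(hr_with_ci(beta, se))
```

Under that assumption, the point estimate recovered from the RDW interval is the geometric mean of its bounds, which lands close to the reported 3.783; small differences are expected from rounding in the published figures.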
Conclusions The developed XGBoost model provides a reliable, generalizable tool for real-time IHM risk quantification and monitoring in ICU HF patients, using only 17 routinely collected clinical variables. It surpasses traditional scoring systems and enables dynamic risk assessment throughout the ICU stay. By identifying patient-specific major risk factors via SHAP values, the model facilitates timely, personalized treatment adjustments.