Multimodal Integration of Ambulatory ECG and Clinical Features for Sudden Cardiac Death and Pump Failure Death Prediction
Steven Swee, Irsyad Adam, Erika Yilin Zheng, Ethan Ji, Ding Wang, William Speier, Jeffrey Hsu, Kai-Wei Chang, Kalyanam Shivkumar, Peipei Ping
Ambulatory electrocardiograms (ECG) provides continuous monitoring of the heart's electrical activity. However, many existing machine learning and artificial intelligence models for analyzing ambulatory ECG traces are often unimodal and do not incorporate patient clinical context. In this study, we propose a multimodal framework integrating ambulatory ECG-derived representations with clinical text embeddings to predict two cardiac outcomes: sudden cardiac death and pump failure death. Ambulatory ECG traces are preprocessed, segmented, and encoded via a multiple instance learning and temporal convolutional neural network framework. In parallel, patient clinical features are parsed into structured prompts, which are passed through a large language model to generate clinical reasoning; this reasoning passes through a biomedical language encoder to generate a text embedding. With the ECG and text embeddings, we systematically evaluate multiple fusion strategies, including concatenation- and gating-based approaches, to integrate these two data modalities. Our results demonstrate that multimodal models consistently outperform unimodal baselines, with adaptive fusion mechanisms providing the greatest improvements in predictive performance. Decision curve analysis highlights the potential clinical utility of the proposed framework for risk stratification. Finally, we visualize model attention across modalities, including ECG attention patterns, segment-level saliency, heart rate variability features, and clinical reasoning, to contextualize patient-specific predictions.
Read on ELI