Predicting colorectal surgical complications using heterogeneous clinical data and kernel methods
Abstract
Objective: We developed a learning system that exploits the information conveyed by longitudinal Electronic Health Records (EHRs) to predict a common postoperative complication, Anastomosis Leakage (AL), in a data-driven way by fusing temporal population data from different, heterogeneous sources in the EHRs. Material and methods: We applied linear and non-linear kernel methods to each data source individually and combined them using multiple (composite) kernels. To validate the system, we used data from the EHR of the gastrointestinal department at a university hospital. Results: We first investigated the early prediction performance of each data source separately, obtaining Area Under the Curve (AUC) values of 0.83 for processed free text, 0.74 for blood tests, and 0.65 for vital signs. When the heterogeneous data sources were combined using the composite kernel framework, prediction performance increased considerably (AUC 0.92). Finally, posterior probabilities were evaluated for patient risk assessment, as an aid to help clinicians raise alertness at an early stage and act promptly to avoid AL complications. Discussion: Machine-learning statistical models built from EHR data can be useful for predicting surgical complications. Combining EHR-extracted free text, blood sample values, and patient vital signs improves model performance. These results can be used as a framework for preoperative clinical decision support.
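To illustrate the idea of combining per-source kernels, the following is a minimal sketch of a composite kernel built as a weighted sum of one kernel per data source. It is not the system described above: the choice of a linear kernel for text features and RBF kernels for blood tests and vital signs, the weights, the `gamma` value, the function names, and the toy patient data are all assumptions made for illustration.

```python
import math

def linear_kernel(x, y):
    # Inner product between two feature vectors.
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian (RBF) kernel; gamma is a hypothetical bandwidth setting.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def composite_gram(sources, kernels, weights):
    # Weighted sum of per-source Gram matrices.
    # sources: one feature matrix per data source, rows in the same patient order.
    n = len(sources[0])
    gram = [[0.0] * n for _ in range(n)]
    for X, kernel, w in zip(sources, kernels, weights):
        for i in range(n):
            for j in range(n):
                gram[i][j] += w * kernel(X[i], X[j])
    return gram

# Toy data for three hypothetical patients (values are invented).
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]            # e.g. bag-of-words counts
blood = [[0.2], [0.9], [0.4]]                          # e.g. a scaled lab value
vitals = [[36.6, 70.0], [38.1, 95.0], [37.0, 80.0]]    # e.g. temperature, pulse

K = composite_gram(
    [text, blood, vitals],
    [linear_kernel, rbf_kernel, rbf_kernel],
    [0.5, 0.3, 0.2],  # hypothetical source weights
)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the resulting matrix `K` can be passed to any kernel classifier that accepts a precomputed Gram matrix, for example scikit-learn's `SVC(kernel="precomputed")`.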