Published December 22, 2023 | Version v1
Journal article Open

Development and external validation of multimodal postoperative acute kidney injury risk machine learning models

Description

Objectives: To develop and externally validate machine learning models using structured and unstructured electronic health record data to predict postoperative acute kidney injury (AKI) across inpatient settings.

Materials and methods: Data for adult postoperative admissions to the Loyola University Medical Center (2009-2017) were used for model development, and admissions to the University of Wisconsin-Madison (2009-2020) were used for validation. Structured features included demographics, vital signs, laboratory results, and nurse-documented scores. Unstructured text from clinical notes was converted into concept unique identifiers (CUIs) using the clinical Text Analysis and Knowledge Extraction System. The primary outcome was the development of Kidney Disease: Improving Global Outcomes stage 2 AKI within 7 days after leaving the operating room. We derived unimodal extreme gradient boosting machines (XGBoost) and elastic net logistic regression (GLMNET) models using structured-only data and multimodal models combining structured data with CUI features. Model comparison was performed using the area under the receiver operating characteristic curve (AUROC), with DeLong's test for statistical differences.
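To illustrate how CUI features can be merged with structured data, the following is a minimal sketch in plain Python, not the authors' pipeline. It assumes each admission has already been reduced to a list of CUI strings, uses a smoothed inverse document frequency (one common variant, chosen here as an assumption), and appends the resulting weights to a row of structured values. The function names, CUI identifiers, and feature values are hypothetical placeholders.

```python
import math
from collections import Counter


def tfidf_matrix(cui_docs):
    """Return (vocabulary, rows) of TF-IDF weights for lists of CUI strings."""
    n_docs = len(cui_docs)
    # Document frequency: number of admissions whose notes mention each CUI.
    doc_freq = Counter(cui for doc in cui_docs for cui in set(doc))
    vocabulary = sorted(doc_freq)
    # Smoothed inverse document frequency (an assumed variant, not from the article).
    idf = {
        cui: math.log((1 + n_docs) / (1 + doc_freq[cui])) + 1.0
        for cui in vocabulary
    }
    rows = []
    for doc in cui_docs:
        counts = Counter(doc)
        total = len(doc) or 1  # admissions without notes get all-zero weights
        rows.append([counts[cui] / total * idf[cui] for cui in vocabulary])
    return vocabulary, rows


def combine(structured_rows, tfidf_rows):
    """Concatenate structured features with TF-IDF features, row by row."""
    return [list(s) + list(t) for s, t in zip(structured_rows, tfidf_rows)]


# Hypothetical toy data: three admissions, each a bag of placeholder CUIs.
cui_docs = [
    ["C0000001", "C0000002", "C0000001"],
    ["C0000002"],
    [],
]
# Hypothetical structured values per admission (for example, age and a lab result).
structured = [[71.0, 1.4], [45.0, 0.9], [63.0, 1.1]]

vocabulary, tfidf_rows = tfidf_matrix(cui_docs)
multimodal = combine(structured, tfidf_rows)
```

Each row of `multimodal` could then be passed to a classifier such as a gradient boosting or elastic net model. A CUI that appears in fewer admissions receives a larger weight than one that appears in most of them.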

Results: The study cohort included 138 389 adult patient admissions (mean [SD] age 58 [16] years; 11 506 [8%] African-American; and 70 826 [51%] female) across the 2 sites. Of those, 2959 (2.1%) developed stage 2 AKI or higher. Across all data types, XGBoost outperformed GLMNET (mean AUROC 0.81 [95% confidence interval (CI), 0.80-0.82] vs 0.78 [95% CI, 0.77-0.79]). The multimodal XGBoost model incorporating CUIs parameterized as term frequency-inverse document frequency (TF-IDF) showed the highest discrimination (AUROC 0.82 [95% CI, 0.81-0.83]), exceeding that of the unimodal models (AUROC 0.79 [95% CI, 0.78-0.80]).
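As an aid to reading these AUROC values, the sketch below computes AUROC as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case, with ties counted as one half. It is an illustrative pairwise implementation on hypothetical labels and scores, not the authors' evaluation code. It does not compute confidence intervals or DeLong's test.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = developed the outcome) and predicted risks.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.1, 0.3, 0.8, 0.2, 0.4, 0.5]
value = auroc(labels, scores)
```

The pairwise loop is quadratic in the number of cases, so a cohort of this size would call for a rank-based computation in practice.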

Discussion: A multimodality approach with structured data and TF-IDF weighting of CUIs increased model performance over structured data-only models.

Conclusion: These findings highlight the predictive power of CUIs when merged with structured data for clinical prediction models, which may improve the detection of postoperative AKI.

Data availability

The data utilized in this article cannot be shared publicly due to regulatory and legal restrictions. Our data were obtained from 2 hospital systems after our research protocol was reviewed by IRBs from each hospital, and our data use agreements do not permit sharing due to the granular nature of the data. Interested researchers can contact the corresponding author or Madeline Oguss for specific queries related to data sharing.

Files

Development-and-external-validation.pdf

Files (1.7 MB)

Name Size
Article
md5:b5ec048d9a54a8e13408a653ccf92c27
1.2 MB
md5:b00621758f1e7a0ad7283a4c5d99c20e
508.3 kB

Additional details

Identifiers

DOI
10.1093/jamiaopen/ooad109
Other
oai:uchicago.tind.io:10254

Funding

National Institute of Diabetes and Digestive and Kidney Diseases
R01-DK126933

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Medicine