Development and external validation of multimodal postoperative acute kidney injury risk machine learning models
Creators
- 1. University of Wisconsin-Madison
- 2. University of Chicago
- 3. Loyola University Chicago
Description
Objectives: To develop and externally validate machine learning models using structured and unstructured electronic health record data to predict postoperative acute kidney injury (AKI) across inpatient settings.
Materials and methods: Data for adult postoperative admissions to the Loyola University Medical Center (2009-2017) were used for model development, and admissions to the University of Wisconsin-Madison (2009-2020) were used for validation. Structured features included demographics, vital signs, laboratory results, and nurse-documented scores. Unstructured text from clinical notes was converted into concept unique identifiers (CUIs) using the clinical Text Analysis and Knowledge Extraction System. The primary outcome was the development of Kidney Disease: Improving Global Outcomes (KDIGO) stage 2 AKI within 7 days after leaving the operating room. We derived unimodal extreme gradient boosting machines (XGBoost) and elastic net logistic regression (GLMNET) models using structured-only data and multimodal models combining structured data with CUI features. Models were compared using the area under the receiver operating characteristic curve (AUROC), with DeLong's test for statistical differences.
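As an illustration of the comparison metric only, the AUROC can be computed from ranks as the probability that a randomly chosen positive admission receives a higher predicted risk than a randomly chosen negative one. The sketch below is a minimal pure-Python version under stated assumptions: the function name `auroc`, the variable names, and the toy labels and scores are hypothetical and are not study data or study code. It does not implement DeLong's test.

```python
def auroc(labels, scores):
    """Rank-based (Mann-Whitney U) AUROC for binary 0/1 labels.

    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))  # ascending by predicted risk
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical toy example: 1 = stage 2 AKI, 0 = no stage 2 AKI.
toy_labels = [0, 0, 1, 0, 1, 0]
toy_scores_structured = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]
toy_scores_multimodal = [0.08, 0.30, 0.55, 0.15, 0.85, 0.05]

print("structured-only AUROC:", auroc(toy_labels, toy_scores_structured))
print("multimodal AUROC:", auroc(toy_labels, toy_scores_multimodal))
```

A difference between two such point estimates says nothing about statistical significance on its own; that is the role of the paired test named in the methods.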
Results: The study cohort included 138 389 adult patient admissions (mean [SD] age 58 [16] years; 11 506 [8%] African-American; and 70 826 [51%] female) across the 2 sites. Of those, 2959 (2.1%) developed stage 2 AKI or higher. Across all data types, XGBoost outperformed GLMNET (mean AUROC 0.81 [95% confidence interval (CI), 0.80-0.82] vs 0.78 [95% CI, 0.77-0.79]). The multimodal XGBoost model incorporating CUIs parameterized as term frequency-inverse document frequency (TF-IDF) achieved the highest discrimination (AUROC 0.82 [95% CI, 0.81-0.83]), exceeding the unimodal models (AUROC 0.79 [95% CI, 0.78-0.80]).
Discussion: A multimodality approach with structured data and TF-IDF weighting of CUIs increased model performance over structured data-only models.
Conclusion: These findings highlight the predictive power of CUIs when merged with structured data for clinical prediction models, which may improve the detection of postoperative AKI.
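To make the TF-IDF weighting of CUIs concrete, the following is a minimal sketch of one common TF-IDF variant (term frequency as a within-admission proportion, inverse document frequency as the natural log of N divided by document frequency) and of merging the resulting weights with structured features. The exact weighting, smoothing, and feature names used in the study are not stated here, so everything below is an assumption: the function names `tfidf_weights` and `merge_features`, the placeholder CUI strings, and the structured values are all hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(cui_lists):
    """Return one {cui: tf-idf weight} dict per admission.

    cui_lists: list of lists, each holding the CUIs extracted from one
    admission's notes (repeats allowed).
    """
    n_docs = len(cui_lists)
    doc_freq = Counter()
    for cuis in cui_lists:
        doc_freq.update(set(cuis))  # count each CUI once per admission

    weights = []
    for cuis in cui_lists:
        counts = Counter(cuis)
        total = len(cuis)
        weights.append({
            cui: (count / total) * math.log(n_docs / doc_freq[cui])
            for cui, count in counts.items()
        })
    return weights


def merge_features(structured_rows, cui_weight_rows):
    """Combine structured features and CUI weights into one row per admission."""
    merged = []
    for structured, cui_weights in zip(structured_rows, cui_weight_rows):
        row = dict(structured)
        for cui, weight in cui_weights.items():
            row["cui_" + cui] = weight
        merged.append(row)
    return merged


# Placeholder CUIs and structured values (hypothetical, not study data).
toy_cuis = [
    ["C0000001", "C0000002"],
    ["C0000002"],
    ["C0000003", "C0000003", "C0000002"],
]
toy_structured = [
    {"age": 61, "creatinine_mg_dl": 1.1},
    {"age": 47, "creatinine_mg_dl": 0.8},
    {"age": 72, "creatinine_mg_dl": 1.6},
]

toy_weights = tfidf_weights(toy_cuis)
toy_rows = merge_features(toy_structured, toy_weights)
for row in toy_rows:
    print(row)
```

With this unsmoothed variant, a CUI that appears in every admission gets a weight of zero, which is the intended effect of the inverse document frequency term: concepts that do not distinguish admissions contribute nothing. The merged rows could then be passed to any tabular learner.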
Data availability
The data utilized in this article cannot be shared publicly due to regulatory and legal restrictions. Our data were obtained from 2 hospital systems after our research protocol was reviewed by IRBs from each hospital, and our data use agreements do not permit sharing due to the granular nature of the data. Interested researchers can contact the corresponding author or Madeline Oguss for specific queries related to data sharing.
Development-and-external-validation.pdf
Files
(1.7 MB)
| Name | Size |
|---|---|
| Article (md5:b5ec048d9a54a8e13408a653ccf92c27) | 1.2 MB |
| md5:b00621758f1e7a0ad7283a4c5d99c20e | 508.3 kB |
Additional details
Identifiers
- DOI
- 10.1093/jamiaopen/ooad109
- Other
- oai:uchicago.tind.io:10254
Funding
- National Institute of Diabetes and Digestive and Kidney Diseases
- R01-DK126933