Abstract: PO0528
Predicting Rapid eGFR Decline Using Electronic Health Record (EHR) Data Despite High Missingness in the CURE-CKD Registry
Session Information
- CKD Health Services Research
October 22, 2020 | Location: On-Demand
Abstract Time: 10:00 AM - 12:00 PM
Category: CKD (Non-Dialysis)
- 2101 CKD (Non-Dialysis): Epidemiology, Risk Factors, and Prevention
Authors
- Davis, Tyler Austin, University of California Los Angeles, Los Angeles, California, United States
- Petousis, Panayiotis, University of California Los Angeles, Los Angeles, California, United States
- Zamanzadeh, Davina J., University of California Los Angeles, Los Angeles, California, United States
- Wang, Xiaoyan, University of California Los Angeles, Los Angeles, California, United States
- Norris, Keith C., University of California Los Angeles, Los Angeles, California, United States
- Duru, Obidiugwu, University of California Los Angeles, Los Angeles, California, United States
- Tuttle, Katherine R., Providence St Joseph Health, Spokane, Washington, United States
- Bui, Alex, University of California Los Angeles, Los Angeles, California, United States
- Nicholas, Susanne B., University of California Los Angeles, Los Angeles, California, United States
Group or Team Name
- CURE-CKD Registry Study Team
Background
Patients with rapid eGFR decline tend to progress to kidney failure. Automated tools can identify individuals at risk of severe renal function decline and facilitate disease mitigation. We describe a deep neural network (DNN) that predicts the risk of rapid eGFR decline (>40% decrease in eGFR over 2 years) and identify populations at higher risk of rapid decline using the CURE-CKD Registry.
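The outcome definition above (>40% decrease in eGFR over 2 years) can be sketched as a simple labeling function. This is a minimal illustration, not the registry's actual labeling code: the function name, its parameters, and the choice of a single baseline value and a single follow-up value are assumptions, since the abstract does not say how the two measurements are selected within the 2-year window.

```python
def is_rapid_decline(baseline_egfr: float, followup_egfr: float,
                     threshold: float = 0.40) -> bool:
    """Return True when eGFR fell by more than `threshold` relative to baseline.

    Hypothetical helper: the abstract gives only the >40% criterion, so the
    selection of baseline and follow-up values is left to the caller.
    """
    if baseline_egfr <= 0:
        raise ValueError("baseline eGFR must be positive")
    # Relative drop, e.g. 90 -> 50 is (90 - 50) / 90, roughly 0.44
    relative_drop = (baseline_egfr - followup_egfr) / baseline_egfr
    return relative_drop > threshold
```

For example, a fall from 90 to 50 mL/min/1.73 m² is a relative drop of about 44% and would be labeled as rapid decline, while a fall from 90 to 60 (about 33%) would not.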
Methods
Variables included age, sex, race/ethnicity, ACE inhibitor/ARB use, eGFR, systolic blood pressure (SBP), hemoglobin A1C, and diagnoses of hypertension, type 2 diabetes (DM), pre-DM, or chronic kidney disease (CKD) based on EHR coding. Data came from patients with CKD (N=93,567) and patients at risk for CKD (N=913,289) with eGFR ≥15 mL/min/1.73 m² over 2 years. We trained and validated a 5-layer DNN, a logistic regression (LR) model, and a gradient boosted tree (GBT) model using a 60/20/20 train/test/validation split. We computed the risk distribution of each of 25,475 subpopulations, defined by all possible expert-defined combinations of the above variables, and compared each against the whole population’s risk distribution using the Kolmogorov-Smirnov (KS) test. Subgroups with the highest risk of decline were identified using the KS test (p<0.05) on our highest-performing model.
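The subgroup comparison described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the record layout (a dict with a `risk` field), the filter functions, and both function names are hypothetical. The p-value is an asymptotic approximation (Kolmogorov series with a small-sample correction), and the extra requirement that a flagged subgroup have a higher mean predicted risk than the whole population is an assumption added here to give the two-sided test a direction. Because each subgroup is a subset of the population, the two samples are not independent, so the p-values should be read as a screening heuristic.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b):
    """Return (D, p) for a two-sided, two-sample Kolmogorov-Smirnov test.

    D is the largest gap between the two empirical CDFs. The p-value is an
    asymptotic approximation, not an exact value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    # Evaluate both empirical CDFs at every observed value and keep the max gap.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    effective_n = n * m / (n + m)
    lam = (math.sqrt(effective_n) + 0.12 + 0.11 / math.sqrt(effective_n)) * d
    if lam < 0.2:
        # The series below converges poorly for tiny lam; the p-value is ~1 there.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def flag_high_risk_subgroups(records, subgroup_filters, alpha=0.05):
    """Return (name, D, p) for subgroups whose predicted-risk distribution
    differs from the whole population (p < alpha) and whose mean risk is higher.

    `records` is a list of dicts with a "risk" key (model-predicted risk);
    `subgroup_filters` maps a subgroup name to a predicate over one record.
    Both structures are hypothetical.
    """
    population = [r["risk"] for r in records]
    population_mean = sum(population) / len(population)
    flagged = []
    for name, keep in subgroup_filters.items():
        risks = [r["risk"] for r in records if keep(r)]
        if not risks:
            continue  # empty subgroup, nothing to compare
        d, p = ks_two_sample(risks, population)
        if p < alpha and sum(risks) / len(risks) > population_mean:
            flagged.append((name, d, p))
    return flagged
```

With thousands of subgroups tested at p<0.05, some adjustment for multiple comparisons would normally be worth considering; the abstract does not state whether one was applied, so none is included in this sketch.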
Results
The DNN achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.75 on the test set. The LR and GBT models achieved AUC-ROCs of 0.72 and 0.73, respectively. A total of 17,734 subpopulations had significantly higher average predicted risk across the training, validation, and test sets. We identified the most frequent predictors of rapid eGFR decline across the highest-risk subpopulations. Among the top 100 significantly higher-risk subpopulations, the most frequent variables were CKD (100%), SBP > 140 mmHg (72%), age 45-66 years (56%), DM (52%), and A1C > 8 (50%).
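For readers less familiar with the metric, AUC-ROC can be read as the probability that a randomly chosen patient with rapid decline receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes it directly from that pairwise definition. The function name and inputs are hypothetical, and the quadratic loop is chosen for clarity rather than speed; it would be impractical at registry scale.

```python
def auc_roc(labels, scores):
    """AUC-ROC from its pairwise definition.

    `labels` holds 1 for rapid decline and 0 otherwise; `scores` holds the
    model-predicted risks. Ties between a positive and a negative count 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs where the positive is ranked higher.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to a perfect ranking.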
Conclusion
We developed a methodology that trains a risk model for rapid eGFR decline on large-scale EHR data and uses its predictions, together with the KS test, to identify subpopulations at significantly higher risk of rapid eGFR decline.
Funding
- Other NIH Support