Abstract: FR-PO023
Prediction of Individual Educational Attainment in ESKD Patients Using Zip Code-Derived Measures
Session Information
- AI, Digital Health, Data Science - II
November 03, 2023 | Location: Exhibit Hall, Pennsylvania Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Augmented Intelligence, Digital Health, and Data Science
- 300 Augmented Intelligence, Digital Health, and Data Science
Authors
- Takkavatakarn, Kullaya, Icahn School of Medicine at Mount Sinai, New York, New York, United States
- Dai, Yang, Icahn School of Medicine at Mount Sinai, New York, New York, United States
- Wen, Huei Hsun, Icahn School of Medicine at Mount Sinai, New York, New York, United States
- Nadkarni, Girish N., Icahn School of Medicine at Mount Sinai, New York, New York, United States
- Chan, Lili, Icahn School of Medicine at Mount Sinai, New York, New York, United States
Background
Social determinants of health (SDOH) are associated with various health outcomes. Area-level SDOH measures based on patients' zip codes or census tracts are commonly used in research in place of individual-level SDOH. Previous work showed that zip code-derived SDOH measures were inaccurate in highly heterogeneous urban neighborhoods. Therefore, we aimed to predict individual-level SDOH using machine learning.
Methods
We used data from ESKD patients ≥ 25 years old enrolled in two studies at Mount Sinai in NY. All patients completed a questionnaire reporting their highest level of education, age, gender, and race/ethnicity. We used data from the American Community Survey to derive zip code-derived education based on each patient’s zip code, gender, and race/ethnicity. We tested several machine-learning algorithms, including Naïve Bayes, decision tree, and random forest (RF). We then developed three multi-class prediction models to predict individual educational attainment. Model 1 used only zip code-derived education. Model 2 added demographic variables and comorbidities to model 1. Model 3 added neighborhood SDOH measures (Gini and dissimilarity indices) to model 2. The cohort was divided into 75/25 training and test sets, and 5-fold cross-validation was employed.
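The 75/25 split and 5-fold cross-validation described above can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not say whether the split was stratified by education level, which random seed was used, or which software performed it. The function names `stratified_split` and `k_folds` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), holding out about test_fraction of each class.

    Stratification is an assumption; the abstract only states a 75/25 split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def k_folds(indices, k=5, seed=0):
    """Shuffle the training indices and deal them into k validation folds."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


# Hypothetical toy labels (not study data):
# 0 = less than high school, 1 = high school, 2 = bachelor's or higher
labels = [0] * 40 + [1] * 24 + [2] * 16

train_idx, test_idx = stratified_split(labels)
folds = k_folds(train_idx)

# Each fold serves once as the validation set; the rest is used for fitting.
for fold in folds:
    fit_idx = [i for i in train_idx if i not in set(fold)]
    print(len(fit_idx), len(fold))
```

The held-out test indices are never touched during cross-validation, so test-set accuracy and AUROC reflect data the model has not seen during tuning.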
Results
A total of 603 ESKD patients were identified. The mean age was 58 ± 12 years; 55% of patients attained less than high school, 32% completed high school, and 13% had a bachelor’s degree or higher. Zip code-derived education matched patients' actual education in only 31% of cases. The RF model had the best overall performance. Using RF, model 1 improved accuracy to 51% with an AUROC of 0.62 (95% CI 0.53 to 0.67). Model 3 demonstrated the highest accuracy (59%) and AUROC (0.71, 95% CI 0.63 to 0.77) (Figure 1).
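For readers unfamiliar with the two metrics reported here, the following sketch shows one way accuracy and a multi-class AUROC can be computed. The abstract does not state how the multi-class AUROC was aggregated; the macro-averaged one-vs-rest scheme below is an assumption, and the labels and probabilities are made-up toy values, not study data. All function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auroc(is_positive, scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auroc(y_true, proba, classes):
    """Average the one-vs-rest AUROC over classes (assumed aggregation scheme)."""
    per_class = [
        binary_auroc([t == c for t in y_true], [row[k] for row in proba])
        for k, c in enumerate(classes)
    ]
    return sum(per_class) / len(per_class)


# Toy example: 0 = less than high school, 1 = high school, 2 = bachelor's or higher
classes = [0, 1, 2]
y_true = [0, 0, 1, 1, 2, 2]
proba = [  # hypothetical predicted class probabilities, one row per patient
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.3, 0.6, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]
# Predicted class = the class with the highest probability in each row
y_pred = [max(range(len(row)), key=row.__getitem__) for row in proba]

acc = accuracy(y_true, y_pred)
auroc = macro_ovr_auroc(y_true, proba, classes)
print(f"accuracy={acc:.3f}, macro OvR AUROC={auroc:.3f}")
```

Accuracy depends only on the top-ranked class, whereas AUROC uses the full predicted probabilities, which is why the two metrics can move by different amounts between models.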
Conclusion
Combining zip code-derived educational attainment with demographic data and neighborhood SDOH measures can improve the prediction of individual education in ESKD patients. More accurate individual-level estimates may in turn improve the performance of models that incorporate SDOH as a feature.
Funding
- NIDDK Support