Kidney Week

Abstract: FR-PO116

Revolutionizing AKI and Critical Care Nephrology Education: Evaluating ChatGPT's Accuracy on Core Questions

Session Information

  • AKI: Outcomes, RRT
    November 03, 2023 | Location: Exhibit Hall, Pennsylvania Convention Center
    Abstract Time: 10:00 AM - 12:00 PM

Category: Acute Kidney Injury

  • 102 AKI: Clinical, Outcomes, and Trials

Authors

  • Sheikh, M. Salman, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Kashani, Kianoush, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Thongprayoon, Charat, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Qureshi, Fawad, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Domecq Garces, Juan Pablo, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Craici, Iasmina, Mayo Clinic Minnesota, Rochester, Minnesota, United States
  • Cheungpasitporn, Wisit, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Background

ChatGPT is a state-of-the-art language model with exceptional proficiency in various natural language processing tasks, including generating responses that closely mimic human-generated ones. While there is growing speculation about ChatGPT's potential to serve as a substitute for physicians in clinical settings, its proficiency in nephrology, including acute kidney injury and critical care nephrology, remains uncertain. This study aims to evaluate the performance of ChatGPT in answering core questions related to acute kidney injury and critical care nephrology.

Methods

ChatGPT's accuracy was evaluated on acute kidney injury and critical care nephrology questions drawn from the Nephrology Self-Assessment Program (NephSAP) and the Kidney Self-Assessment Program (KSAP) of the American Society of Nephrology. Questions containing images were excluded because of current limitations in ChatGPT's image processing capabilities. One hundred ten questions were included in the evaluation, 45 from NephSAP and 55 from KSAP. Each question bank was run through ChatGPT twice, two weeks apart, and the concordance between the initial and subsequent runs was also examined.
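The two measures described above, per-run accuracy and run-to-run concordance, can be sketched as follows. This is a minimal illustration, not the authors' scoring code: the function names and the five-question toy data are hypothetical, and concordance is assumed to mean the fraction of questions that received the same response in both runs.

```python
from typing import Sequence


def accuracy(responses: Sequence[str], answer_key: Sequence[str]) -> float:
    """Fraction of responses that match the answer key."""
    correct = sum(r == k for r, k in zip(responses, answer_key))
    return correct / len(answer_key)


def concordance(run_a: Sequence[str], run_b: Sequence[str]) -> float:
    """Fraction of questions answered identically in both runs,
    regardless of whether the shared answer is correct."""
    same = sum(a == b for a, b in zip(run_a, run_b))
    return same / len(run_a)


# Hypothetical toy data (not taken from NephSAP or KSAP).
answer_key = ["A", "C", "B", "D", "A"]
first_run = ["A", "B", "B", "D", "C"]
second_run = ["A", "C", "B", "A", "C"]

first_accuracy = accuracy(first_run, answer_key)
second_accuracy = accuracy(second_run, answer_key)
run_concordance = concordance(first_run, second_run)

print(f"accuracy, initial run:    {first_accuracy:.0%}")
print(f"accuracy, subsequent run: {second_accuracy:.0%}")
print(f"concordance between runs: {run_concordance:.0%}")
```

Note that concordance and accuracy are independent: two runs can agree on a wrong answer, so high concordance alone does not imply high accuracy.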

Results

On NephSAP questions, ChatGPT achieved accuracies of 55% and 69% on the initial and subsequent runs, respectively. On KSAP questions, it achieved accuracies of 46% and 40%, respectively. ChatGPT's accuracy on all 110 questions combined was 52% and 51% for the initial and subsequent runs, respectively. The overall concordance between the initial and subsequent runs was 78%, with 86 questions (78%) receiving the same response and 24 (22%) receiving different responses. Correct concordance was 57%, and incorrect concordance was 43%. Among the 24 questions with divergent responses, ChatGPT changed 11 responses from incorrect to correct. Conversely, it changed 5 responses from correct to incorrect.
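The concordance percentages can be checked directly from the reported counts. The short sketch below uses only the figures stated in the Results; the variable names are illustrative. The abstract does not describe the divergent responses that moved in neither direction, so the final comment is an inference that assumes each question has a single correct option.

```python
# Counts as reported in the Results.
total_questions = 110
same_response = 86
different_response = 24

# Concordant and divergent questions should account for every question.
assert same_response + different_response == total_questions

concordant_pct = round(100 * same_response / total_questions)
divergent_pct = round(100 * different_response / total_questions)

# Reported changes among the divergent responses.
incorrect_to_correct = 11
correct_to_incorrect = 5

# Divergent responses not covered by either reported change.
# Assumption: with one correct option per question, two different
# responses cannot both be correct, so these would be incorrect in both runs.
unaccounted_divergent = different_response - incorrect_to_correct - correct_to_incorrect

print(concordant_pct, divergent_pct, unaccounted_divergent)
```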

Conclusion

Our study shows that ChatGPT responded correctly to only about half of the questions related to acute kidney injury and critical care nephrology, and did so with low reliability between runs. Therefore, ChatGPT may not be a precise or reliable educational tool in its current form, and further development may be necessary to improve its performance.