ASN's Mission

To create a world without kidney diseases, the ASN Alliance for Kidney Health elevates care by educating and informing, driving breakthroughs and innovation, and advocating for policies that create transformative changes in kidney medicine throughout the world.

Kidney Week

Abstract: SA-PO005

Data Preprocessing: A Key Factor in Large Language Models' Performance in Critical Care Nephrology

Session Information

Augmented Intelligence, Large Language Models, and Digital Health
October 26, 2024 | Location: Exhibit Hall, Convention Center
Abstract Time: 10:00 AM - 12:00 PM

Category: Augmented Intelligence, Digital Health, and Data Science

300 Augmented Intelligence, Digital Health, and Data Science

Authors

Sheikh, M. Salman, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Thongprayoon, Charat, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Qureshi, Fawad, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Miao, Jing, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Craici, Iasmina, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Kashani, Kianoush, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Cheungpasitporn, Wisit, Mayo Clinic Minnesota, Rochester, Minnesota, United States

Background

In clinical evaluations, data from outside sources often arrive as tables. These tables can be passed to multimodal large language models (LLMs) either as images or after being reformatted into text. LLMs are increasingly used in medical education and decision-making, but their accuracy in interpreting complex clinical data, particularly data presented as images rather than reformatted text, remains a concern. This study evaluates the impact of data formatting on the performance of ChatGPT-4 and Claude 3 Opus in answering critical care nephrology questions from the Kidney Self-Assessment Program (KSAP).
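The reformatting step mentioned above can be sketched as follows. This is only an illustrative sketch: the abstract does not specify the text layout that was used, so the pipe-delimited format, the helper name `table_to_text`, and the laboratory values are all assumptions made for the example.

```python
def table_to_text(headers, rows):
    """Render a table as pipe-delimited plain text, one row per line.

    The layout is an assumption for illustration; the abstract does not
    say which plain-text layout was used.
    """
    lines = [" | ".join(headers)]
    for row in rows:
        lines.append(" | ".join(str(cell) for cell in row))
    return "\n".join(lines)


# Hypothetical laboratory table (illustrative values, not taken from KSAP).
headers = ["Test", "Value", "Unit"]
rows = [
    ["Sodium", 128, "mEq/L"],
    ["Creatinine", 2.4, "mg/dL"],
]

table_text = table_to_text(headers, rows)

# The text table is placed directly in the prompt instead of a screenshot.
prompt = "Use the following laboratory data to answer the question.\n" + table_text
print(prompt)
```

The point of the sketch is only that every cell reaches the model as characters rather than pixels, so no value depends on the model reading an image correctly.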

Methods

Fifty-six AKI and critical care nephrology questions from KSAP were reviewed; the analysis focused on the 46 questions that included tables with pertinent information such as laboratory values and diagnostic results. The tables were first entered in image-encoded format (screenshots), and the models' responses were recorded. The tables were then reformatted into pure-text format, and the models were reassessed on the same questions. The McNemar test was used to assess the statistical significance of the improvement in accuracy, and Cohen's kappa was used to evaluate the agreement between each model's pre-formatting and post-formatting answers.
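The two statistics named above can be sketched in a few lines of standard-library Python. This is a minimal sketch under stated assumptions: the abstract does not say whether the exact or the chi-square form of the McNemar test was used (the exact binomial form is shown here), the function names are hypothetical, and the six-question data set is invented purely to exercise the code.

```python
from collections import Counter
from math import comb


def mcnemar_exact(before_correct, after_correct):
    """Two-sided exact McNemar test on paired correct/incorrect outcomes.

    Only discordant pairs matter: b = wrong before and right after,
    c = right before and wrong after.
    """
    pairs = list(zip(before_correct, after_correct))
    b = sum(1 for x, y in pairs if not x and y)
    c = sum(1 for x, y in pairs if x and not y)
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Binomial(n, 0.5) tail probability up to the smaller discordant count.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented toy data for six questions (not the study data).
before = [True, False, False, False, True, False]   # correct with image tables?
after = [True, True, True, True, True, False]       # correct with text tables?
pre_answers = ["A", "B", "C", "D", "A", "B"]        # chosen option, image tables
post_answers = ["A", "C", "C", "D", "B", "B"]       # chosen option, text tables

b, c, p_value = mcnemar_exact(before, after)
kappa = cohens_kappa(pre_answers, post_answers)
print(f"discordant pairs: b={b}, c={c}, exact p={p_value:.3f}")
print(f"Cohen's kappa: {kappa:.3f}")
```

In this sketch the McNemar test is applied to per-question correctness, while kappa is applied to the answer options chosen before and after reformatting; the abstract does not state which representation was used for the kappa calculation.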

Results

In the initial run, with tables in image-encoded format, ChatGPT-4 and Claude 3 Opus achieved accuracies of 50% and 43.5%, respectively. After the tables were reformatted into pure-text format, the accuracies improved significantly to 73.9% and 60.87%, respectively (p<0.001). Cohen's kappa for the agreement between pre-formatting and post-formatting answers was approximately 0.141 for ChatGPT-4 and, after aligning the data for the same set of questions, 0.350 for Claude 3 Opus.
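As a quick arithmetic check, the reported percentages are consistent with whole-number counts out of 46 questions. The counts below are inferred from the percentages and are not reported in the abstract; the sketch assumes that all 46 table-based questions were scored for both models in both runs.

```python
TOTAL_QUESTIONS = 46  # table-based questions, per the Methods section

# Inferred correct-answer counts (assumption: not stated in the abstract).
inferred_correct = {
    ("ChatGPT-4", "image"): 23,
    ("Claude 3 Opus", "image"): 20,
    ("ChatGPT-4", "text"): 34,
    ("Claude 3 Opus", "text"): 28,
}

accuracy_percent = {
    key: 100 * count / TOTAL_QUESTIONS for key, count in inferred_correct.items()
}

for (model, table_format), value in accuracy_percent.items():
    print(f"{model:14s} {table_format:5s} {value:.2f}%")
```

Under that assumption, the move from image-encoded to text tables corresponds to 11 additional correct answers for ChatGPT-4 and 8 for Claude 3 Opus.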

Conclusion

Data formatting significantly affects the performance of LLMs in interpreting complex clinical data in critical care nephrology: reformatting tables from image-encoded to pure-text format improved the accuracy of ChatGPT-4 and Claude 3 Opus in answering KSAP questions. This highlights the importance of data preprocessing in optimizing LLM performance for clinical decision support.

Authors

M. Salman Sheikh, MD

Charat Thongprayoon, MD, FASN

Fawad Qureshi, MD, FASN

Jing Miao, MD, PhD, FASN

Iasmina Craici, MD

Kianoush Kashani, MD, MS, FASN

Wisit Cheungpasitporn, MD, FASN
