Abstract: SA-PO011
Natural Language Processing for Extracting Kidney Biopsy Pathology Diagnoses: The Houston Methodist Hospital Kidney Biopsy Registry
Session Information
- Augmented Intelligence, Large Language Models, and Digital Health
October 26, 2024 | Location: Exhibit Hall, Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Augmented Intelligence, Digital Health, and Data Science
- 300 Augmented Intelligence, Digital Health, and Data Science
Authors
- Bobart, Shane A., Houston Methodist Hospital, Houston, Texas, United States
- Hsu, Enshuo, Houston Methodist Hospital, Houston, Texas, United States
- Truong, Luan D., Houston Methodist Hospital, Houston, Texas, United States
- Waterman, Amy D., Houston Methodist Hospital, Houston, Texas, United States
- Jones, Stephen L., Houston Methodist Hospital, Houston, Texas, United States
- Shafi, Tariq, Houston Methodist Hospital, Houston, Texas, United States
Background
Kidney biopsy reports provide a detailed description of kidney pathology, but the diagnosis is not captured as searchable, discrete data in the electronic health record (EHR), requiring labor-intensive manual review and abstraction. We sought to use natural language processing (NLP) to extract kidney biopsy pathology diagnoses as an initial step toward creating an automatically updated Houston Methodist Hospital Kidney Biopsy Registry (HM-KBR).
Methods
We identified 3,087 native kidney biopsies (2,700 patients) from June 2016 to December 2023. We extracted 1,000 native kidney biopsy reports in PDF format from the Epic EHR. A domain expert (SAB) manually annotated the primary diagnosis in the 1,000 reports, and a renal pathologist (LT) validated 20% (n=200). We converted the PDFs into machine-readable free text using SQL Server (database management software) and Python (programming language). We split the biopsy reports into a training set (80%) and a test set (20%), used the bidirectional encoder representations from transformers (BERT) NLP model to extract primary diagnoses, and evaluated model performance on the held-out test set (20%).
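The 80/20 split described above can be sketched in a few lines of Python. This is a hypothetical illustration only: the report identifiers, seed, and helper name are invented, and the registry's actual pipeline is not public.

```python
import random

def train_test_split(report_ids, test_fraction=0.2, seed=42):
    """Reproducibly shuffle report IDs and split into (train, test) lists."""
    ids = list(report_ids)
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

# 1,000 hypothetical report IDs, mirroring the abstract's report count
reports = [f"report_{i:04d}" for i in range(1000)]
train, test = train_test_split(reports)
print(len(train), len(test))  # 800 200
```

A fixed random seed keeps the training and test sets stable across runs, which matters when model results must be reproduced for validation.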
The evaluation metrics included precision (the proportion of extracted diagnoses that were correct), recall (the proportion of true diagnoses that the model extracted), the F1 score (the harmonic mean of precision and recall, ranging from 0 to 1, with 1 indicating perfect performance), and the area under the receiver operating characteristic curve (AUROC; overall discrimination, 1.0 is best). Because of the limited size of the training set, we restricted the evaluation metrics to diagnosis types present in at least 20 reports.
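For a single diagnosis class, these metrics follow directly from the counts of true positives (TP), false positives (FP), and false negatives (FN). A minimal sketch, with counts invented purely for illustration (they are not the study's actual results):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN); F1 = their harmonic mean."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Invented counts for one hypothetical diagnosis class
p, r, f1 = precision_recall_f1(tp=40, fp=20, fn=21)
```

In a multi-class setting like diagnosis extraction, per-class scores are typically averaged (e.g., macro- or micro-averaged) into a single F1.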
Results
The median age was 57 years; 50% of patients were female, 28% Black, and 23% Hispanic. Agreement between the two reviewers in the validation sample, assessed by Cohen's kappa statistic, was 0.76, indicating substantial agreement. The NLP-extracted diagnoses showed an F1 score of 0.66 and an AUROC of 0.93 (Table 1).
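Cohen's kappa compares observed inter-rater agreement to the agreement expected by chance from each rater's label frequencies. A self-contained sketch, using invented toy label pairs (not the study's actual annotations):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is chance agreement from the raters' marginal label frequencies."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy example: two raters labeling 10 hypothetical biopsy reports
a = ["IgA", "IgA", "FSGS", "DN", "FSGS", "IgA", "DN", "DN", "FSGS", "IgA"]
b = ["IgA", "FSGS", "FSGS", "DN", "FSGS", "IgA", "DN", "IgA", "FSGS", "IgA"]
kappa = cohens_kappa(a, b)
```

Unlike raw percent agreement, kappa discounts matches two raters would produce by chance alone, which is why it is the standard check for annotation quality.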
Conclusion
Our preliminary data show that an NLP-based model can accurately and scalably extract the primary diagnosis from free-text kidney biopsy pathology reports.