Kidney Week

Abstract: TH-PO031

StarFunc: Accurate Protein Function Prediction Reveals Novel Human Proteins Involved in Ubiquitination

Session Information

Category: Augmented Intelligence, Digital Health, and Data Science

  • 300 Augmented Intelligence, Digital Health, and Data Science

Authors

  • Zhang, Chengxin, University of Michigan, Ann Arbor, Michigan, United States
  • Freddolino, Lydia, University of Michigan, Ann Arbor, Michigan, United States

Background

Even in the very well-studied human proteome, many proteins remain poorly annotated, yet may still make important contributions to health and disease. Deep learning has significantly advanced the development of novel methods for protein function prediction. However, even for state-of-the-art deep learning approaches, template information remains an indispensable component. While many prediction methods use templates identified through sequence homology or protein-protein interactions, very few detect templates through structural similarity, even though a protein's structure is the basis of its function.

Methods

In this work, we developed StarFunc, a composite approach that integrates state-of-the-art deep learning models seamlessly with template information from structural similarity, sequence homology, protein-protein interaction partners, and protein domain families (Fig. 1).
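The abstract does not say how the component predictions are merged. As a rough illustration only, the sketch below assumes each component emits a confidence score per Gene Ontology (GO) term and combines them with a weighted average. The function name `combine_scores`, the component names, and the weights are all hypothetical and are not taken from StarFunc.

```python
from typing import Dict

# GO term -> confidence score in [0, 1]
Scores = Dict[str, float]


def combine_scores(component_scores: Dict[str, Scores],
                   weights: Dict[str, float]) -> Scores:
    """Weighted average of per-component confidence scores for each GO term.

    A component that does not predict a term contributes a score of 0.
    This is an illustrative consensus rule, not the published method.
    """
    total_weight = sum(weights[name] for name in component_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    # Collect every term predicted by at least one component.
    terms = set()
    for scores in component_scores.values():
        terms.update(scores)

    combined: Scores = {}
    for term in terms:
        weighted_sum = sum(
            weights[name] * scores.get(term, 0.0)
            for name, scores in component_scores.items()
        )
        combined[term] = weighted_sum / total_weight
    return combined


if __name__ == "__main__":
    # Hypothetical scores from two template sources.
    example = {
        "structure": {"GO:0016567": 0.9},
        "sequence": {"GO:0016567": 0.6, "GO:0005515": 0.4},
    }
    print(combine_scores(example, {"structure": 2.0, "sequence": 1.0}))
```

A weighted average is only one of many possible consensus rules; a learned combiner would fill the same role.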

Results

We compared the accuracy of StarFunc against 6 existing deep learning methods and 3 template-based methods on 2475 proteins. The weighted F-measure of StarFunc is 12% higher than that of the second-best approach. StarFunc participated in the Critical Assessment of Function Annotation 5 (CAFA5) challenge and was ranked 5th among 1625 teams from 96 countries. We applied StarFunc to all 20389 proteins from the human reference proteome curated by neXtProt and identified significant enrichments of several important functions among the set of currently cryptic human proteins. For example, we discovered 15 uncharacterized proteins that are likely components of protein-ubiquitin transferase complexes and 1 putative deubiquitinase.
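The abstract does not define the weighted F-measure. The sketch below assumes the CAFA-style definition: each GO term is weighted by its information content (IC), precision is averaged over proteins with at least one prediction above the score threshold, recall is averaged over all benchmark proteins, and the reported value is the maximum F-measure across thresholds. The function name and input layout are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def weighted_fmax(predictions: Dict[str, Dict[str, float]],
                  truth: Dict[str, Set[str]],
                  ic: Dict[str, float],
                  thresholds: Optional[Iterable[float]] = None) -> float:
    """Maximum IC-weighted F-measure over score thresholds.

    predictions: protein -> {GO term -> confidence score}
    truth:       protein -> set of experimentally annotated GO terms
    ic:          GO term -> information content (terms not listed weigh 0)
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]

    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            overlap = sum(ic.get(term, 0.0) for term in predicted & true_terms)
            pred_weight = sum(ic.get(term, 0.0) for term in predicted)
            true_weight = sum(ic.get(term, 0.0) for term in true_terms)
            # Precision only counts proteins with a prediction at this threshold.
            if predicted:
                precisions.append(overlap / pred_weight if pred_weight > 0 else 0.0)
            # Recall is averaged over every benchmark protein.
            if true_weight > 0:
                recall_sum += overlap / true_weight
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Under this definition, correctly predicting a rare, specific term counts for more than correctly predicting a common, general one.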

Conclusion

A large-scale benchmark demonstrates StarFunc's advantage over both deep learning methods and conventional template-based predictors. Applying StarFunc to the human proteome reveals novel functions of previously uncharacterized proteins, especially those involved in (de)ubiquitination, providing an entry point for studying fundamental new biology involving those proteins.

Fig. 1. Flowchart of StarFunc.

Funding

  • Other NIH Support