The gene function prediction challenge: large language models and knowledge graphs to the rescue
Rohan Shawn Sunil, Shan Chun Lim, Manoj Itharajula, Marek Mutwil

TL;DR
This paper reviews how recent AI advances, including large language models and knowledge graphs, can significantly improve gene function prediction in plant science, addressing current limitations in experimental validation.
Contribution
It highlights the potential of integrating AI techniques like large language models and knowledge graphs to enhance gene function prediction methods.
Findings
AI methods can accelerate gene function annotation.
Knowledge graphs improve data integration for gene analysis.
Large language models assist in literature-based gene function inference.
Abstract
Elucidating gene function is one of the ultimate goals of plant science. Despite this, only ~15% of all genes in the model plant Arabidopsis thaliana have comprehensively experimentally verified functions. While bioinformatic gene function prediction approaches can guide biologists in their experimental efforts, neither the performance of gene function prediction methods nor the number of experimentally characterised genes has increased dramatically in recent years. In this review, we discuss the status quo and the trajectory of gene function elucidation and outline recent advances in gene function prediction approaches. We then discuss how recent artificial intelligence advances in large language models and knowledge graphs can be leveraged to accelerate gene function prediction and help researchers keep up to date with the scientific literature.
Taxonomy
Topics: Bioinformatics and Genomic Networks · Machine Learning in Bioinformatics
