CurateGPT: A flexible language-model assisted biocuration tool
Harry Caufield, Carlo Kroll, Shawn T O'Neil, Justin T Reese, Marcin P Joachimiak, Harshad Hegde, Nomi L Harris, Madan Krishnamurthy, James A McLaughlin, Damian Smedley, Melissa A Haendel, Peter N Robinson, Christopher J Mungall

TL;DR
CurateGPT is a novel AI-assisted biocuration tool that combines generative language models with knowledge bases to improve the efficiency, accuracy, and scalability of biomedical data curation workflows.
Contribution
CurateGPT introduces a flexible agent-based system that integrates LLMs with external sources, enhancing biocuration processes beyond traditional methods.
Findings
Increases curation efficiency and accuracy.
Provides direct links to supporting data.
Enables scaling of curation efforts.
Abstract
Effective data-driven biomedical discovery requires data curation: a time-consuming process of finding, organizing, distilling, integrating, interpreting, annotating, and validating diverse information into a structured form suitable for databases and knowledge bases. Accurate and efficient curation of these digital assets is critical to ensuring that they are FAIR, trustworthy, and sustainable. Unfortunately, expert curators face significant time and resource constraints. The rapid pace of new information being published daily is exceeding their capacity for curation. Generative AI, exemplified by instruction-tuned large language models (LLMs), has opened up new possibilities for assisting human-driven curation. The design philosophy of agents combines the emerging abilities of generative AI with more precise methods. A curator's tasks can be aided by agents for performing reasoning,…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Artificial Intelligence in Healthcare and Education
