BioAgents: Democratizing Bioinformatics Analysis with Multi-Agent Systems

Nikita Mehandru; Amanda K. Hall; Olesya Melnichenko; Yulia Dubinina; Daniel Tsirulnikov; David Bamman; Ahmed Alaa; Scott Saponas; Venkat S. Malladi

arXiv:2501.06314·cs.AI·January 14, 2025

TL;DR

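BioAgents is a multi-agent system built on small language models that are fine-tuned on bioinformatics data and enhanced with retrieval-augmented generation. It runs locally, can be personalized with proprietary data, and performs comparably to human experts on conceptual genomics tasks, addressing the cost and guidance limitations of large language models.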

Contribution

The paper introduces BioAgents, a novel multi-agent system using small, fine-tuned language models with retrieval augmentation for bioinformatics workflows.

Findings

01 Performance comparable to human experts on conceptual genomics tasks

02 Enables local operation and personalization with proprietary data

03 Addresses computational and expertise barriers in bioinformatics

Abstract

Creating end-to-end bioinformatics workflows requires diverse domain expertise, which poses challenges for both junior and senior researchers as it demands a deep understanding of both genomics concepts and computational techniques. While large language models (LLMs) provide some assistance, they often fall short in providing the nuanced guidance needed to execute complex bioinformatics tasks, and require expensive computing resources to achieve high performance. We thus propose a multi-agent system built on small language models, fine-tuned on bioinformatics data, and enhanced with retrieval augmented generation (RAG). Our system, BioAgents, enables local operation and personalization using proprietary data. We observe performance comparable to human experts on conceptual genomics tasks, and suggest next steps to enhance code generation capabilities.
