Automated Bioinformatics Analysis via AutoBA
Juexiao Zhou, Bin Zhang, Xiuying Chen, Haoyang Li, Xiaopeng Xu, Siyuan Chen, Xin Gao

TL;DR
AutoBA is an autonomous AI tool based on large language models that simplifies complex omics data analysis and adapts it to the input data, requiring minimal user input while remaining robust, versatile, and protective of data privacy.
Contribution
This paper introduces AutoBA, a novel AI agent that autonomously designs and executes bioinformatics analysis pipelines tailored to diverse omics data types.
Findings
Validated across multiple omics data types including WGS, RNA-seq, and spatial transcriptomics.
Demonstrates robustness and adaptability confirmed by expert bioinformaticians.
Operates locally to ensure data privacy and integrates emerging bioinformatics tools.
Abstract
With the fast-growing and evolving omics data, the demand for streamlined and adaptable tools to handle the analysis continues to grow. In response to this need, we introduce Auto Bioinformatics Analysis (AutoBA), an autonomous AI agent based on a large language model designed explicitly for conventional omics data analysis. AutoBA simplifies the analytical process by requiring minimal user input while delivering detailed step-by-step plans for various bioinformatics tasks. Through rigorous validation by expert bioinformaticians, AutoBA's robustness and adaptability are affirmed across a diverse range of omics analysis cases, including whole genome sequencing (WGS), RNA sequencing (RNA-seq), single-cell RNA-seq, ChIP-seq, and spatial transcriptomics. AutoBA's unique capacity to self-design analysis processes based on input data variations further underscores its versatility. Compared…
Taxonomy
Topics: Gene expression and cancer classification · Scientific Computing and Data Management · Cancer Genomics and Diagnostics
