AnVILWorkflow: A runnable workflow package for Cloud-implemented bioinformatics analysis pipelines
Sehyun Oh, Kai Gravel-Pucillo, Marcel Ramos, Sean Davis, Vince Carey, Martin Morgan, Levi Waldron

TL;DR
AnVILWorkflow is an R package that simplifies running bioinformatics workflows on the AnVIL cloud platform, making it easier for researchers to analyze large-scale genomic data.
Contribution
The package introduces a user-friendly interface for executing AnVIL workflows from R, enabling seamless integration of R and non-R tools in cloud-based analyses.
Findings
AnVILWorkflow supports bulk RNA-seq, metagenomics, and digital pathology use cases with established tools.
The package reduces the complexity of cloud computing setup and data formatting for non-expert users.
It enables reproducible and scalable analysis pipelines without requiring direct cloud infrastructure management.
Abstract
Advancements in sequencing technologies and the development of new data collection methods produce large volumes of biological data. The Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) provides a cloud-based platform for democratizing access to large-scale genomics data and analysis tools. However, utilizing the full capabilities of AnVIL can be challenging for researchers without extensive bioinformatics expertise, especially for executing complex workflows. Here we present the AnVILWorkflow R package, which enables the convenient execution of bioinformatics workflows hosted on AnVIL directly from an R environment. AnVILWorkflow simplifies the setup of the cloud computing environment, input data formatting, workflow submission, and retrieval of results through intuitive functions. We demonstrate the utility of AnVILWorkflow for three use cases: bulk…
Taxonomy
Topics: Scientific Computing and Data Management · Genetics, Bioinformatics, and Biomedical Research · Research Data Management Practices
