Compendium Manager: a tool for coordination of workflow management instances for bulk data processing in Python
Richard J. Abdill, Ran Blekhman

TL;DR
Compendium Manager is a Python command-line tool designed to automate, monitor, and evaluate large-scale bioinformatics workflows, facilitating bulk data processing and reproducibility across multiple projects.
Contribution
The paper introduces a lightweight tool that launches, monitors, and records metrics for many pipelines at once, supporting workflow management at scale.
Findings
Enables launching and monitoring hundreds or thousands of pipelines
Supports loading results into a shared database for analysis
Records detailed processing metrics for reproducibility
Abstract
Compendium Manager is a command-line tool written in Python to automate the provisioning, launch, and evaluation of bioinformatics pipelines. Although workflow management tools such as Snakemake and Nextflow enable users to automate the processing of samples within a single sequencing project, integrating many datasets in bulk requires launching and monitoring hundreds or thousands of pipelines. We present the Compendium Manager, a lightweight command-line tool to enable launching and monitoring analysis pipelines at scale. The tool can gauge progress through a list of projects, load results into a shared database, and record detailed processing metrics for later evaluation and reproducibility.
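The coordination pattern described in the abstract (work through a list of projects, launch one pipeline per project, record metrics in a shared database, and report progress) can be illustrated with a minimal sketch. This is not the Compendium Manager's actual code or interface: the `runs` table, the function names, and the project identifiers are all hypothetical, and the launched command is a stand-in for whatever workflow manager invocation (for example Snakemake or Nextflow) a real deployment would use.

```python
import sqlite3
import subprocess
import sys
import time


def init_db(conn):
    # Hypothetical schema: one row per project with its latest outcome.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "project TEXT PRIMARY KEY, status TEXT, "
        "exit_code INTEGER, seconds REAL)"
    )


def run_project(conn, project, command):
    """Launch one pipeline command and record its outcome and wall time."""
    start = time.monotonic()
    result = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    status = "done" if result.returncode == 0 else "failed"
    conn.execute(
        "INSERT OR REPLACE INTO runs VALUES (?, ?, ?, ?)",
        (project, status, result.returncode, elapsed),
    )
    conn.commit()
    return status


def progress(conn, projects):
    """Count how many listed projects are done, failed, or not yet run."""
    recorded = dict(conn.execute("SELECT project, status FROM runs"))
    counts = {"done": 0, "failed": 0, "pending": 0}
    for project in projects:
        counts[recorded.get(project, "pending")] += 1
    return counts


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    projects = ["PRJ_A", "PRJ_B", "PRJ_C"]  # hypothetical project IDs
    # Placeholder command: a real setup would call a workflow manager here.
    placeholder = [sys.executable, "-c", "pass"]
    for project in projects[:2]:
        run_project(conn, project, placeholder)
    print(progress(conn, projects))
```

Keeping the status and timing of every run in one table is what makes both progress reports and later evaluation possible from the same record; a production tool would also need to handle concurrency, retries, and loading of the pipeline results themselves, which this sketch omits.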
Taxonomy
Topics: Scientific Computing and Data Management · Genetics, Bioinformatics, and Biomedical Research · Computational Physics and Python Applications
