An Open Framework for Extensible Multi-Stage Bioinformatics Software
Gabriel Keeble-Gagnère, Johan Nyström-Persson, Matthew Bellgard, Kenji Mizuguchi

TL;DR
This paper introduces Friedrich, an extensible bioinformatics framework built on Scala and currently in early development. It aims to combine flexibility with good performance so that the same application can serve both early-stage experimentation and late-stage batch processing, with the potential to improve developer productivity.
Contribution
The paper presents a set of software development principles and a framework, Friedrich, that balance customisability and performance in bioinformatics workflows by building on the multiparadigm language Scala.
Findings
Supports both experimentation and batch processing
Demonstrated with a genome assembler case study
Potential to increase bioinformatics software development productivity
Abstract
In research labs, there is often a need to customise software at every step in a given bioinformatics workflow, but traditionally it has been difficult to obtain both a high degree of customisability and good performance. Performance-sensitive tools are often highly monolithic, which can make research difficult. We present a novel set of software development principles and a bioinformatics framework, Friedrich, which is currently in early development. Friedrich applications support both early-stage experimentation and late-stage batch processing, since they simultaneously allow for good performance and a high degree of flexibility and customisability. These benefits are obtained in large part by basing Friedrich on the multiparadigm programming language Scala. We present a case study in the form of a basic genome assembler and its extension with new functionality. Our architecture has…
Taxonomy
Topics: Scientific Computing and Data Management · Genomics and Phylogenetic Studies · Distributed and Parallel Computing Systems
