Reevaluating Assembly Evaluations with Feature Response Curves: GAGE and Assemblathons
Francesco Vezzi, Giuseppe Narzisi, Bud Mishra

TL;DR
This paper introduces an improved evaluation method for genome assemblies using feature response curves, enabling assessment without a reference genome, thus broadening the scope of assembly quality analysis.
Contribution
It extends the FRCurve approach to work with obscured layout information, allowing evaluation of a wider range of assemblers without needing a reference genome.
Findings
FRCbam, the tool implementing the extended FRCurve approach, evaluates assemblies without requiring a reference genome.
The extended FRCurve approach applies to de Bruijn graph-based algorithms.
Reevaluation of assembly competitions highlights the method's utility.
Abstract
In just the last decade, a multitude of bio-technologies and software pipelines have emerged to revolutionize genomics. To further their central goal, they aim to accelerate and improve the quality of de novo whole-genome assembly starting from short DNA reads. However, the performance of each of these tools is contingent on the length and quality of the sequencing data, the structure and complexity of the genome sequence, and the resolution and quality of long-range information. Furthermore, in the absence of any metric that captures the most fundamental "features" of a high-quality assembly, there is no obvious recipe for users to select the most desirable assembler/assembly. International competitions such as Assemblathons or GAGE tried to identify the best assembler(s) and their features. Somewhat circuitously, the only available approach to gauge de novo assemblies and assemblers…
