# FFC: a scalable FASTA compressor

**Authors:** Szymon Grabowski, Tomasz M Kowalski, Robert Susik

PMC · DOI: 10.1093/bioinformatics/btag132 · Bioinformatics · 2026-03-23

## TL;DR

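FFC is a scalable FASTA compressor that, on a benchmark of seven single genomes, compresses on average 4.7× faster than zstd and 11.4× faster than NAF. pzstd almost matches its speed, but with a compression ratio that is on average 23% worse.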

## Contribution

FFC is a scalable FASTA compressor that targets practical compression and decompression speed, whereas existing FASTA compressors usually focus on slightly improving the compression ratio.

## Key findings

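- On average across seven single genomes, FFC compresses 4.7× faster than zstd and 11.4× faster than NAF.
- FFC decompresses on average 3.5× faster than zstd and 2.7× faster than NAF.
- pzstd, a chunk-based zstd variant with parallel decompression, almost matches FFC's speed, but its compression ratio is on average 23% worse.
- The experiments used a 14-core workstation and a RAM disk to reduce the impact of I/O.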

## Abstract

FASTA is a widely used text-based format for storing nucleotide and protein sequences. Existing FASTA compressors usually focus on (slightly) improving the compression ratio, not on practical performance. We present FFC, a scalable FASTA compressor that achieves average compression speeds 4.7× and 11.4× higher than two high-performance compressors, zstd and NAF, respectively, across a benchmark set of seven single genomes. It also delivers average decompression speeds 3.5× and 2.7× higher than zstd and NAF, respectively. Although pzstd, a chunk-based zstd variant with parallel decompression, almost matches FFC's speed, its compression ratio is on average 23% worse than FFC's. The experiments were run on a 14-core workstation with a RAM disk (to reduce the impact of I/O).

FFC is freely available at github.com/kowallus/ffc and as a Zenodo repository at 10.5281/zenodo.18892353. The datasets used are available at 10.5281/zenodo.18873744.
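The trade-off the abstract describes for pzstd (chunk-based compression that allows parallel decompression, at some cost in compression ratio) can be illustrated with a minimal sketch. This is not FFC's or pzstd's implementation: it uses Python's standard `zlib` module as a stand-in codec, and the toy FASTA data, `CHUNK_SIZE`, and all function names are hypothetical choices made only for illustration.

```python
import random
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 16 * 1024  # hypothetical chunk size, not taken from the paper


def make_toy_fasta(n_lines: int = 2000, width: int = 60) -> bytes:
    """Build a small synthetic FASTA record (toy data, not a real genome)."""
    rng = random.Random(0)  # fixed seed so the example is reproducible
    lines = [">toy_sequence"]
    for _ in range(n_lines):
        lines.append("".join(rng.choices("ACGT", k=width)))
    return ("\n".join(lines) + "\n").encode("ascii")


def compress_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Compress each fixed-size chunk independently of the others."""
    return [
        zlib.compress(data[i:i + chunk_size], 6)
        for i in range(0, len(data), chunk_size)
    ]


def decompress_chunks(chunks: list[bytes], workers: int = 4) -> bytes:
    """Decompress chunks concurrently; map() keeps the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, chunks))


data = make_toy_fasta()
whole = zlib.compress(data, 6)             # single stream over the whole input
chunks = compress_chunks(data, CHUNK_SIZE)  # independent streams per chunk
restored = decompress_chunks(chunks)

chunked_size = sum(len(c) for c in chunks)
print(f"input bytes:          {len(data)}")
print(f"ratio, single stream: {len(data) / len(whole):.3f}")
print(f"ratio, chunked:       {len(data) / chunked_size:.3f}")
```

Because every chunk is compressed on its own, no chunk can reference data from another, and each one carries its own stream overhead. That independence is what makes parallel decompression possible, and it is also one plausible reason a chunked scheme can give up some compression ratio. The numbers printed here say nothing about the ratios or speeds reported in the paper.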

## Full-text entities

- **Genes:** CXCL8 (C-X-C motif chemokine ligand 8) [NCBI Gene 3576] {aka GCP-1, GCP1, IL8, LECT, LUCT, LYNAP}
- **Diseases:** influenza (MESH:D007251), COVID-19 (MESH:D000086382)
- **Chemicals:** RAM- (MESH:C071315), FASTA (-), amino acid (MESH:D000596)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090], Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13038249/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13038249/full.md

## References

10 references — full list in the complete paper: https://tomesphere.com/paper/PMC13038249/full.md

---
Source: https://tomesphere.com/paper/PMC13038249