# Alpseq: an open-source workflow to turbocharge nanobody discovery with high-throughput sequencing

**Authors:** Kathleen Zeglinski, Jakob Schuster, Jaison D Sa, Amy Adair, Jing Deng, Phillip Pymm, Matthew E. Ritchie, Rory Bowden, Wai-Hong Tham, Quentin Gouil

PMC · DOI: 10.1080/19420862.2026.2623326 · mAbs · 2026-02-03

## TL;DR

Alpseq is an open-source pipeline that simplifies and accelerates the analysis of high-throughput sequencing data from nanobody libraries and panning campaigns.

## Contribution

Alpseq introduces an optimized, user-friendly pipeline for processing and analyzing nanobody NGS data, paired with a PCR-free sequencing library preparation protocol that avoids amplification biases.

## Key findings

- Alpseq provides a two-part pipeline for efficient processing and analysis of nanobody sequencing data.
- The tool supports sophisticated panning designs and generates lead candidates for experimental validation.
- Alpseq includes a pre-processing module and an analysis module with quality control and clustering functions.

## Abstract

Nanobodies have emerged as promising tools for many biotechnological applications due to their small size, high stability and remarkable binding specificity. Next-Generation Sequencing (NGS) enables deep profiling of large nanobody libraries and panning campaigns; however, the scale and diversity of nanobody NGS datasets present a significant bioinformatic challenge. To this end, we have developed alpseq, an optimized, open-source software pipeline designed specifically for the efficient and accurate processing of NGS data from nanobody libraries and panning campaigns. alpseq is also paired with a PCR-free sequencing library preparation protocol to allow researchers to easily generate their own data while avoiding biases. The alpseq software pipeline is composed of two parts. A pre-processing module written in Nextflow efficiently handles raw nanobody reads in a single line of code. Its results are then fed into the analysis module, which contains a comprehensive suite of functions for quality control, diversity analysis, identification of enriched sequences and clustering. alpseq also creates a user-friendly interactive report which empowers scientists to explore their data without the need for extensive bioinformatic experience. Sophisticated panning campaign designs are supported, such as replicates and comparisons between different pans to find cross-binding leads. alpseq thus generates insights into the nanobody selection process and delivers a list of lead candidates for further experimental validation and downstream applications. alpseq is available at https://github.com/kzeglinski/alpseq.
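
To make "identification of enriched sequences" concrete, the following is a minimal Python sketch of one common way to score enrichment between two panning rounds: normalize raw counts to counts per million (CPM), then take a log2 ratio with a pseudocount. It is an illustration under stated assumptions, not alpseq's implementation. The function names, the pseudocount default and the example CDR3 strings are all hypothetical and do not come from the paper or its repository.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so they sum to one million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize an empty sample")
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


def log2_enrichment(pre_counts, post_counts, pseudocount=1.0):
    """Log2 ratio of post-pan CPM to pre-pan CPM for every sequence.

    The pseudocount (an assumption of this sketch) keeps the ratio
    finite for sequences missing from one of the two samples.
    """
    pre_cpm = counts_per_million(pre_counts)
    post_cpm = counts_per_million(post_counts)
    sequences = set(pre_cpm) | set(post_cpm)
    return {
        seq: math.log2(
            (post_cpm.get(seq, 0.0) + pseudocount)
            / (pre_cpm.get(seq, 0.0) + pseudocount)
        )
        for seq in sequences
    }


# Hypothetical CDR3 counts before and after one round of panning.
pre_pan = {"CAAGYSTW": 10, "CNVRGDYW": 10}
post_pan = {"CAAGYSTW": 30, "CNVRGDYW": 10}

scores = log2_enrichment(pre_pan, post_pan)
# Rank candidates from most to least enriched.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, the first sequence rises from half to three quarters of the reads and so receives a positive score, while the second receives a negative one. A real analysis would also need to account for replicates and sequencing depth, which this sketch ignores.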

## Full-text entities

- **Diseases:** CPM (MESH:D009845)
- **Chemicals:** BLOSUM62 (-), amino acids (MESH:D000596)
- **Species:** Vicugna pacos (alpaca, species) [taxon 30538], Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Lama glama (llama, species) [taxon 9844]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12885427/full.md

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12885427/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC12885427/full.md

---
Source: https://tomesphere.com/paper/PMC12885427