# SqueezeCall: nanopore basecalling using a Squeezeformer network

**Authors:** Zhongxu Zhu

PMC · DOI: 10.46471/gigabyte.148 · 2025-02-14

## TL;DR

SqueezeCall is a Squeezeformer-based basecaller for nanopore sequencing. It translates raw electrical current signals into base sequences and improves basecalling accuracy, in part by resisting sequencing noise.

## Contribution

The paper applies a Squeezeformer network to nanopore basecalling and reports higher accuracy than recurrent neural network-based and Transformer-based models across multiple species.

## Key findings

- SqueezeCall improves basecalling accuracy by effectively handling noise in nanopore sequencing data.
- Training combines three types of loss, and all three contribute to basecalling accuracy.
- SqueezeCall outperforms recurrent and Transformer-based models in basecalling accuracy.

## Abstract

Nanopore sequencing, a third-generation sequencing technique, enables direct RNA sequencing, real-time analysis, and long-read length. Nanopore sequencers measure electrical current changes as nucleotides pass through nanopores; a basecaller identifies base sequences according to the raw current measurements. However, accurate basecalling remains challenging due to molecular variations and sequencing noise. Here, we introduce SqueezeCall, a novel Squeezeformer-based model for accurate nanopore basecalling. SqueezeCall uses convolution layers to down-sample raw signals and model local dependencies. A Squeezeformer network captures the global context, and a connectionist temporal classification (CTC) decoder with beam search generates DNA sequences. Experimental results demonstrated SqueezeCall’s ability to resist noise, improving basecalling accuracy. We trained SqueezeCall combining three types of loss, and found that all three loss types contribute to basecalling accuracy. Experiments across multiple species demonstrated the potential of a Squeezeformer-based model to improve basecalling accuracy and its superiority over recurrent neural network-based models and Transformer-based models.
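
The decoding step named in the abstract (a CTC decoder with beam search) can be illustrated with a small, self-contained sketch. This is not SqueezeCall's implementation: the label order in `ALPHABET`, the blank at index 0, the default `beam_width`, and the function names are all assumptions chosen for illustration, and the input is expected to be a list of per-time-step probability rows such as a network might emit after down-sampling.

```python
from collections import defaultdict

# Assumed label order; index 0 is the CTC blank (shown here as "N").
ALPHABET = "NACGT"


def ctc_greedy_decode(probs):
    """Take the best label per time step, merge repeats, then drop blanks."""
    out, prev = [], 0
    for row in probs:
        best = max(range(len(row)), key=row.__getitem__)
        if best != 0 and best != prev:
            out.append(ALPHABET[best])
        prev = best
    return "".join(out)


def ctc_beam_search(probs, beam_width=4):
    """CTC prefix beam search over per-time-step label probabilities."""
    # Each prefix maps to (p_blank, p_non_blank): the probability of all
    # alignments yielding that prefix that end in a blank / in a base.
    beams = {"": (1.0, 0.0)}
    for row in probs:
        nxt = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            # Emitting a blank leaves the prefix unchanged.
            nxt[prefix][0] += (p_b + p_nb) * row[0]
            for k in range(1, len(row)):
                base = ALPHABET[k]
                if prefix and prefix[-1] == base:
                    # A repeated base merges into the prefix unless a
                    # blank separated it from the previous emission.
                    nxt[prefix][1] += p_nb * row[k]
                    nxt[prefix + base][1] += p_b * row[k]
                else:
                    nxt[prefix + base][1] += (p_b + p_nb) * row[k]
        # Keep only the most probable prefixes for the next time step.
        ranked = sorted(nxt.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {p: (b, nb) for p, (b, nb) in ranked[:beam_width]}
    return max(beams, key=lambda p: sum(beams[p]))
```

The two decoders can disagree because beam search sums the probability of every alignment that collapses to the same sequence, whereas greedy decoding follows only the single best label at each step. With two time steps that each assign 0.6 to the blank and 0.4 to `A`, greedy decoding returns the empty string, while the beam search returns `A`, whose alignments together carry more probability mass.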

## Full-text entities

- **Diseases:** CRF (MESH:D005128), MS (MESH:D009103), ONT (MESH:C000719218), CTC (MESH:D008310)
- **Chemicals:** squeezecall (-)
- **Species:** Serratia marcescens (species) [taxon 615], Haemophilus haemolyticus (species) [taxon 726], Klebsiella pneumoniae (species) [taxon 573], Acinetobacter pittii (species) [taxon 48296], Homo sapiens (human, species) [taxon 9606], Shigella sonnei (species) [taxon 624], Staphylococcus aureus (species) [taxon 1280], Stenotrophomonas maltophilia (species) [taxon 40324]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11851125/full.md

---
Source: https://tomesphere.com/paper/PMC11851125