# Harpy: a pipeline for processing haplotagging linked-read data

**Authors:** Pavel V Dimens, Ryan P Franckowiak, Azwad Iqbal, Jennifer K Grenier, Paul R Munn, Nina Overgaard Therkildsen

PMC · DOI: 10.1093/bioadv/vbaf133 · 2025-06-05

## TL;DR

Harpy is a new software pipeline for processing haplotagged linked-read sequencing data, enabling haplotype phasing and structural variant detection.

## Contribution

Harpy is a modular pipeline designed specifically for haplotagging data, which existing linked-read analytical pipelines cannot process.

## Key findings

- Harpy processes raw haplotagged data into phased genotypes and structural variant calls.
- The pipeline is modular and user-friendly for researchers working with linked-read sequencing.
- Harpy fills a gap in analytical tools for haplotagging data.

## Abstract

Haplotagging is a method for linked-read sequencing, which leverages the cost-effectiveness and throughput of short-read sequencing while retaining part of the long-range haplotype information captured by long-read sequencing. Despite haplotagging's utility and advantages over similar methods, existing linked-read analytical pipelines are incompatible with the data it produces.

We describe Harpy, a modular and user-friendly software pipeline for processing all stages of haplotagged linked-read data, from raw sequence data to phased genotypes and structural variant detection.

Availability: https://github.com/pdimens/harpy

## Full-text entities

- **Genes:** MUC1 (mucin 1, cell surface associated) [NCBI Gene 4582] {aka ADMCKD, ADMCKD1, ADTKD2, CA 15-3, CD227, Ca15-3}
- **Diseases:** cancer (MESH:D009369)

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12198493/full.md

---
Source: https://tomesphere.com/paper/PMC12198493