# IgPose: a generative data-augmented pipeline for robust immunoglobulin–antigen binding prediction

**Authors:** Tien-Cuong Bui, Injae Chung, Wonjun Lee, Junsu Ko, Juyong Lee

PMC · DOI: 10.1093/bioinformatics/btag076 · Bioinformatics · 2026-02-15

## TL;DR

IgPose identifies and scores immunoglobulin–antigen (Ig–Ag) binding poses by combining a generative data-augmentation pipeline with equivariant graph neural networks, ESM-2 embeddings, and gated recurrent units.

## Contribution

IgPose pairs a generative data-augmentation pipeline with a two-network framework (IgPoseClassifier for binding-pose discrimination, IgPoseScore for DockQ score estimation) for robust Ig–Ag binding prediction.

## Key findings

- IgPose outperforms physics-based and deep-learning baselines on the CASP-16 benchmark and on curated internal test sets.
- The Structural Immunoglobulin Decoy Database (SIDD) supplies high-fidelity synthetic decoys to offset the scarcity of experimentally resolved Ig–Ag complexes.
- The framework combines geometric and evolutionary features for improved generalization across diverse interfaces.

## Abstract

Predicting immunoglobulin–antigen (Ig–Ag) binding remains a significant challenge due to the paucity of experimentally resolved complexes and the limited accuracy of de novo Ig structure prediction.

We introduce IgPose, a generalizable framework for Ig–Ag pose identification and scoring, built on a generative data-augmentation pipeline. To mitigate data scarcity, we constructed the Structural Immunoglobulin Decoy Database (SIDD), a comprehensive repository of high-fidelity synthetic decoys. IgPose integrates equivariant graph neural networks, ESM-2 embeddings, and gated recurrent units to synergistically capture both geometric and evolutionary features. We implemented interface-focused k-hop sampling with biologically guided pooling to enhance generalization across diverse interfaces. The framework comprises two sub-networks (IgPoseClassifier for binding pose discrimination and IgPoseScore for DockQ score estimation) and achieves robust performance on curated internal test sets and the CASP-16 benchmark relative to physics-based and deep-learning baselines. IgPose serves as a versatile computational tool for high-throughput antibody discovery pipelines by providing accurate pose filtering and ranking.
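The abstract names interface-focused k-hop sampling but does not say how it is implemented. The sketch below shows one plausible reading, not the authors' code: build a residue contact graph, take residues with a cross-partner contact as interface seeds, and keep every residue within `k` graph hops of a seed. The function names, the single representative coordinate per residue, and the distance cutoff are all assumptions made for illustration.

```python
from collections import deque
from math import dist


def build_contact_graph(coords, cutoff):
    """Adjacency sets linking residues whose representative atoms lie within `cutoff`.

    `coords` holds one (x, y, z) tuple per residue (an assumed simplification).
    """
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def interface_seeds(adj, chain_ids, ig_chains):
    """Residues with at least one contact to the other binding partner."""
    seeds = set()
    for i, neighbours in adj.items():
        for j in neighbours:
            # A contact is "cross-partner" when exactly one residue is on an Ig chain.
            if (chain_ids[i] in ig_chains) != (chain_ids[j] in ig_chains):
                seeds.add(i)
                break
    return seeds


def k_hop_sample(adj, seeds, k):
    """Breadth-first search from the seeds; returns {residue: hop distance} for hops <= k."""
    hops = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] == k:
            continue  # do not expand beyond k hops
        for nb in adj[node]:
            if nb not in hops:
                hops[nb] = hops[node] + 1
                queue.append(nb)
    return hops


# Toy complex: eight residues on a line, 5 units apart.
# Residues 0-3 belong to a hypothetical heavy chain "H", residues 4-7 to antigen chain "A".
coords = [(5.0 * i, 0.0, 0.0) for i in range(8)]
chain_ids = ["H"] * 4 + ["A"] * 4

adj = build_contact_graph(coords, cutoff=6.0)
seeds = interface_seeds(adj, chain_ids, ig_chains={"H"})
sampled = k_hop_sample(adj, seeds, k=1)
print(sorted(sampled))
```

In this toy case only residues 3 and 4 touch across the interface, so a 1-hop sample keeps those two plus their immediate neighbours. Restricting the graph this way keeps the model's input centred on the binding interface rather than the whole complex; how IgPose chooses `k`, the cutoff, or the pooling that follows is not stated in this summary.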

IgPose is available on GitHub (https://github.com/arontier/igpose).

## Full-text entities

- **Genes:** HLA-C (major histocompatibility complex, class I, C) [NCBI Gene 3107] {aka D6S204, HLA-JY3, HLAC, HLC-C, MHC, PSORS1}, TRBV20OR9-2 (T cell receptor beta variable 20/OR9-2 (non-functional)) [NCBI Gene 6962] {aka CDR3, TCRBV20S2, TCRBV2O, TCRBV2S2O}, CASP16P (caspase 16, pseudogene) [NCBI Gene 197350] {aka CASP16}
- **Diseases:** DL (MESH:D007859), SID-R (MESH:C580424), SID (MESH:D020914)
- **Chemicals:** -acid (MESH:D000143), AbEpiTope (-), hydrogens (MESH:D006859), amino acids (MESH:D000596), peptide (MESH:D010455)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12989135/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12989135/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/PMC12989135/full.md

---
Source: https://tomesphere.com/paper/PMC12989135