# A machine learning framework to identify complex physicochemical features of B cell epitopes

**Authors:** Simranjit Grewal, Uwa Iyamu, Daniel Vinals, Catherine Mitran, Nidhi Hegde, Stephanie Yanow

PMC · DOI: 10.21203/rs.3.rs-6255613/v1 · Research Square · 2025-04-18

## TL;DR

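Researchers developed a machine learning framework that mines peptide array reactivity data for the cross-reactive antibody 3D10 to identify conserved B cell epitopes in VAR2CSA, the Plasmodium falciparum virulence factor that binds infected red blood cells to the placenta.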

## Contribution

A machine learning framework, built on decision trees and a panel of 430 features, that extracts complex physicochemical features of conserved B cell epitopes in VAR2CSA from peptide array data.

## Key findings

- The peptide array data identified certain linear epitopes of the polyreactive antibody 3D10, while the framework predicted additional epitopes that are conformational.
- Features associated with antibody reactivity were mapped onto 3D structures of Plasmodium proteins.
- Mutant peptides were designed to test complex sequence motifs derived from the features the model associated with 3D10 binding.

## Abstract

During infection with Plasmodium falciparum in pregnancy, parasites express a unique virulence factor, VAR2CSA, that mediates binding of infected red blood cells to the placenta. A major goal in designing vaccines to protect pregnant women from malaria is to elicit antibodies to VAR2CSA. The challenge is that VAR2CSA is highly polymorphic and identifying conserved epitopes is essential to elicit strain-transcending immunity. Unexpectedly, a mouse monoclonal antibody, 3D10, raised against the unrelated Duffy binding protein from P. vivax (DBPII) cross-reacts with diverse alleles of VAR2CSA in vitro. To identify these potentially conserved epitopes in VAR2CSA, we designed a machine learning framework to analyse 3D10 reactivity to peptides derived from two alleles of VAR2CSA, DBPII, and PvEBP2 (negative control). We used decision trees and a panel of 430 features to extract features correlated to 3D10 binding. We analysed patterns of these features in the dataset and designed mutant peptides to test complex sequence motifs. Features associated with 3D10 reactivity were mapped onto predicted 3D structures of Plasmodium proteins and validated based on 3D10 reactivity to the recombinant antigens. While the array data identified certain linear epitopes, the framework predicted other epitopes that are conformational. With this approach, peptide array data can be mined to extract physicochemical properties of epitopes recognized by polyreactive antibodies.
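To make the feature-extraction step in the abstract more concrete, the sketch below shows how a single decision-tree split can rank physicochemical features of peptides against binary reactivity labels. It is an illustration under stated assumptions, not the authors' pipeline: the three features stand in for the paper's panel of 430, the peptide sequences and labels are synthetic, and `featurize`, `gini`, and `best_split` are hypothetical helper names. The hydropathy values are the standard Kyte-Doolittle scale.

```python
from typing import Dict, List, Tuple

# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def featurize(peptide: str) -> Dict[str, float]:
    """Compute three illustrative physicochemical features for a peptide."""
    n = len(peptide)
    return {
        "mean_hydropathy": sum(KD[a] for a in peptide) / n,
        # Crude net charge: +1 per K/R, -1 per D/E (histidine ignored).
        "net_charge": float(sum((a in "KR") - (a in "DE") for a in peptide)),
        "aromatic_fraction": sum(a in "FWY" for a in peptide) / n,
    }


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows: List[Dict[str, float]],
               labels: List[int]) -> Tuple[str, float, float]:
    """Return (feature, threshold, weighted Gini) of the best single split."""
    best: Tuple[str, float, float] = ("", 0.0, float("inf"))
    for name in rows[0]:
        values = sorted({r[name] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[name] <= thr]
            right = [y for r, y in zip(rows, labels) if r[name] > thr]
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if score < best[2]:
                best = (name, thr, score)
    return best


# Synthetic toy data (NOT from the paper): 1 = reactive, 0 = non-reactive.
peptides = ["KKRWSKAG", "RKGKTRAL", "AKRKKSGV",
            "DEEDSDNG", "GSTEDAFV", "LVDAEGSE"]
labels = [1, 1, 1, 0, 0, 0]

rows = [featurize(p) for p in peptides]
feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

On this toy set the split lands on net charge, because the synthetic reactive peptides were deliberately made basic. A full decision tree applies the same search recursively to each side of the split, and real array data would call for a much larger feature panel plus held-out validation.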

## Linked entities

- **Diseases:** malaria (MONDO:0005136)
- **Species:** Plasmodium falciparum (taxon 5833), Plasmodium vivax (taxon 5855)

## Full-text entities

- **Diseases:** malaria (MESH:D008288)
- **Species:** Homo sapiens (human, species) [taxon 9606], Plasmodium falciparum (malaria parasite P. falciparum, species) [taxon 5833], Mus musculus (house mouse, species) [taxon 10090]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12047986/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12047986/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC12047986/full.md

---
Source: https://tomesphere.com/paper/PMC12047986