# A CNN-Based Model of Cross-Immunity to Influenza A(H3N2) Virus: Testing Under “Real-World” Conditions

**Authors:** Marina N. Asatryan, Vaagn G. Agasaryan, Boris I. Timofeev, Ilya S. Shmyr, Dmitrii N. Shcherbinin, Elita R. Gerasimuk, Tatiana A. Timofeeva, Ivan F. Ershov, Tatiana A. Semenenko, Denis Yu. Logunov, Alexander L. Gintsburg

PMC · DOI: 10.3390/v18030327 · Viruses · 2026-03-06

## TL;DR

A CNN model was developed to predict cross-immunity to influenza A(H3N2) from HI-derived antigenic distances and hemagglutinin sequences. It reached near-perfect accuracy on Smith's classic dataset (0.9996) and remained robust on recent 2022–2024 data (Accuracy: 0.73–0.81).

## Contribution

The contribution is a CNN-based model for predicting influenza A(H3N2) cross-immunity, validated on a time-split test set that mimics real-world forecasting.

## Key findings

- On Smith's classic dataset, used as a sanity check, the CNN model achieved Accuracy = 0.9996 and MCC = 0.9964; on independent recent data (2022–2024) it maintained robust performance (Accuracy: 0.73–0.81, MCC: 0.48–0.60).
- The three-layer CNN was more robust than the two-layer CNN on challenging data.
- The model showed strong discriminative ability (AUC ≥ 0.805) and good calibration (Brier scores ≤ 0.192).
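
The metrics reported above can be computed from binary labels and predicted probabilities. The following is a minimal sketch, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy data are all assumptions made for illustration, and the 95% confidence intervals mentioned in the abstract are not covered.

```python
import math


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, Sensitivity, Specificity, MCC and Brier score.

    y_true: list of 0/1 labels; y_prob: predicted probability of class 1.
    The 0.5 threshold is an illustrative assumption.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    # MCC is defined as 0 when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "mcc": mcc,
        # Brier score: mean squared error of the predicted probabilities.
        "brier": sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n,
    }


def roc_auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

metrics = classification_metrics(y_true, y_prob)
print({k: round(v, 4) for k, v in metrics.items()})
# accuracy 0.75, sensitivity 0.75, specificity 0.75, mcc 0.5, brier 0.125
print(round(roc_auc(y_true, y_prob), 4))  # 0.9375
```

Accuracy, Sensitivity, Specificity, and MCC depend on the chosen threshold, whereas AUC and the Brier score are computed directly from the probabilities, which is why the latter two are the ones used to describe discrimination and calibration.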

## Abstract

A cross-immunity model for influenza A(H3N2) based on convolutional neural networks (CNNs) was developed and validated under temporally structured conditions that mimic real-world forecasting. Antigenic distance was derived from hemagglutination inhibition (HI) titers. The model was trained on WHO data (2011–2023) and tested in a time-split fashion on independent recent data (2022–2024). Hemagglutinin sequences (HA/HA1) were encoded into 3D tensors using five physicochemical indices from AAindex. Two- and three-layer CNN architectures were tested. Performance was evaluated using Accuracy, Sensitivity, Specificity, and Matthews Correlation Coefficient (MCC) with 95% confidence intervals. Validation on Smith's classic dataset showed high accuracy (Accuracy = 0.9996, MCC = 0.9964), serving as a necessary sanity check. Testing on current data yielded lower but robust results (Accuracy: 0.73–0.81, MCC: 0.48–0.60), reflecting the complexity of real-world forecasting. ROC analysis confirmed strong discriminative ability (AUC ≥ 0.805), and Brier scores (≤ 0.192) indicated good calibration. The three-layer CNN was more robust on challenging data. This CNN model is an effective tool for assessing influenza A(H3N2) antigenic distances and holds promise for integration into epidemiological models to aid vaccine strain selection. Further accuracy improvements may come from modeling the structural impact of amino acid substitutions and polyclonal immune responses.
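
To illustrate the encoding step, here is a minimal sketch of turning a pair of aligned HA fragments into a 3D tensor of physicochemical values. It rests on several assumptions: this summary does not name the five AAindex indices the authors used, so a single well-known scale (Kyte-Doolittle hydropathy, AAindex entry KYTJ820101) stands in for them; the tensor layout (index × strain × position), the function names, and the toy sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (AAindex entry KYTJ820101).
# Stand-in for the paper's five indices, which are not listed in this summary.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_sequence(seq, index):
    """Map each residue to its value in one physicochemical index.

    Unknown symbols (gaps, 'X') become 0.0; this fallback is an assumption.
    """
    return [index.get(res, 0.0) for res in seq.upper()]


def encode_pair(seq_a, seq_b, indices):
    """Encode two aligned sequences as a nested list of shape
    (n_indices, 2, length): one channel per index, one row per strain.
    The layout is hypothetical, chosen only to show a 3D tensor."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        [encode_sequence(seq_a, idx), encode_sequence(seq_b, idx)]
        for idx in indices
    ]


def tensor_shape(tensor):
    """Return (channels, rows, columns) of the nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


# Toy aligned fragments differing at one position (hypothetical).
strain_a = "QKLPGN"
strain_b = "QKIPGN"

tensor = encode_pair(strain_a, strain_b, [KYTE_DOOLITTLE])
print(tensor_shape(tensor))              # (1, 2, 6)
print(tensor[0][0][2], tensor[0][1][2])  # 3.8 4.5  (L vs I at position 3)
```

With five indices instead of one, the same function would yield five channels, and a convolutional layer could then scan along the position axis for local substitution patterns that separate the two strains.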

## Linked entities

- **Proteins:** hemagglutinin (HA), hemagglutinin HA1 subunit

## Full-text entities

- **Species:** H3N2 subtype (serotype) [taxon 119210]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13030590/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13030590/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/PMC13030590/full.md

---
Source: https://tomesphere.com/paper/PMC13030590