A deep Siamese network framework for precision phage selection in pulmonary infections
Xinghong Wang, Mingpeng Fu, Shigang Lin, Yanshuang Wang, Hua Pei

TL;DR
This paper introduces a deep learning framework to accurately select phages for treating resistant pulmonary infections, improving phage therapy potential.
Contribution
A novel deep Siamese network framework for precise phage–host interaction prediction in pulmonary infections.
Findings
The proposed method outperforms existing approaches in identifying phages targeting specific bacterial hosts.
Integration of CNNs and Transformers enhances the extraction of both local and global genomic features.
The framework shows potential for clinical application in phage therapy for resistant pulmonary infections.
Abstract
Pulmonary infections pose a significant global health challenge to human life and health. In patients with chronic pulmonary diseases such as cystic fibrosis and bronchiectasis, structural abnormalities of the airways and impaired mucociliary clearance contribute to recurrent and challenging pulmonary infections. These infections are frequently complicated by antimicrobial resistance, making them difficult to treat with conventional antibiotics. As a result, phage therapy has emerged as a promising alternative for treating resistant pulmonary infections. Recently, the integration of artificial intelligence (AI) has improved the efficiency of phage selection. Nevertheless, the accuracy of predicting phage–bacterial host interactions remains limited, posing a significant obstacle to the clinical translation of phage-based therapies. To address this issue, we propose a deep Siamese network…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1|
|
|
|
|
|
|
|---|---|---|---|---|---|
| WIsH | 33.0 | 42.0 | 45.0 | 69.0 | 72.0 |
| VHM | 33.4 | 45.0 | 53.0 | 70.0 | 75.0 |
| PHP | 34.0 | 45.0 | 61.0 | 78.0 | 80.0 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|---|---|---|---|---|---|
|
|
|
|
|
|
|
| Skip-gram + CNN (without transformer) | 31.5 | 43.7 | 60.6 | 77.0 | 78.2 |
| CNN + Transformer (without skip-gram) | 32.0 | 44.2 | 61.1 | 77.5 | 78.7 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsBacteriophages and microbial interactions · Machine Learning in Bioinformatics · vaccines and immunoinformatics approaches
Introduction
1
Pulmonary infections, particularly those involving the lower respiratory tract, represent a major global public health challenge and pose a substantial threat to human health. Lower respiratory tract infections are among the most lethal infectious diseases worldwide and rank as the fourth leading cause of death (1). Their incidence and severity are notably elevated in patients with chronic pulmonary diseases, such as cystic fibrosis (CF) and bronchiectasis, in whom abnormal airway architecture and impaired mucociliary clearance create a favorable niche for persistent bacterial colonization and proliferation. In individuals with CF, progressive pulmonary disease remains the primary cause of morbidity and mortality, with recurrent and chronic infections serving as a dominant driver of pathological deterioration (2, 3). Currently, antibiotics remain a relatively effective treatment for pulmonary infections, however, the emergence of antibiotic resistance has made the treatment of pulmonary infections more complex and difficult. The World Health Organization has listed antibiotic-resistant bacteria as one of the three major public health threats of the 21st century (1). In 2019, antibiotic-resistant diseases directly caused 1.27 million deaths and indirectly caused 4.95 million deaths, with pneumonia and other lower respiratory tract infections being the leading causes of death, accounting for 1.9 million deaths. More worryingly, if the current trend continues, by 2050, diseases caused by antibiotic-resistant bacteria could cause 10 million deaths annually (1, 4). Among the multitude of drug-resistant bacteria, the ESKAPE pathogens—Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species—are particularly noteworthy for their ability to “escape” conventional antibiotic treatments and rapidly proliferate in hospital settings (5). Among them, Carbapenem-resistant Acinetobacter baumannii (CRAB) is an important pathogen of pneumonia. Meanwhile, Pseudomonas aeruginosa is one of the most frequently isolated pathogens from CF patients, exhibiting resistance to multiple antibiotics (6, 7).
Faced with the increasingly severe challenge of antibiotic resistance, clinicians have had to rely on a few still-active antibiotics, such as minocycline, to treat CRAB pneumonia (6, 7). However, these "last-line" antibiotics also face the risk of future ineffectiveness. To date, the main challenges in treating pulmonary infections include: (1) biofilm formation, which reduces antibiotic permeability and efficacy; (2) the rapid development of bacterial resistance; (3) chronic infections requiring long-term treatment, increasing the risk of side effects; and (4) the destructive effects of antibiotics on the host microbiome. Therefore, the development of novel anti-infection strategies is urgently needed. Phage therapy has regained attention in recent years as an alternative treatment option (4, 8). Phages are viruses that naturally reside in bacteria, capable of infecting and killing their bacterial hosts, including antibiotic-resistant strains. Phage therapy offers several advantages: high specificity (it does not harm the host microbiota), self-replication capability (it multiplies at the site of infection), and potent anti-biofilm activity (5). Significant progress has been made in the treatment of drug-resistant pulmonary infections using this approach (5). Multiple preclinical studies and compassionate use cases have demonstrated the significant efficacy of phage therapy for infections that cannot be cured by other methods. Furthermore, phage-antibiotic synergy (PAS) has emerged as a promising strategy to overcome the limitations of monotherapy (1, 5).
Despite the immense potential of phage therapy, its widespread application faces numerous challenges. Traditional phage screening relies on empirical trial-and-error methods, a cumbersome and time-consuming process (9). Furthermore, bacterial resistance to phages evolves rapidly, primarily through mechanisms such as receptor blocking and CRISPR-Cas system activation (9, 10). To overcome these limitations, AI technology (11, 12) has been introduced into phage research, significantly improving screening efficiency. For instance, machine learning algorithms facilitate the prediction of phage–host interactions, enabling the identification of highly efficient phage–bacteria pairs (13). AI-assisted genome editing further provides powerful tools for tailoring phage properties (14). AI-based approaches also allow prediction of phage lifestyle (lytic vs. lysogenic) (9, 10). Nevertheless, current AI methods remain substantially limited in accurately forecasting phage–bacteria interactions, such as the inability to extract deeper characterizations, thus limiting their translational potential in clinical practice.
To address the above issue, we present a deep Siamese network framework for precision phage selection in pulmonary infections. Specifically, this framework utilizes a unified model architecture to process both phage and host genomes. First, the genomic sequences of both are encoded using the k-mer method. Then, CNN (15) and Transformers (16, 17) are employed to extract local and global features, respectively (18). Finally, the extracted features are integrated to predict phage–host interactions. Experimental results on a dataset created from the NCBI genome database demonstrate that our proposed method achieves state-of-the-art performance. Our proposed framework enables the screening of matching phages for specific bacterial hosts, thereby supporting phage therapy in pulmonary infections.
In summary, our contributions are as follows:
We propose a deep Siamese network framework that efficiently screens phages matching specific bacterial hosts, thereby supporting the application of phage therapy in pulmonary infections.We incorporate CNN and Transformer architectures into the deep Siamese network to extract complementary local and global representations of phages and their hosts, thereby enabling more accurate prediction of phage–host interactions.
Related work
2
Phage-host interactions
2.1
Predicting phage–host interactions is fundamental to microbial ecology and phage therapy, and is essential for elucidating phage–host networks and identifying optimal therapeutic phages. Traditional approaches primarily rely on sequence alignment tools such as NCBI BLAST (19) to infer host relationships through homology searches. However, the high diversity and rapid evolution of phage genomes often lead to weak homology signals, making these methods insufficiently sensitive for metagenomic data, especially when dealing with short fragments or distant evolutionary relationships. To address these limitations, a series of alignment-free methods has been proposed. For example, oligonucleotide frequency–based approaches predict hosts by comparing differences in k-mer distributions (20), achieving high prediction accuracy for metagenomic viral sequences. Similarly, WIsH (21) applies a Markov model to predict hosts for phage contigs, while PHP (22) employs a Gaussian model to analyze prokaryotic virus sequences. By eliminating dependency on sequence alignment (23, 24), these methods enhance both robustness and computational efficiency. With the advancement of machine learning, protein feature–based methods have further improved prediction performance. For instance, RaFAH (25) accurately identifies bacterial and archaeal hosts by analyzing viral protein composition combined with a classifier model. HoPhage (26), a de novo prediction tool designed for metavirome fragments, integrates both sequence and protein features to effectively handle short viral contigs. In addition, network-based methods such as (27) integrate multiple data sources—including sequence similarity and co-occurrence networks—into a unified framework for systematically predicting virus–prokaryote interactions.
Application of AI in phage therapy
2.2
In addition to phage–host interaction prediction, AI is increasingly being applied to other key aspects of phage therapy, accelerating its transition from laboratory research to clinical practice. In the prediction of phage protein function, the tool PhageDPO employs a support vector machine model to identify depolymerases—encoded within phage genomes—that are capable of degrading bacterial polysaccharide structures, thereby offering novel targets for disrupting biofilm barriers (28). In the realm of phage genome design, the Prophage Activation platform utilizes a random forest model to predict regulatory elements in prophages and enables the activation of dormant prophages by simulating edits to transcription factor binding sites (13). This approach provides a new strategy for precise antimicrobial intervention from within drug-resistant bacteria. At the clinical level, AI is further harnessed to optimize phage cocktail formulations and monitor the emergence of bacterial resistance (29). Together, these applications underscore the considerable potential of AI technologies in advancing phage therapy from empirical screening toward a rational, design-based framework.
Method
3
Data pre-processing
3.1
This study adopted the dataset construction scheme consistent with VirHostMatcher-Net (27) to ensure comparability. We collected 2,288 phage genomes with known host information from the NCBI RefSeq database (accessed November 11, 2019). From these, 826 phages with specifically identified hosts at the strain level were selected to construct the positive training set, including 817 bacteria-infecting and 9 archaea-infecting phages. The remaining 1,462 phage-host pairs were held out as the final independent test set for all performance evaluations. Host information was extracted from the “isolate_host” or “host” fields in the GenBank records (27). For negative sample construction, we implemented a taxonomic distance-based sampling strategy by randomly selecting 826 phage-host pairs with the requirement that each selected host belongs to a different phylum from the phage's true host. While we acknowledge potential false negatives in these non-interacting pairs, the impact is minimized by the specificity of phage-host interactions and the phylum-level exclusion criterion (27).
Deep Siamese network framework
3.2
We propose a deep Siamese network framework to accurately predict phage-host interactions, thereby selecting a precise phage for the treatment of pulmonary infections. The model formulates the task as a binary classification problem (match/non-match), with the core idea being to learn high-level features from the complete genome sequences of both phages and hosts, and then evaluate the likelihood of their interaction.
The overall architecture, as shown in Figure 1, consists of four core modules: a Sequence Preprocessing and Feature Representation module responsible for segmenting long genome sequences and converting them into numerical feature matrices; a Local Feature Learning module that uses CNN to extract local patterns from sequence segments; a Global Dependency Learning module that employs Transformer encoders to integrate all local features, forming a whole-genome representation; and a Fully Connected Classifier module for match prediction. Model training aims to minimize the cross-entropy loss function, optimizing all parameters via the backpropagation algorithm.
Overall architecture of the proposed Deep Siamese Network Framework. The framework consists of four core modules: (1) Sequence Preprocessing and Feature Representation module responsible for segmenting long genome sequences and converting them into numerical feature matrices; (2) Local Feature Learning module that uses CNN to extract local patterns from sequence segments; (3) Global Dependency Learning module that employs Transformer encoders to integrate all local features, forming a whole-genome representation; and (4) Fully Connected Classifier module for match prediction.
Sequence preprocessing and feature representation
3.2.1
First, standardized preprocessing is applied to the input phage and host genome sequences.For a genome sequence of length N (where N is the genome sequence length in base pairs), due to the CNN's requirement for fixed-length inputs, it is uniformly segmented into multiple subsequences, each 2,000 base pairs (bp) long. The total number of subsequences c is calculated as:
For sequences whose length is not an integer multiple of 2,000, a zero-padding strategy is applied at the end. Each subsequence is further processed via k-mer segmentation and transformed using the skip-gram model to map each k-mer into a high-dimensional vector space (30). Here, k denotes the k-mer length (set to k = 3 in this study), and d is the number of hidden units in the skip-gram model (set to d = 64 in this study). The skip-gram model encodes the contextual relationships between adjacent k-mers as similarities in the vector space, thereby effectively enhancing the ability to capture local patterns. Based on this, each 2,000 bp subsequence can be represented as follows:
X is the embedding matrix for the subsequences obtained from segmenting the phage and host genome sequences. The row dimension corresponds to the positions of the k-mers within the subsequence, and the column dimension corresponds to the feature dimensions of the skip-gram embeddings.
Local feature learning
3.2.2
The subsequence matrix X for the phage and host genome sequences is fed into independent CNN for local feature extraction. The two CNN networks share the same architecture but have independent parameters: CNN_phage_ processes the phage sequences, and CNN_host_ processes the host sequences. Each CNN network consists of three convolutional layers and two pooling operations. Each convolutional layer performs three operations: 2D convolution, batch normalization, and the ReLU activation function [defined as relu(x) = max(0, x)]. Subsequently, global average pooling is applied to compress the 2D feature matrix into a fixed-length vector. The design of global average pooling offers a dual advantage: it can map input sequences of different lengths into a fixed-dimensional vector, and it has a regularizing effect, helping to mitigate overfitting.
Let and denote the mapping functions of the respective CNNs, which transform an input subsequence matrix into a 256-dimensional global feature vector (𝒳 denotes the set of all possible subsequence embedding matrices). The outputs for the i-th subsequence are respectively:
where c and d represent the total number of subsequences for the phage and host genomes, respectively, and Yi and Zi represent the local feature representations of each subsequence for the phage and host, respectively.
Global dependency learning
3.2.3
To capture long-range dependencies within the sequences, the sequences of local feature vectors extracted by CNN_phage_ and CNN_host_ are fed into two independent Transformer encoders, respectively. Each Transformer encoder consists of a stack of N = 6 identical layers. Each layer contains two sub-layers: the first is a multi-head self-attention mechanism, and the second is a simple position-wise fully connected feed-forward network. We employ residual connections around each of the two sub-layers, followed by layer normalization. That is, the output of each sub-layer is LayerNorm[x+Sublayer(x)], where Sublayer(x) is the function implemented by the sub-layer itself. To facilitate these residual connections, all sub-layers in the model, as well as the embedding layers, produce outputs of dimension dmodel = 256.
A learnable CLS vector is prepended to the sequence to aggregate global sequence features. For the phage sequence, the input matrix is ; for the host sequence, the input matrix is .
The computation of the multi-head self-attention mechanism is as follows. First, the input sequence is linearly projected into query (Q), key (K), and value (V) matrices:
where are learnable weight matrices, and dk is the dimension of each attention head.
The computation for each attention head is:
where , are the projection matrices specific to the i-th attention head. This paper uses 4 attention heads (h = 4), hence dk = dv = dmodel/h = 256/4 = 64.
The outputs of all attention heads are concatenated and then linearly transformed to produce the final output of the multi-head attention:
where is the output projection matrix. Subsequently, the output is obtained through residual connections and a feed-forward network:
where LayerNorm denotes Layer Normalization and FFN represents the Feed-Forward Network. and are the global feature representations for the phage and host sequences, denoted as Cphage and Chost, respectively.
The phage and host genome sequences undergo the aforementioned CNN and Transformer processing pipeline to obtain their respective global feature vectors Cphage and Chost.
Fully connected classifier
3.2.4
To predict the specific matching relationship between phages and host bacteria, we construct a fully connected neural network classifier. This classifier takes the global feature vectors of the phage and host as input, learns the interaction patterns between them through multiple layers of nonlinear transformations, and outputs a probability distribution for matching. Specifically, the two global representation vectors are first concatenated to form a comprehensive feature vector:
This fused feature is then passed through a multilayer perceptron (MLP) for classification prediction, which can be mathematically summarized as:
where fMLP denotes a multilayer perceptron with two hidden layers. The first layer expands the input dimension from 512 to 4,096, employing the ReLU activation function to introduce nonlinear transformations. The second layer maps the hidden features to a two-dimensional output space, and the Softmax function converts this into a probability distribution p∈ℝ^2^. Here, p[0] represents the probability of non-match, and p[1] represents the probability of a match. The final prediction is determined by the class with the higher probability.
Experiments and results analysis
4
Experiment setup
4.1
The model was trained for 200 epochs using the Adam optimizer with a learning rate of 1 × 10^−4^. The exponential decay rates for the moment estimates (β_1_ and β_2_) were set to 0.9 and 0.999, respectively. We employed a mini-batch training strategy with a batch size of 32. To prevent overfitting, an L2 weight decay regularization term with a coefficient of 1 × 10^−3^ was incorporated into the loss function. Furthermore, gradient clipping was applied to mitigate the exploding gradient problem, with the threshold set to 5.0. During testing, the population statistics (mean and variance) used as fixed parameters for normalization were estimated from the training data. These statistics were computed as the moving average of the mini-batch statistics during training, with a decay rate of 0.999.
All experiments were conducted on a workstation equipped with an NVIDIA GeForce RTX 4090 GPU. The deep learning framework was implemented using PyTorch 1.9.0 with CUDA 11.1 support.
Evaluation metrics
4.2
To systematically evaluate prediction performance, we followed the standard evaluation protocol in phage-host prediction literature and quantitatively assessed the results across five taxonomic levels: Genus, Family, Order, Class, and Phylum. For any given taxonomic level L, the prediction accuracy is defined as:
where Ncorrect, L represents the number of phages whose predicted host matches the true host at taxonomic level L, and Ntotal denotes the total number of phage-host pairs in the test set (1,462 in this study). This hierarchical evaluation framework comprehensively measures model performance at different taxonomic resolutions. We placed particular emphasis on Genus-level and Phylum-level accuracy, which reflect the model's capability for fine-grained host prediction and broad taxonomic assignment, respectively. All evaluations were performed on an independent test set that was completely separate from the training data, ensuring unbiased assessment of model generalization capability.
Comparison with existing methods
4.3
To comprehensively evaluate the predictive performance of our model, we conducted a systematic comparative analysis against current mainstream phage-host prediction methods on a standardized test set (containing 1,462 phages and 62,493 candidate hosts). The compared methods included representative algorithms such as WIsH (based on Markov models), PHP (which uses k-mer similarity), and VirHostMatcher (VHM) (27). All methods were rigorously evaluated at five major taxonomic levels (Genus, Family, Order, Class, and Phylum) to ensure comprehensive and fair comparison. The experimental results shown in Table 1 fully demonstrate the superior performance of our model. In the most challenging genus-level classification task, our model achieved an accuracy of 36.1%, representing an improvement of 2.1% over PHP's 34.0% and 2.7% over VHM's 33.4%, with this enhancement being statistically significant. As the taxonomic level increases, the prediction accuracy of all methods shows an upward trend, which aligns with the general principle that higher taxonomic levels exhibit stronger sequence conservation. Particularly, our model reached 82.8% accuracy at the phylum level, surpassing PHP's 80.0% by 2.8%, demonstrating the model's stable advantage in high-level classification tasks.
The performance differences among various methods profoundly reflect the technical characteristics of their underlying algorithmic principles: The WIsH method, based on Markov models, performs well in short sequence prediction tasks but has limited feature extraction capability when processing complete genome sequences; The VirHostMatcher method utilizes k-mer frequency features and enhances prediction performance through multi-feature fusion; The PHP method employs Gaussian models to process k-mer features and demonstrates good adaptability at higher taxonomic levels; In contrast, our model innovatively introduces deep learning technology into the field of phage-host prediction, effectively capturing local sequence patterns through CNN networks, combined with Transformer architecture to model long-range dependencies, achieving end-to-end learning from raw sequences to prediction outcomes, thereby avoiding the limitations of manual feature design in traditional methods and consequently demonstrating significant advantages across multiple taxonomic levels.
Ablation experiment
4.4
To validate the effectiveness of each key component in our model, we conducted a series of ablation experiments on the testset (1,462 phages, 62,493 candidate hosts). All experiments used the identical training set and hyperparameter settings. The results across five taxonomic levels (Genus, Family, Order, Class, and Phylum) are shown in Table 2.
The ablation experiments reveal distinct contributions from each architectural component: The Skip-gram + CNN combination achieved 31.5% accuracy at the genus level, demonstrating that local pattern extraction through convolutional networks combined with semantic embeddings provides a solid foundation for sequence representation. However, the absence of Transformer encoding resulted in a 4.6% drop compared to the complete model, highlighting the importance of global dependency modeling for capturing long-range interactions in genomic sequences. The performance of CNN + Transformer (without Skip-gram) (32.0% at genus level) indicates that while the CNN-Transformer pipeline can learn meaningful representations from raw sequences, the pre-trained semantic embeddings from Skip-gram provide additional discriminative power. The 4.1% improvement when incorporating Skip-gram embeddings demonstrates their value in capturing biological context and k-mer semantics.
Conclusion
5
In this paper, we propose a deep Siamese network framework that integrates both local and global features for precise phage selection in pulmonary infections. The framework employs identical model architectures to process phage and host genomes simultaneously. First, k-mer processing captures contextual information from local genomic sequences. Then, CNN and Transformers are applied to extract local and global features, respectively. These features are subsequently integrated to predict phage–host interactions. Experimental results on a dataset created from the NCBI genome database confirm that our method achieves superior performance in accurately identifying phages that target specific bacterial hosts, demonstrating its potential for supporting phage therapy in pulmonary infection treatment.
Although our proposed framework demonstrates superior performance in predicting phage-host interactions, several aspects warrant further improvement. Firstly, while the current model primarily relies on sequence features, future work could integrate multi-omics data such as protein structures and epigenetic profiles. Secondly, the feature interaction mechanism should be further optimized to handle sequence pairs with significantly divergent lengths.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Qian Y Zhu Z Zhu J Chen L Du H. Phage therapy: innovative approaches for refractory pulmonary infections. Virus Res. (2025) 361:199649. doi: 10.1016/j.virusres.2025.199649 · doi ↗
- 2Allen L Allen L Carr SB Davies G Downey D Egan M . Future therapies for cystic fibrosis. Nat Commun. (2023) 14:693. doi: 10.1038/s 41467-023-36244-236755044 PMC 9907205 · doi ↗ · pubmed ↗
- 3Bell SC Mall MA Gutierrez H Macek M Madge S Davies JC . The future of cystic fibrosis care: a global perspective. Lancet Respir Med. (2020) 8:65–124. doi: 10.1016/S 2213-2600(19)30337-631570318 PMC 8862661 · doi ↗ · pubmed ↗
- 4Cocorullo M Stelitano G Chiarelli LR. Phage therapy: an alternative approach to combating multidrug-resistant bacterial infections in cystic fibrosis. Int J Mol Sci. (2024) 25:8321. doi: 10.3390/ijms 2515832139125890 PMC 11313351 · doi ↗ · pubmed ↗
- 5Marino A Stracquadanio S Cosentino F Maraolo AE Colpani A De Vito A . Phage to ESKAPE: Personalizing therapy for MDR infections—a comprehensive clinical review. Pathogens. (2025) 14:1011. doi: 10.3390/pathogens 1410101141156622 PMC 12566643 · doi ↗ · pubmed ↗
- 6Seok H Kim SH Shi HJ Choi WS Park DW Wi YM . Minocycline-based therapy for carbapenem-resistant Acinetobacter baumannii pneumonia: a multicenter retrospective study. Int J Infect Dis. (2025) 160:108064. doi: 10.1016/j.ijid.2025.10806440946777 · doi ↗ · pubmed ↗
- 7Burgel PR Bourge X Mackosso C Parquin F. Ceftolozane/tazobactam for the treatment of adults with cystic fibrosis: results from a French prospective cohort study. In: Open Forum Infectious Diseases. Oxford University Press US (2024). p. ofae 391. doi: 10.1093/ofid/ofae 39139108933 PMC 11303002 · doi ↗ · pubmed ↗
- 8Liu Z Tan X Xiong M Lu S Yang Y Zhu H . Efficacy of precisely tailored phage cocktails targeting carbapenem-resistant Acinetobacter baumannii reveals evolutionary trade-offs: a proof-of-concept study. E Bio Medicine. (2025) 120:105942. doi: 10.1016/j.ebiom.2025.10594240972224 PMC 12478281 · doi ↗ · pubmed ↗
