# CoMPHI: a novel composite machine learning approach utilizing multiple feature representation to predict hosts of bacteriophages

**Authors:** Shreyashi Bodaka, Narasaiah Kolliputi

PMC · DOI: 10.3389/fbinf.2025.1622931 · Frontiers in Bioinformatics · 2025-10-16

## TL;DR

CoMPHI is a machine learning model that predicts which bacterial hosts a phage can infect by combining nucleotide and protein sequence encodings with phage-phage, phage-host, and host-host alignment scores.

## Contribution

CoMPHI integrates multiple feature encodings of nucleotide and protein sequences with alignment scores in a single composite framework; in the authors' comparison, including the alignment scores improved performance by 6–8%.

## Key findings

- CoMPHI achieved AUC-ROC values of 94–96.7% and accuracies of 92.3–95.1% in 5-fold cross-validation, across taxonomic levels from species to phylum.
- Including alignment scores improved performance by 6–8% compared to models without them.
- In ablation studies, combining nucleotide and protein encodings with phage-host, host-host, and phage-phage alignment scores significantly enhanced prediction accuracy.
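
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how AUC-ROC and accuracy can be computed from binary interaction labels and predicted scores. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from itertools import product


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (hypothetical): 1 = phage infects host, 0 = no interaction.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print(f"AUC-ROC:  {auc_roc(labels, scores):.3f}")
print(f"Accuracy: {accuracy(labels, scores):.3f}")
```

AUC-ROC depends only on how the scores rank positives against negatives, while accuracy also depends on the chosen threshold, which is why the two numbers are reported separately.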

## Abstract

Phage therapy has reemerged as a compelling alternative to antibiotics in treating bacterial infections, especially for superbugs that have developed antibiotic resistance. The challenge in the broader application of phage therapy is identifying host targets for the vast array of uncharacterized phages obtained through next-generation sequencing. We introduce a Composite Model for Phage Host Interaction (CoMPHI) that integrates alignment-based approaches with machine learning. The model generates multiple feature encodings from nucleotide and protein sequences of both phages and hosts. It incorporates alignment scores between phage-phage, phage-host, and host-host pairs, creating a composite prediction framework. During 5-fold cross-validation, CoMPHI achieved Area Under the ROC Curve (AUC-ROC) values of 94–96.7% and accuracies of 92.3–95.1% across taxonomic levels from species to phylum. Comparative analysis showed a 6–8% performance improvement when alignment scores were included. Ablation studies demonstrated that combining nucleotide and protein encodings, along with phage-host, host-host, and phage-phage alignment scores, significantly enhanced prediction accuracy. CoMPHI provides a robust and comprehensive framework for predicting phage-host interactions. By combining sequence features and alignment information, the model advances computational tools that can accelerate the application of phage therapy in modern medicine.
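
The idea of pairing sequence encodings with alignment scores can be illustrated with a small sketch. Everything below is an assumption for illustration, not the method described in the paper: normalized k-mer frequencies stand in for the (unspecified) feature encodings, a plain Smith-Waterman local alignment score with made-up scoring parameters stands in for the alignment component, and only the phage-host pair is scored. The names `kmer_frequencies`, `local_alignment_score`, and `composite_features` are hypothetical.

```python
from collections import Counter
from itertools import product

DNA = "ACGT"  # pass a 20-letter amino acid alphabet to encode protein sequences


def kmer_frequencies(seq, k, alphabet):
    """Normalized k-mer frequency vector over a fixed alphabet.

    k-mers containing symbols outside the alphabet are ignored.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total if total else 0.0 for w in vocab]


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The scoring parameters are illustrative defaults, not values from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
            best = max(best, curr[j])
        prev = curr
    return best


def composite_features(phage, host, k=2):
    """Concatenate phage encoding, host encoding, and phage-host alignment score."""
    return (
        kmer_frequencies(phage, k, DNA)
        + kmer_frequencies(host, k, DNA)
        + [float(local_alignment_score(phage, host))]
    )


# Toy sequences (hypothetical); real genomes are far longer.
phage_seq = "ACGTACGT"
host_seq = "TTACGTAA"
features = composite_features(phage_seq, host_seq)
print(len(features), "features; alignment score =", features[-1])
```

A vector like this could be passed to any standard classifier. Phage-phage and host-host scores, which the abstract also mentions, would need a reference set of known phages and hosts to score against and are left out of this sketch.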

## Full-text entities

- **Diseases:** bacterial infections (MESH:D001424)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12571911/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12571911/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/PMC12571911/full.md

---
Source: https://tomesphere.com/paper/PMC12571911