Prediction of bacterial protein–compound interactions with only positive samples
Ki-Hwa Kim, Avinash Yaganapu, Sai Kosaraju, Aashish Bhatt, Yun Lyna Luo, Sai Phani Parsa, Juyeon Park, Hyun Lee, Jun Hyuck Lee, Tae-Jin Oh, Mingon Kang

TL;DR
This paper introduces a new method to predict interactions between bacterial proteins and compounds using only positive examples, which is important for drug discovery and biotechnology.
Contribution
A novel Positive-Unlabeled learning framework called BIN-PU is proposed for bacterial CPI prediction without negative samples.
Findings
BIN-PU outperforms existing PU models in predicting bacterial CPIs using only positive samples.
BIN-PU's performance was validated on bacterial CYP data and confirmed with biological experiments.
The method is reproducible and effective on uncurated and human CYP datasets.
Abstract
Prediction of Compound–Protein Interactions (CPI) in bacteria is crucial to advancing various pharmaceutical and chemical engineering fields, including biocatalysis, drug discovery, and industrial processing. However, current CPI models cannot be applied to bacterial CPI prediction because curated negative interaction samples are lacking. We propose a novel Positive-Unlabeled (PU) learning framework, named BIN-PU, to address this limitation. BIN-PU generates pseudo positive and negative labels from known positive interaction data, enabling effective training of deep learning models for CPI prediction. We also propose a weighted positive loss function that places weight on truly positive samples. We validated BIN-PU coupled with multiple CPI backbone models, comparing its performance with existing PU models on bacterial cytochrome P450 (CYP) data. Extensive experiments demonstrate…
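To make the idea of a weighted positive loss more concrete, the following is a minimal sketch and not the BIN-PU loss itself, whose exact form is not given in this summary. It assumes a binary cross-entropy in which experimentally known positives receive a larger weight than pseudo-labeled samples. The function name `weighted_positive_bce`, the `is_true_positive` flag, and the `pos_weight` value are hypothetical and chosen only for illustration.

```python
import math


def weighted_positive_bce(probs, labels, is_true_positive, pos_weight=2.0):
    """Weighted mean binary cross-entropy (illustrative assumption only).

    probs            -- predicted interaction probabilities in (0, 1)
    labels           -- 1 for (pseudo-)positive pairs, 0 for pseudo-negative pairs
    is_true_positive -- True where the pair is a known, curated positive
    pos_weight       -- hypothetical weight applied to known positives
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    weight_sum = 0.0
    for p, y, known in zip(probs, labels, is_true_positive):
        p = min(max(p, eps), 1.0 - eps)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        w = pos_weight if known else 1.0  # up-weight known positives only
        total += w * loss
        weight_sum += w
    return total / weight_sum


# One known positive predicted poorly, one pseudo-positive, one pseudo-negative.
probs = [0.1, 0.9, 0.1]
labels = [1, 1, 0]
known = [True, False, False]
plain = weighted_positive_bce(probs, labels, known, pos_weight=1.0)
weighted = weighted_positive_bce(probs, labels, known, pos_weight=2.0)
```

In this toy case the weighted loss is larger than the plain one, because the error on the known positive counts for more. With `pos_weight=1.0` the function reduces to ordinary mean binary cross-entropy.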
Taxonomy
Topics: Machine Learning in Bioinformatics · Computational Drug Discovery Methods · vaccines and immunoinformatics approaches
