CSI: Contrastive Data Stratification for Interaction Prediction and its Application to Compound-Protein Interaction Prediction
Apurva Kalia (1), Dilip Krishnan (2), Soha Hassoun (1) ((1) Tufts University, (2) Google Research)

TL;DR
This paper introduces CSI, a contrastive data stratification method that leverages multi-view data partitioning to improve interaction prediction, demonstrated on compound-protein interactions.
Contribution
It proposes a novel contrastive learning framework that uses data stratification to enhance interaction prediction accuracy.
Findings
Improved prediction performance on compound-protein interaction datasets
Effective use of multi-view data partitioning in contrastive learning
Enhanced object representations for interaction tasks
Abstract
Accurately predicting the likelihood of interaction between two objects (compound-protein sequence, user-item, author-paper, etc.) is a fundamental problem in Computer Science. Current deep-learning models rely on learning accurate representations of the interacting objects. Importantly, relationships between the interacting objects, or features of the interaction, offer an opportunity to partition the data to create multi-views of the interacting objects. The resulting congruent and non-congruent views can then be exploited via contrastive learning techniques to learn enhanced representations of the objects.
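To make the contrastive step concrete, the following is a minimal sketch of an InfoNCE-style loss over paired embeddings, written with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function name `info_nce`, the cosine-similarity scoring, and the `temperature` default are all hypothetical choices. Row i of `anchors` and row i of `positives` are treated as a congruent pair, and every other row in the batch serves as a non-congruent view.

```python
import numpy as np


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss for a batch of paired embeddings (illustrative only).

    Row i of `anchors` and row i of `positives` form a congruent pair;
    all other rows of `positives` act as non-congruent views for anchor i.
    """
    # L2-normalise so the dot product below is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the (assumed) temperature.
    logits = (a @ p.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The congruent pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Usage: well-aligned views should give a lower loss than mismatched ones.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
view_b = view_a + 0.01 * rng.normal(size=(8, 16))
loss_aligned = info_nce(view_a, view_b)
loss_mismatched = info_nce(view_a, view_b[::-1])
```

Minimising this quantity pulls congruent views together and pushes non-congruent views apart in the embedding space. How the views are actually constructed from the stratified data is specific to the paper and is not reproduced here.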
Taxonomy
Topics: Bioinformatics and Genomic Networks · Biomedical Text Mining and Ontologies · Machine Learning in Bioinformatics
Methods: InfoNCE · Contrastive Multiview Coding
