Exploring Optimal Transport-Based Multi-Grained Alignments for Text-Molecule Retrieval
Zijun Min, Bingshuai Liu, Liang Zhang, Jia Song, Jinsong Su, Song He, Xiaochen Bo

TL;DR
This paper introduces ORMA, a novel multi-grained alignment model using optimal transport and contrastive learning to improve text-molecule retrieval by capturing hierarchical sub-structure details.
Contribution
The paper presents the first multi-grained alignment approach at motif and multi-token levels for text-molecule retrieval, leveraging optimal transport and hierarchical molecule modeling.
Findings
ORMA outperforms state-of-the-art models on ChEBI-20 and PCdes datasets.
Multi-grained alignments improve retrieval accuracy.
Optimal transport effectively aligns tokens with molecular motifs.
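The token-to-motif alignment mentioned above can be illustrated with a generic entropy-regularised optimal transport (Sinkhorn) solver. This is only a minimal sketch, not the paper's implementation: this summary does not give ORMA's cost function, marginals, or solver, so the cosine-distance cost, the uniform marginals, the `eps` and `n_iters` values, and the function names `cosine_cost` and `sinkhorn` are all assumptions made for illustration.

```python
import math


def cosine_cost(tokens, motifs):
    """Cost matrix C[i][j] = 1 - cosine similarity of token i and motif j.

    Assumption: cosine distance is used purely for illustration.
    Vectors are assumed to be non-zero.
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    return [
        [1.0 - sum(a * b for a, b in zip(t, m)) / (norm(t) * norm(m))
         for m in motifs]
        for t in tokens
    ]


def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularised OT with uniform marginals (an assumption).

    Returns a transport plan P where P[i][j] is the mass moved from
    token i to motif j. Rows sum to about 1/n, columns to about 1/m.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n                      # token marginal
    b = [1.0 / m] * m                      # motif marginal
    K = [[math.exp(-c / eps) for c in row] for row in cost]  # Gibbs kernel
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy 2-d embeddings (hypothetical): two tokens near each of two motifs.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
motifs = [[1.0, 0.0], [0.0, 1.0]]
plan = sinkhorn(cosine_cost(tokens, motifs))
# For each token, pick the motif that receives most of its mass.
aligned = [max(range(len(motifs)), key=lambda j: row[j]) for row in plan]
```

In this toy setup the plan concentrates each token's mass on its nearest motif, which is the behaviour a soft alignment is meant to capture. A smaller `eps` gives a sharper plan but can underflow in `math.exp`, so a log-domain variant would be preferable in practice.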
Abstract
The field of bioinformatics has seen significant progress, making the cross-modal text-molecule retrieval task increasingly vital. This task focuses on accurately retrieving molecule structures based on textual descriptions, by effectively aligning textual descriptions and molecules to assist researchers in identifying suitable molecular candidates. However, many existing approaches overlook the details inherent in molecule sub-structures. In this work, we introduce the Optimal TRansport-based Multi-grained Alignments model (ORMA), a novel approach that facilitates multi-grained alignments between textual descriptions and molecules. Our model features a text encoder and a molecule encoder. The text encoder processes textual descriptions to generate both token-level and sentence-level representations, while molecules are modeled as hierarchical heterogeneous graphs, encompassing atom,…
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Advanced Text Analysis Techniques
Methods: Contrastive Learning · ALIGN
