TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence
Feng Jiang, Mangal Prakash, Hehuan Ma, Jianyuan Deng, Yuzhi Guo, Amina Mollaysa, Tommaso Mansi, Rui Liao, Junzhou Huang

TL;DR
TRIDENT is a multimodal framework that integrates molecular structures, textual descriptions, and taxonomic annotations to improve molecular property prediction by aligning features at both global and local levels.
Contribution
TRIDENT introduces a tri-modal learning approach that pairs a volume-based global alignment objective with a local alignment strategy, supported by a curated dataset and a dynamic balancing mechanism.
Findings
Achieves state-of-the-art results on 11 downstream tasks.
Effectively captures both broad and fine-grained structure-function relationships.
Demonstrates the benefit of integrating textual and taxonomic information in molecular representations.
Abstract
Molecular property prediction aims to learn representations that map chemical structures to functional properties. While multimodal learning has emerged as a powerful paradigm to learn molecular representations, prior works have largely overlooked textual and taxonomic information of molecules for representation learning. We introduce TRIDENT, a novel framework that integrates molecular SMILES, textual descriptions, and taxonomic functional annotations to learn rich molecular representations. To achieve this, we curate a comprehensive dataset of molecule-text pairs with structured, multi-level functional annotations. Instead of relying on conventional contrastive loss, TRIDENT employs a volume-based alignment objective to jointly align tri-modal features at the global level, enabling soft, geometry-aware alignment across modalities. Additionally, TRIDENT introduces a novel local…
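The abstract does not spell out how the volume-based objective is computed. As a rough illustration only, one common way to measure joint alignment of three embeddings is the volume of the parallelotope they span, obtained from the determinant of their Gram matrix: the volume shrinks toward 0 as the three vectors point in the same direction and reaches 1 when they are mutually orthogonal. The sketch below assumes unit-normalized embeddings and this Gram-determinant formulation. The function names (`normalize`, `gram_volume`) and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gram_volume(smiles_emb, text_emb, taxonomy_emb):
    """Volume of the parallelotope spanned by three unit-normalized embeddings.

    Assumption: volume = sqrt(det(G)), where G is the 3x3 Gram matrix of the
    normalized vectors. Smaller volume means the modalities are better aligned.
    """
    vecs = [normalize(v) for v in (smiles_emb, text_emb, taxonomy_emb)]
    g = [[dot(a, b) for b in vecs] for a in vecs]
    # Determinant of the 3x3 Gram matrix by cofactor expansion along row 0.
    det = (
        g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
        - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
        + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0])
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return math.sqrt(max(det, 0.0))


# Toy embeddings: nearly parallel vectors give a small volume,
# mutually orthogonal vectors give the maximum volume.
aligned = gram_volume([1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [1.0, 0.1, 0.1])
orthogonal = gram_volume([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])
print(aligned < orthogonal)
```

In a training setting, a quantity like this would typically be minimized for matched (molecule, text, taxonomy) triples and kept large for mismatched ones. How TRIDENT actually builds its loss from the volume is not stated in the text above.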
Taxonomy
Topics: Machine Learning in Bioinformatics · Biomedical Text Mining and Ontologies · Genomics and Phylogenetic Studies
Methods: ALIGN
