TL;DR
MolReFlect introduces a teacher-student framework enabling large language models to learn fine-grained, explainable alignments between molecular sub-structures and textual descriptions, improving molecule-caption translation.
Contribution
The paper presents MolReFlect, a teacher-student LLM framework that automatically learns fine-grained molecule-text alignments, improving both explainability and performance in molecule-caption translation.
Findings
Achieved state-of-the-art results in molecule-caption translation.
Enabled LLMs to learn fine-grained molecule-text alignments automatically.
Significantly outperformed previous baselines.
Abstract
Molecule discovery is a pivotal research field, impacting everything from medicine to materials. Recently, Large Language Models (LLMs) have been widely adopted in molecular understanding and generation, serving as a bridge between the molecular space and the natural language space, yet the alignment between molecules and their corresponding captions remains a significant challenge. Previous endeavors typically treat molecules as monolithic inputs, lacking an intermediate reasoning process and sacrificing explainability. In this work, we define fine-grained alignments as the precise correspondence between a molecule's sub-structures and the textual phrases that explain their properties. These alignments are crucial for LLMs to understand molecules in a more accurate and explainable manner. Normally, such fine-grained alignments require expert annotation, which is both costly and…
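To make the definition of a fine-grained alignment concrete, the sketch below pairs sub-structures of a molecule with the textual phrases that describe them. It is an illustration only and not the MolReFlect implementation: the names `Alignment`, `fragments_in_smiles`, and `build_context` are hypothetical, the aspirin fragments and phrases are hand-written examples, and plain substring matching on the SMILES string stands in for real sub-structure matching, which would need a cheminformatics toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alignment:
    """One fine-grained alignment: a sub-structure and the phrase describing it."""
    substructure: str  # SMILES fragment (hand-picked for this example)
    phrase: str        # caption phrase paired with that fragment


def fragments_in_smiles(smiles: str, candidates: List[Alignment]) -> List[Alignment]:
    """Keep candidates whose fragment occurs literally in the SMILES string.

    Assumption: substring matching is a crude simplification used only to keep
    the example dependency-free; it is not chemically rigorous.
    """
    return [a for a in candidates if a.substructure in smiles]


def build_context(smiles: str, candidates: List[Alignment]) -> str:
    """Render the molecule and its matched alignments as plain text."""
    lines = [f"Molecule: {smiles}"]
    for a in fragments_in_smiles(smiles, candidates):
        lines.append(f"- {a.substructure}: {a.phrase}")
    return "\n".join(lines)


# Aspirin (2-acetoxybenzoic acid) as a worked example.
ASPIRIN = "CC(=O)OC1=CC=CC=C1C(=O)O"
CANDIDATES = [
    Alignment("CC(=O)O", "acetate ester group"),
    Alignment("C(=O)O", "carboxylic acid group"),
    Alignment("N", "nitrogen-containing group"),  # absent in aspirin, filtered out
]

if __name__ == "__main__":
    print(build_context(ASPIRIN, CANDIDATES))
```

A caption model given this kind of intermediate context sees which part of the molecule motivates which phrase, rather than treating the molecule as a single monolithic input.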
