DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs
Jinzhe Liu, Xiangsheng Huang, Zhuo Chen, Yin Fang

TL;DR
DRAK is a domain-specific, retrieval-augmented framework that injects specialized knowledge into LLMs to improve their analysis of complex molecular data, exceeding previously reported results on six molecular tasks in the Mol-Instructions dataset.
Contribution
The paper introduces DRAK, a novel non-parametric knowledge injection framework that enhances LLM reasoning in molecular domains through knowledge-aware prompts and reasoning techniques.
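To make the idea of non-parametric knowledge injection concrete, here is a minimal sketch of a knowledge-aware prompt: relevant passages are retrieved from an external store and prepended to the question, so the model's weights are never updated. This is an illustration only, not DRAK's actual pipeline. The knowledge base, the token-overlap scoring, the prompt template, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are all assumptions made for this example; the summary does not specify how DRAK retrieves knowledge or formats its prompts.

```python
import re
from collections import Counter

# Hypothetical toy knowledge base (assumption); a real system would draw on
# a curated domain corpus rather than three hand-written sentences.
KNOWLEDGE_BASE = [
    "CCO is the SMILES string for ethanol.",
    "c1ccccc1 is the SMILES string for benzene, an aromatic ring.",
    "A hydroxyl group can donate and accept hydrogen bonds.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())


def score(query, passage):
    """Count tokens shared by query and passage (multiset intersection)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(passage))
    return sum(shared.values())


def retrieve(query, k=2):
    """Return the k passages with the highest token overlap with the query."""
    ranked = sorted(KNOWLEDGE_BASE, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(question, k=2):
    """Prepend retrieved passages to the question (assumed prompt template)."""
    context = "\n".join(f"- {p}" for p in retrieve(question, k))
    return f"Background knowledge:\n{context}\n\nQuestion: {question}\nAnswer:"


question = "What molecule does the SMILES string CCO describe?"
prompt = build_prompt(question)
print(prompt)
```

The resulting string would be passed to an LLM unchanged. Token overlap is used here only because it needs no external dependencies; a dense or domain-tuned retriever could be swapped in behind the same `retrieve` interface.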
Findings
DRAK exceeds previously reported results on six molecular tasks in the Mol-Instructions dataset.
DRAK demonstrates strong reasoning capabilities in molecular analysis.
The framework is adaptable to various knowledge-intensive tasks.
Abstract
Large Language Models (LLMs) encounter challenges with the unique syntax of specific domains, such as biomolecules. Existing fine-tuning or modality alignment techniques struggle to bridge the domain knowledge gap and understand complex molecular data, limiting LLMs' progress in specialized fields. To overcome these limitations, we propose an expandable and adaptable non-parametric knowledge injection framework named Domain-specific Retrieval-Augmented Knowledge (DRAK), aimed at enhancing reasoning capabilities in specific domains. Utilizing knowledge-aware prompts and gold label-induced reasoning, DRAK has developed profound expertise in the molecular domain and the capability to handle a broad spectrum of analysis tasks. We evaluated two distinct forms of DRAK variants, proving that DRAK exceeds previous benchmarks on six molecular tasks within the Mol-Instructions dataset. Extensive…
Taxonomy
Topics: Library Science and Information Systems · Natural Language Processing Techniques
