DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs

Jinzhe Liu; Xiangsheng Huang; Zhuo Chen; Yin Fang

arXiv:2406.18535·q-bio.BM·June 28, 2024

Open Access

TL;DR

DRAK is a domain-specific retrieval-augmented framework that injects specialized knowledge into LLMs to improve their analysis of complex molecular data, exceeding previous benchmarks on six molecular tasks in the Mol-Instructions dataset.

Contribution

The paper introduces DRAK, a non-parametric knowledge injection framework that enhances LLM reasoning in the molecular domain through knowledge-aware prompts and gold label-induced reasoning.

Findings

01. DRAK exceeds previous benchmarks on six molecular tasks in the Mol-Instructions dataset.

02. DRAK demonstrates strong reasoning capabilities in molecular analysis.

03. The framework is adaptable to various knowledge-intensive tasks.

Abstract

Large Language Models (LLMs) encounter challenges with the unique syntax of specific domains, such as biomolecules. Existing fine-tuning or modality alignment techniques struggle to bridge the domain knowledge gap and understand complex molecular data, limiting LLMs' progress in specialized fields. To overcome these limitations, we propose an expandable and adaptable non-parametric knowledge injection framework named Domain-specific Retrieval-Augmented Knowledge (DRAK), aimed at enhancing reasoning capabilities in specific domains. Utilizing knowledge-aware prompts and gold label-induced reasoning, DRAK has developed profound expertise in the molecular domain and the capability to handle a broad spectrum of analysis tasks. We evaluated two distinct forms of DRAK variants, proving that DRAK exceeds previous benchmarks on six molecular tasks within the Mol-Instructions dataset. Extensive…
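
The general idea of non-parametric knowledge injection mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the knowledge base entries, the token-overlap (Jaccard) retriever, the prompt template, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are all assumptions made for illustration. The sketch only shows how retrieved domain facts might be placed into a prompt without changing any model weights.

```python
import re
from typing import List, Set

# Hypothetical, hand-written knowledge base (not taken from the paper).
KNOWLEDGE_BASE: List[str] = [
    "CCO is the SMILES string for ethanol, a simple alcohol.",
    "c1ccccc1 is the SMILES string for benzene, an aromatic ring.",
    "CC(=O)O is the SMILES string for acetic acid, a carboxylic acid.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of crude tokens."""
    return set(re.findall(r"[a-z0-9()=]+", text.lower()))


def retrieve(query: str, k: int = 1) -> List[str]:
    """Return the k entries with the highest Jaccard token overlap.

    Assumption: a real system would likely use a stronger retriever;
    token overlap is used here only to keep the sketch self-contained.
    """
    query_tokens = tokenize(query)

    def score(entry: str) -> float:
        entry_tokens = tokenize(entry)
        union = query_tokens | entry_tokens
        return len(query_tokens & entry_tokens) / len(union) if union else 0.0

    return sorted(KNOWLEDGE_BASE, key=score, reverse=True)[:k]


def build_prompt(question: str, k: int = 1) -> str:
    """Prepend retrieved facts to the question (hypothetical template)."""
    context = "\n".join(f"- {fact}" for fact in retrieve(question, k))
    return f"Domain knowledge:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("Describe the molecule CCO"))
```

Because the knowledge lives in the prompt rather than in the model parameters, the knowledge base in such a setup can be extended or swapped without retraining, which matches the "expandable and adaptable" framing in the abstract.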

Taxonomy

Topics: Library Science and Information Systems · Natural Language Processing Techniques