BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models

Haitao Li; Qingyao Ai; Jia Chen; Qian Dong; Zhijing Wu; Yiqun Liu; Chong Chen; Qi Tian

arXiv:2403.18365·cs.CL·March 28, 2024·1 cites

TL;DR

BLADE is a framework that enhances black-box large language models with small, domain-specific models to improve performance in specialized fields like legal and medical domains, without extensive retraining.

Contribution

BLADE introduces a novel method combining a small domain-specific LM with a general LLM via Bayesian optimization, offering a cost-effective way to adapt LLMs for vertical domains.

Findings

01

Outperforms existing domain adaptation methods on legal and medical benchmarks.

02

Significantly improves domain-specific task accuracy with minimal additional training.

03

Demonstrates cost efficiency compared to continuous pre-training or retrieval augmentation.

Abstract

Large Language Models (LLMs) like ChatGPT and GPT-4 are versatile and capable of addressing a diverse range of tasks. However, general LLMs, which are developed on open-domain data, may lack the domain-specific knowledge essential for tasks in vertical domains, such as legal, medical, etc. To address this issue, previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs. Unfortunately, these strategies are either cost-intensive or unreliable in practical applications. To this end, we present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models. BLADE consists of a black-box LLM and a small domain-specific LM. The small LM preserves domain-specific knowledge and offers specialized insights, while the general LLM contributes robust language…
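
The division of labor described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `answer_with_domain_knowledge`, `small_domain_lm`, `black_box_llm`, the stub models, and the prompt template are all hypothetical, and the sketch assumes the small LM's output is handed to the black-box LLM as extra prompt context. It does not cover how the small LM is trained or the Bayesian optimization step mentioned under Contribution.

```python
from typing import Callable

# Hypothetical type alias: any text-in, text-out model call.
TextModel = Callable[[str], str]


def answer_with_domain_knowledge(
    question: str,
    small_domain_lm: TextModel,
    black_box_llm: TextModel,
) -> str:
    """Combine a small domain LM with a black-box LLM (illustrative only).

    Assumption: the small LM emits domain knowledge for the question, and
    that text is prepended to the prompt sent to the black-box LLM.
    """
    # Step 1: the small domain-specific LM supplies specialized knowledge.
    knowledge = small_domain_lm(question)

    # Step 2: the general LLM is only queried through its text interface,
    # so the knowledge is passed in via the prompt (hypothetical template).
    prompt = (
        "Domain knowledge:\n"
        f"{knowledge}\n\n"
        "Question:\n"
        f"{question}\n\n"
        "Answer the question using the domain knowledge above."
    )
    return black_box_llm(prompt)


# Stub models so the sketch runs without any external service.
def stub_small_lm(question: str) -> str:
    return f"[domain notes relevant to: {question}]"


def stub_black_box_llm(prompt: str) -> str:
    return f"LLM response to a {len(prompt)}-character prompt"


if __name__ == "__main__":
    print(answer_with_domain_knowledge(
        "Is a verbal contract enforceable?", stub_small_lm, stub_black_box_llm
    ))
```

In practice the two stubs would be replaced by a call to a locally hosted small model and a call to the black-box LLM's API; keeping both behind a plain text-to-text callable means the black-box model's weights are never touched.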

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Attention Is All You Need · Layer Normalization · Byte Pair Encoding · Softmax · Dropout · Multi-Head Attention