Reformulating Domain Adaptation of Large Language Models as Adapt-Retrieve-Revise: A Case Study on Chinese Legal Domain

Zhen Wan; Yating Zhang; Yexiang Wang; Fei Cheng; Sadao Kurohashi

arXiv:2310.03328 · cs.CL · August 27, 2024

TL;DR

This paper proposes Adapt-Retrieve-Revise, a domain adaptation framework for GPT-4 that enhances accuracy in Chinese legal tasks by combining a smaller adapted model with external evidence retrieval and revision, reducing hallucinations.

Contribution

It introduces a novel adaptation framework reformulating domain adaptation as an adapt-retrieve-revise process, effectively improving GPT-4's performance in specialized Chinese legal tasks.

Findings

01. Improves accuracy by 33.3% over direct generation by GPT-4 in zero-shot Chinese legal tasks.

02. Outperforms two stronger retrieval-based baselines by 15.4% and 23.9%.

03. Reduces hallucinations in GPT-4 outputs.

Abstract

While large language models (LLMs) like GPT-4 have recently demonstrated astonishing zero-shot capabilities in general domain tasks, they often generate content with hallucinations in specific domains such as Chinese law, hindering their application in these areas. This is typically due to the absence of training data that encompasses such a specific domain, preventing GPT-4 from acquiring in-domain knowledge. A pressing challenge is that it's not plausible to continue training LLMs of such scale on in-domain data. This paper introduces a simple and effective domain adaptation framework for GPT-4 by reformulating generation as an adapt-retrieve-revise process. The initial step is to adapt an affordable 7B LLM to the target domain by continuing learning on in-domain data. When solving a task, we leverage the adapted LLM to generate a draft answer given a task query.…
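
The three stages named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `draft_model` and `revise_model` are hypothetical callables standing in for the adapted 7B LLM and GPT-4, the token-overlap scorer is a toy stand-in for a real retriever, and using the draft answer as the retrieval query is an assumption, since the abstract shown here is truncated before it describes the retrieve and revise steps.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    text: str
    score: float


def overlap_score(query: str, passage: str) -> float:
    """Toy similarity: Jaccard overlap of whitespace-separated tokens."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    union = q | p
    return len(q & p) / len(union) if union else 0.0


def retrieve(draft: str, knowledge_base: List[str], top_k: int = 2) -> List[Evidence]:
    """Rank knowledge-base passages by similarity to the draft (assumed query)."""
    scored = [Evidence(p, overlap_score(draft, p)) for p in knowledge_base]
    return sorted(scored, key=lambda e: e.score, reverse=True)[:top_k]


def build_revision_prompt(query: str, draft: str, evidence: List[Evidence]) -> str:
    """Concatenate the query, the draft, and the retrieved evidence into one prompt."""
    evidence_block = "\n".join(f"[{i + 1}] {e.text}" for i, e in enumerate(evidence))
    return (
        f"Question: {query}\n"
        f"Draft answer: {draft}\n"
        f"Evidence:\n{evidence_block}\n"
        "Revise the draft so that it is consistent with the evidence."
    )


def adapt_retrieve_revise(
    query: str,
    draft_model: Callable[[str], str],   # hypothetical: the adapted small LLM
    revise_model: Callable[[str], str],  # hypothetical: the large general LLM
    knowledge_base: List[str],
    top_k: int = 2,
) -> str:
    draft = draft_model(query)                        # adapted model writes a draft
    evidence = retrieve(draft, knowledge_base, top_k)  # retrieve supporting passages
    prompt = build_revision_prompt(query, draft, evidence)
    return revise_model(prompt)                       # large model revises the draft
```

Passing the two models in as plain callables keeps the sketch independent of any particular model API; whitespace tokenization is used only for illustration and would need replacing for Chinese text.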

Code & Models

Repositories

yukinowan/adapt-retrive-revise
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Dropout · Dense Connections · Linear Layer · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection · Layer Normalization