Legal Documents Drafting with Fine-Tuned Pre-Trained Large Language Model
Chun-Hsien Lin, Pu-Jen Cheng

TL;DR
This paper demonstrates that fine-tuning pre-trained large language models on annotation-free legal documents enables effective legal document drafting, and that doing the fine-tuning on a local computer preserves privacy and security.
Contribution
It introduces a method to fine-tune large language models on unlabeled legal texts without segmentation, addressing data scarcity and privacy concerns.
Findings
Effective legal draft generation without manual annotations
Fine-tuning on local computers enhances privacy and security
Achieves comparable performance to traditional supervised methods
Abstract
With the development of large language models (LLMs), fine-tuning a pre-trained LLM has become a mainstream paradigm for solving downstream natural language processing tasks. However, training a language model for the legal field requires a large number of legal documents so that the model can learn legal terminology and the particular format of legal documents. Typical NLP approaches rely on large manually annotated datasets for training, but such datasets are difficult to obtain in the legal field, which restricts the use of these approaches for drafting legal documents. The experimental results of this paper show that not only can we leverage a large number of annotation-free legal documents without Chinese word segmentation to fine-tune a large-scale language model, but more…
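The idea of training on annotation-free text without word segmentation can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the character-level vocabulary is a stand-in for whatever tokenizer the paper uses, and the function names, the sample sentence, and `block_size` are all hypothetical. The point it shows is that raw text alone is enough for causal language-model training, because the target sequence is simply the input shifted by one position.

```python
from typing import Dict, List, Tuple


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Map each distinct character to an integer id (0 is reserved for unknowns)."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Encode text character by character; no word segmentation is performed."""
    return [vocab.get(ch, 0) for ch in text]


def make_causal_lm_examples(
    ids: List[int], block_size: int
) -> List[Tuple[List[int], List[int]]]:
    """Cut a token stream into (input, target) pairs for next-token prediction.

    Each target is its input shifted left by one token, so no manual
    annotation is needed. A trailing remainder too short to fill a
    block is dropped.
    """
    examples = []
    for start in range(0, len(ids) - block_size, block_size):
        window = ids[start:start + block_size + 1]
        examples.append((window[:-1], window[1:]))
    return examples


# Hypothetical sample sentence standing in for an unlabeled legal document.
corpus = ["被告意圖為自己不法之所有"]
vocab = build_vocab(corpus)
token_ids = encode(corpus[0], vocab)
examples = make_causal_lm_examples(token_ids, block_size=4)
```

In an actual fine-tuning run, the pre-trained model's own tokenizer would replace `build_vocab` and `encode`; the shifted-target construction stays the same.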
Code & Models
- 🤗 RichardErkhov/jslin09_-_bloom-560m-finetuned-fraud-4bits (model) · 4 dl
- 🤗 RichardErkhov/jslin09_-_bloom-560m-finetuned-fraud-8bits (model) · 1 dl
- 🤗 RichardErkhov/jslin09_-_gemma2-2b-fraud-gguf (model) · 162 dl
- 🤗 jslin09/llama-3.2-1b-fraud (model) · 1 dl
- 🤗 RichardErkhov/jslin09_-_llama-3.2-1b-fraud-awq (model)
- 🤗 RichardErkhov/jslin09_-_gemma2-2b-fraud-4bits (model)
- 🤗 RichardErkhov/jslin09_-_gemma2-2b-fraud-8bits (model)
- 🤗 RichardErkhov/jslin09_-_llama-3.2-1b-fraud-gguf (model) · 225 dl · ♡ 1
Taxonomy
Topics: Artificial Intelligence in Law · Natural Language Processing Techniques
