Efficient Finetuning Large Language Models For Vietnamese Chatbot

Vu-Thuan Doan, Quoc-Truong Truong, Duc-Vu Nguyen, Vinh-Tiep Nguyen, and Thuy-Ngan Nguyen Luu

arXiv:2309.04646·cs.CL·September 12, 2023

Open Access

TL;DR

This paper introduces a cost-effective method for fine-tuning large language models to create Vietnamese chatbots. It combines instruction datasets with parameter-efficient tuning and reports a 20-30% performance improvement over the original models.

Contribution

It presents the first Vietnamese instruction-following datasets and demonstrates effective fine-tuning of open-source LLMs using LoRA, improving chatbot performance by 20-30%.

Findings

01. 20-30% performance improvement over the original models

02. First Vietnamese instruction datasets created

03. Effective use of LoRA for cost-efficient tuning

Abstract

Large language models (LLMs) such as GPT-4, PaLM, and LLaMA have been shown to achieve remarkable performance across a variety of natural language tasks. Recent advances in instruction tuning give LLMs the ability to follow users' instructions and produce human-like responses. However, the high costs of training and deploying LLMs pose challenges to academic research. Furthermore, the availability of pretrained LLMs and instruction-tuning datasets for the Vietnamese language is limited. To tackle these concerns, we leverage large-scale instruction-following datasets from open-source projects, namely Alpaca, GPT4All, and Chat-Doctor, which cover the general domain and the specific medical domain. To the best of our knowledge, these are the first instruction datasets for Vietnamese. Subsequently, we utilize parameter-efficient tuning through Low-Rank Adaptation (LoRA) on…
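As a rough illustration of why LoRA counts as parameter-efficient, the sketch below implements the standard LoRA forward pass, h = W0·x + (alpha / r)·B·A·x, in plain Python. The frozen weight W0 stays fixed and only the two small matrices A and B are trained. The helper names (`matvec`, `lora_forward`, `trainable_params`) and the layer size and rank in the usage lines are hypothetical choices for illustration; the excerpt above does not state the paper's actual hyperparameters.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, w0, a, b, alpha):
    """Standard LoRA forward pass for one linear layer.

    w0: frozen pretrained weight, shape (d_out, d_in)
    a:  trainable down-projection, shape (r, d_in)
    b:  trainable up-projection, shape (d_out, r)
    alpha: scaling constant; the update is scaled by alpha / r
    """
    r = len(a)                           # rank of the low-rank update
    base = matvec(w0, x)                 # frozen path: W0 x
    low_rank = matvec(b, matvec(a, x))   # adapter path: B (A x)
    scale = alpha / r
    return [h + scale * u for h, u in zip(base, low_rank)]


def trainable_params(d_out, d_in, r):
    """Number of trainable parameters in A and B for one adapted layer."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical sizes, not taken from the paper.
    d_out, d_in, r = 4096, 4096, 8
    full = d_out * d_in
    lora = trainable_params(d_out, d_in, r)
    print(f"full fine-tuning: {full} parameters per layer")
    print(f"LoRA (r={r}): {lora} parameters per layer ({100 * lora / full:.2f}%)")
```

In the usual LoRA setup, B is initialized to zeros, so the adapted layer starts out identical to the pretrained one and the low-rank update is learned from there.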

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Adam · Byte Pair Encoding · Softmax · Dropout · Label Smoothing · Absolute Position Encodings