Adapting Chat Language Models Using Only Target Unlabeled Language Data
Atsuki Yamaguchi, Terufumi Morishita, Aline Villavicencio, Nikolaos Aletras

TL;DR
ElChat is a novel method for adapting chat language models directly on unlabeled target data, avoiding the need for a base model and enhancing language, safety, and instruction-following capabilities.
Contribution
ElChat introduces a new approach that directly adapts chat models on unlabeled data, outperforming previous methods that rely on base models and weight differences.
Findings
ElChat achieves superior performance in language adaptation.
It maintains robust chat abilities and safety standards.
It outperforms previous conversion-based methods.
Abstract
Vocabulary expansion (VE) is the de-facto approach to language adaptation of large language models (LLMs) by adding new tokens and continuing pre-training on target data. While this is effective for base models trained on unlabeled data, it poses challenges for chat models trained to follow instructions through labeled conversation data. Directly adapting the latter with VE on target unlabeled data may result in forgetting chat abilities. While ideal, target chat data is often unavailable or costly to create for low-resource languages, and machine-translated alternatives are not always effective. To address this issue, previous work proposed using a base and chat model from the same family. This method first adapts the base LLM with VE on target unlabeled data and then converts it to a chat model by adding a chat vector (CV) derived from the weight difference between the source base and…
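The conversion step the abstract attributes to previous work (not ElChat itself) can be sketched as plain weight arithmetic. This is a minimal illustration, not the authors' implementation. It assumes the chat vector is the per-parameter difference between a chat model and a base model from the same family, as the preceding sentence of the abstract suggests. The names `chat_vector` and `apply_chat_vector`, the toy dictionaries standing in for model weights, and the rule of skipping parameters whose shapes no longer match after vocabulary expansion are all hypothetical.

```python
from typing import Dict, List

# Toy stand-in for model weights: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def chat_vector(base: Weights, chat: Weights) -> Weights:
    """Per-parameter difference (chat - base) for parameters of equal size."""
    return {
        name: [c - b for c, b in zip(chat[name], base[name])]
        for name in base
        if name in chat and len(chat[name]) == len(base[name])
    }


def apply_chat_vector(adapted_base: Weights, cv: Weights) -> Weights:
    """Add the chat vector to an adapted base model.

    Assumption: parameters whose size changed (for example, embeddings
    resized by vocabulary expansion) are copied through unchanged.
    """
    merged: Weights = {}
    for name, values in adapted_base.items():
        delta = cv.get(name)
        if delta is not None and len(delta) == len(values):
            merged[name] = [v + d for v, d in zip(values, delta)]
        else:
            merged[name] = list(values)
    return merged


# Hypothetical toy weights: one ordinary layer and one embedding table.
source_base = {"layer.w": [1.0, 2.0], "embed": [0.5, 0.5]}
source_chat = {"layer.w": [1.5, 1.0], "embed": [0.5, 1.0]}
# After vocabulary expansion the embedding table has grown by one entry.
adapted_base = {"layer.w": [2.0, 2.0], "embed": [0.25, 0.5, 0.75]}

cv = chat_vector(source_base, source_chat)
merged = apply_chat_vector(adapted_base, cv)
print(merged)
```

In this toy run the ordinary layer receives the chat delta, while the resized embedding table is left as adapted. How mismatched parameters are actually handled is a design choice this sketch does not settle.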
Code & Models
- 🤗 atsuki-yamaguchi/Qwen2.5-7B-Instruct-bn-madlad-mean-tuned (model, 1 download)
- 🤗 atsuki-yamaguchi/Llama-3.1-8B-Instruct-am-madlad-mean-tuned (model, 1 download)
- 🤗 atsuki-yamaguchi/Qwen2.5-7B-Instruct-te-lapt-madlad (model)
- 🤗 atsuki-yamaguchi/Qwen2.5-7B-Instruct-ta-madlad-mean-tuned (model, 3 downloads)
- 🤗 atsuki-yamaguchi/Qwen2.5-7B-Instruct-ta-lapt-madlad (model, 2 downloads)
- 🤗 atsuki-yamaguchi/Llama-3.1-8B-Instruct-bn-madlad-mean-tuned (model, 1 download)
- 🤗 atsuki-yamaguchi/Qwen2.5-7B-Instruct-si-lapt-madlad (model)
- 🤗 atsuki-yamaguchi/Llama-3.1-8B-Instruct-gu-madlad-mean-tuned (model, 1 download)
- 🤗 atsuki-yamaguchi/Llama-3.1-8B-Instruct-my-madlad-mean-tuned (model, 3 downloads)
- 🤗 atsuki-yamaguchi/Qwen2.5-7B-Instruct-bn-lapt-madlad (model)
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Balanced Selection
