Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment

Janghwan Lee; Seongmin Park; Sukjin Hong; Minsoo Kim; Du-Seong Chang; Jungwook Choi

arXiv:2407.03051·cs.CL·July 19, 2024

Open Access

TL;DR

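This paper introduces QDPO (quantization-aware direct preference optimization), a preference alignment method that improves the conversational abilities of quantized large language models by aligning them with their full-precision counterparts, reducing the quality cost of the efficiency gained through quantization.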

Contribution

The paper presents a new quantization-aware preference optimization technique that improves conversational performance of quantized LLMs beyond existing methods.
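This page does not state the QDPO objective, so the following is only a minimal sketch of the standard direct preference optimization (DPO) loss for a single preference pair. It assumes, for illustration, that the preferred response comes from the full-precision model and the dispreferred response from the quantized model. The function names, the log-probability values, and `beta` are hypothetical and not taken from the paper.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_lp: float, policy_rejected_lp: float,
             ref_chosen_lp: float, ref_rejected_lp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is a sequence log-probability (sum of token log-probs).
    Assumption for this sketch: "chosen" is a full-precision model response
    and "rejected" is a quantized model response to the same prompt.
    """
    chosen_logratio = policy_chosen_lp - ref_chosen_lp
    rejected_logratio = policy_rejected_lp - ref_rejected_lp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) is the same as softplus(-margin)
    return softplus(-margin)


# Hypothetical sequence log-probabilities, made up for the example.
# Policy identical to the reference: the margin is zero.
loss_before = dpo_loss(-12.0, -10.0, -12.0, -10.0)
# Policy has shifted probability mass toward the chosen response.
loss_after = dpo_loss(-10.0, -12.0, -12.0, -10.0)
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference does, which is the sense in which preference optimization could pull a quantized model toward full-precision behavior.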

Findings

01

QDPO outperforms PTQ and knowledge distillation in conversational tasks.

02

Evaluations on two instruction-tuned LLMs in various languages show the improvement holds across models and languages.

03

Recovers conversational quality in quantized models, which retain the efficiency benefits of quantization.

Abstract

The rapid advancement of large language models (LLMs) has facilitated their transformation into conversational chatbots that can grasp contextual nuances and generate pertinent sentences, closely mirroring human values through advanced techniques such as instruction tuning and reinforcement learning from human feedback (RLHF). However, the computational efficiency required for LLMs, achieved through techniques like post-training quantization (PTQ), presents challenges such as token-flipping that can impair chatbot performance. In response, we propose a novel preference alignment approach, quantization-aware direct preference optimization (QDPO), that aligns quantized LLMs with their full-precision counterparts, improving conversational abilities. Evaluated on two instruction-tuned LLMs in various languages, QDPO demonstrated superior performance in improving conversational abilities…
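The abstract names token-flipping without defining it on this page. As a hedged illustration, the sketch below assumes it refers to quantization error changing which token has the highest logit. It applies a simple per-row round-to-nearest weight quantizer to a toy output layer; the weights, hidden state, bit width, and helper names are all invented for the example and do not reproduce the paper's setup.

```python
def quantize_row(row, bits):
    """Symmetric round-to-nearest quantization of one weight row."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in row) / qmax
    return [round(w / scale) * scale for w in row]


def logits(weights, hidden):
    """One logit per vocabulary row: dot product with the hidden state."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Toy output projection: one row per vocabulary token (hypothetical values).
weights = [
    [0.50, 0.24],  # token 0
    [0.70, 0.02],  # token 1
    [0.10, 0.10],  # token 2
]
hidden = [1.0, 1.0]

fp_logits = logits(weights, hidden)
q_weights = [quantize_row(row, bits=3) for row in weights]
q_logits = logits(q_weights, hidden)

fp_token = argmax(fp_logits)
q_token = argmax(q_logits)
flipped = fp_token != q_token
```

In this toy case the two leading logits are close in full precision, so a small rounding error in the weights is enough to change the greedy token choice. In autoregressive generation, one such change alters every later step, since the flipped token becomes part of the context.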

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems