CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment
Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu

TL;DR
This paper introduces CLHA, a straightforward contrastive learning framework that improves human alignment of large language models by dynamically assessing data quality and adjusting training, leading to superior alignment performance.
Contribution
The paper proposes a novel contrastive learning framework with a unique rescoring strategy and adaptive loss functions to enhance human preference alignment in LLMs, simplifying the training process.
Findings
CLHA outperforms existing algorithms in reward scores.
CLHA achieves higher automatic evaluation scores.
CLHA receives better human assessment ratings.
Abstract
Reinforcement learning from human feedback (RLHF) is a crucial technique in aligning large language models (LLMs) with human preferences, ensuring these LLMs behave in beneficial and comprehensible ways to users. However, a longstanding challenge in human alignment techniques based on reinforcement learning lies in their inherent complexity and difficulty in training. To address this challenge, we present a simple yet effective Contrastive Learning Framework for Human Alignment (CLHA) to align LLMs with human preferences directly. CLHA employs a novel rescoring strategy to evaluate the noise within the data by considering its inherent quality and dynamically adjusting the training process. Simultaneously, CLHA utilizes pairwise contrastive loss and adaptive supervised fine-tuning loss to adaptively modify the likelihood of generating responses, ensuring enhanced alignment with human…
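The abstract names two training signals, a pairwise contrastive loss and an adaptive supervised fine-tuning loss, without giving their formulas here. The sketch below is only an illustration of how two such terms could be combined, not the paper's formulation. It assumes that each response is summarized by its sequence log-likelihood under the model, that the contrastive term is a standard logistic pairwise loss, and that the "adaptive" part is a hypothetical per-example `quality_weight` in [0, 1] (standing in for whatever the rescoring step produces). All function and parameter names are invented for this example.

```python
import math


def pairwise_contrastive_loss(logp_preferred: float, logp_rejected: float) -> float:
    """Logistic pairwise loss on two sequence log-likelihoods.

    Assumed form: -log(sigmoid(logp_preferred - logp_rejected)).
    It is small when the preferred response is the more likely one.
    """
    margin = logp_preferred - logp_rejected
    # Numerically stable softplus(-margin) = log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def adaptive_sft_loss(logp_preferred: float, quality_weight: float) -> float:
    """Negative log-likelihood of the preferred response, scaled by a
    hypothetical data-quality weight in [0, 1] (an assumption, not from the paper)."""
    return -quality_weight * logp_preferred


def combined_loss(
    logp_preferred: float,
    logp_rejected: float,
    quality_weight: float,
    sft_coeff: float = 1.0,  # hypothetical mixing coefficient
) -> float:
    """Sum of the two illustrative terms for a single preference pair."""
    contrastive = pairwise_contrastive_loss(logp_preferred, logp_rejected)
    sft = adaptive_sft_loss(logp_preferred, quality_weight)
    return contrastive + sft_coeff * sft


if __name__ == "__main__":
    # A pair the model already ranks correctly vs. one it ranks incorrectly.
    print(combined_loss(-1.0, -5.0, quality_weight=0.9))
    print(combined_loss(-5.0, -1.0, quality_weight=0.9))
```

Under these assumptions, a low `quality_weight` reduces how strongly a noisy example pulls the model toward its "preferred" response, while the contrastive term still penalizes ranking the rejected response higher.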
Taxonomy
Topics: Social Robot Interaction and HRI · Technology Use by Older Adults
Methods: Contrastive Learning · ALIGN
