CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment

Feiteng Fang; Liang Zhu; Min Yang; Xi Feng; Jinchang Hou; Qixuan Zhao; Chengming Li; Xiping Hu; Ruifeng Xu

arXiv:2403.16649·cs.AI·March 27, 2024·1 cites

TL;DR

This paper introduces CLHA, a straightforward contrastive learning framework that improves human alignment of large language models by dynamically assessing data quality and adjusting training, leading to superior alignment performance.

Contribution

The paper proposes a novel contrastive learning framework with a unique rescoring strategy and adaptive loss functions to enhance human preference alignment in LLMs, simplifying the training process.

Findings

01. CLHA outperforms existing algorithms in reward scores.

02. CLHA achieves higher automatic evaluation scores.

03. CLHA receives better human assessment ratings.

Abstract

Reinforcement learning from human feedback (RLHF) is a crucial technique in aligning large language models (LLMs) with human preferences, ensuring these LLMs behave in beneficial and comprehensible ways to users. However, a longstanding challenge in human alignment techniques based on reinforcement learning lies in their inherent complexity and difficulty in training. To address this challenge, we present a simple yet effective Contrastive Learning Framework for Human Alignment (CLHA) to align LLMs with human preferences directly. CLHA employs a novel rescoring strategy to evaluate the noise within the data by considering its inherent quality and dynamically adjusting the training process. Simultaneously, CLHA utilizes pairwise contrastive loss and adaptive supervised fine-tuning loss to adaptively modify the likelihood of generating responses, ensuring enhanced alignment with human…
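The abstract names two training signals, a pairwise contrastive loss and a supervised fine-tuning loss, without giving their formulas. The sketch below is a generic illustration of how such a pair of losses can be combined on sequence log-likelihoods; it is not the paper's formulation. The hinge form, the length normalization, and the names `margin` and `sft_weight` are all assumptions introduced here for illustration, and the rescoring strategy is not modeled.

```python
def sequence_logprob(token_logprobs):
    """Length-normalized log-likelihood: mean of per-token log-probabilities."""
    return sum(token_logprobs) / len(token_logprobs)


def pairwise_contrastive_loss(logp_preferred, logp_rejected, margin=0.0):
    """Hinge-style ranking loss (assumed form, not taken from the paper).

    Zero once the preferred response is more likely than the rejected
    one by at least `margin`; grows linearly otherwise.
    """
    return max(0.0, margin - (logp_preferred - logp_rejected))


def weighted_sft_loss(logp_preferred, sft_weight=1.0):
    """Negative log-likelihood of the preferred response, scaled by a weight.

    `sft_weight` is a hypothetical stand-in for an adaptive coefficient.
    """
    return -sft_weight * logp_preferred


def combined_loss(pref_token_logprobs, rej_token_logprobs,
                  margin=0.0, sft_weight=1.0):
    """Sum of the ranking term and the weighted likelihood term."""
    lp = sequence_logprob(pref_token_logprobs)
    lr = sequence_logprob(rej_token_logprobs)
    return (pairwise_contrastive_loss(lp, lr, margin)
            + weighted_sft_loss(lp, sft_weight))
```

In this sketch the ranking term pushes the preferred response above the rejected one, while the likelihood term keeps the model generating the preferred response at all. In practice the per-token log-probabilities would come from the language model being trained.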

Code & Models

Repositories

calubkk/clha (PyTorch, official)

Taxonomy

Topics: Social Robot Interaction and HRI · Technology Use by Older Adults

Methods: Contrastive Learning · ALIGN