Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

Amir Saeidi; Shivanshu Verma; Md Nayem Uddin; Chitta Baral

arXiv:2404.14723 · cs.CL · February 11, 2025

TL;DR

This paper evaluates DPO and its variants for aligning large language models with human preferences across multiple tasks, revealing their strengths and limitations in different configurations and training data sizes.

Contribution

It provides a comprehensive evaluation of DPO variants with various training setups across diverse benchmarks, highlighting their effectiveness and constraints.

Findings

01

Alignment methods often reach near-optimal performance even when trained on smaller subsets of the training data.

02

Alignment methods yield limited improvements on complex reasoning tasks, although they do improve mathematical problem-solving.

03

Starting from an instruction-tuned model improves truthfulness.

Abstract

This study evaluates Direct Preference Optimization (DPO) and its variants for aligning Large Language Models (LLMs) with human preferences, testing three configurations: (1) with Supervised Fine-Tuning (SFT), (2) without SFT, and (3) without SFT but using an instruction-tuned model. We further investigate how training set size influences model performance. Our evaluation spans 13 benchmarks covering dialogue, reasoning, mathematical problem-solving, question answering, truthfulness, MT-Bench, Big Bench, and the Open LLM Leaderboard. We find that: (1) alignment methods often achieve near-optimal performance even with smaller subsets of training data; (2) although they offer limited improvements on complex reasoning tasks, they enhance mathematical problem-solving; and (3) using an instruction-tuned model improves truthfulness. These insights highlight the conditions under which…
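
For readers unfamiliar with the objective being evaluated, the following is a minimal sketch of the standard per-example DPO loss, not the authors' training code. It assumes the inputs are summed sequence log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response under the policy being trained and under a frozen reference model. The function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions and are not taken from the paper or its repository.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio))).

    All names and the default beta are hypothetical, chosen for illustration.
    """
    # Log-ratio of policy to reference for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; positive when the policy favors the
    # chosen response more strongly than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

In this sketch, a policy identical to the reference gives a margin of zero and a loss of log 2; the loss falls below that value as the policy shifts probability toward the chosen response relative to the reference. In the "with SFT" configuration described above, the reference would typically be the SFT model, which is a common setup rather than a detail confirmed by this page.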

Code & Models

Repositories

sahsaeedi/triple-preference-optimization
pytorch

Taxonomy

Topics: Reinforcement Learning in Robotics · Business Process Modeling and Analysis

Methods: Direct Preference Optimization · Supervised Fine-Tuning