Multi-Response Preference Optimization with Augmented Ranking Dataset

Hansle Gwon; Imjin Ahn; Young-Hak Kim; Sanghyun Park; Tae Joon Jun

arXiv:2412.07812 · cs.CL · December 12, 2024

Open Access

TL;DR

This paper introduces a novel dataset augmentation method and a multi-response training approach for preference optimization in LLMs, enhancing their ability to learn from multiple human preferences simultaneously.

Contribution

It proposes a new dataset augmentation technique and a multi-response training method for preference optimization, addressing dataset quality sensitivity and enabling multi-response learning.

Findings

01. Improved performance in preference optimization tasks.

02. Effective learning of multiple responses simultaneously.

03. Enhanced dataset robustness and quality.

Abstract

Recent advancements in Large Language Models (LLMs) have been remarkable, with new models consistently surpassing their predecessors. These advancements are underpinned by extensive research on various training mechanisms. Among these, Preference Optimization has played a significant role in improving the performance of LLMs by incorporating human preferences into the training process. However, constructing preference optimization datasets is challenging and the optimization process is highly sensitive to the dataset quality. In this study, we propose a novel approach to augment Preference Optimization datasets. Additionally, we introduce a Multi-response-based Preference Optimization training method that enables the simultaneous learning of multiple responses.
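The abstract does not spell out the training objective, so the following is only an illustrative sketch of what learning from several ranked responses at once can look like. It is not the authors' method. It assumes a generic Plackett-Luce listwise ranking loss over DPO-style implicit rewards; the function names, the `beta` value, and the toy numbers are all hypothetical.

```python
import math
from typing import Sequence


def implicit_reward(policy_logp: float, ref_logp: float, beta: float = 0.1) -> float:
    """DPO-style implicit reward: beta * (log pi(y|x) - log pi_ref(y|x)).

    Assumption: rewards are defined this way; the paper may use another form.
    """
    return beta * (policy_logp - ref_logp)


def listwise_ranking_loss(rewards_best_first: Sequence[float]) -> float:
    """Plackett-Luce negative log-likelihood of one ranking.

    rewards_best_first[k] is the reward of the response ranked k-th,
    with index 0 being the most preferred response.
    """
    loss = 0.0
    # At each position, the chosen response competes against all responses
    # ranked at or below it; the last position has no competitors left.
    for k in range(len(rewards_best_first) - 1):
        tail = rewards_best_first[k:]
        m = max(tail)  # subtract the max for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(r - m) for r in tail))
        loss += log_norm - rewards_best_first[k]
    return loss


# Toy example: three responses to one prompt, listed best first.
# The (policy, reference) log-probabilities are made-up numbers.
ranked_logps = [(-10.0, -12.0), (-11.0, -11.0), (-9.0, -8.0)]
rewards = [implicit_reward(p, r) for p, r in ranked_logps]
print(listwise_ranking_loss(rewards))
```

With exactly two responses, this loss reduces to the pairwise logistic loss used in standard DPO, which is why a listwise form is a natural way to extend pairwise preference data to more than two responses per prompt.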

Taxonomy

Topics: Technology and Data Analysis · Internet of Things and Social Network Interactions · Korean Urban and Social Studies