Preference Alignment with Flow Matching
Minu Kim, Yongsik Lee, Sehyeok Kang, Jihwan Oh, Song Chong, Se-Young Yun

TL;DR
Preference Flow Matching (PFM) introduces a novel preference-based reinforcement learning framework that directly learns from preference data using flow matching, avoiding extensive fine-tuning and reward modeling.
Contribution
PFM is the first to apply flow matching techniques to preference-based RL, enabling direct preference alignment without fine-tuning pre-trained models.
Findings
Effective preference alignment demonstrated in experiments
Removes the need for explicit or implicit reward function estimation
Compatible with black-box models such as GPT-4
Abstract
We present Preference Flow Matching (PFM), a new framework for preference-based reinforcement learning (PbRL) that streamlines the integration of preferences into an arbitrary class of pre-trained models. Existing PbRL methods require fine-tuning pre-trained models, which presents challenges such as scalability, inefficiency, and the need for model modifications, especially with black-box APIs like GPT-4. In contrast, PFM utilizes flow matching techniques to directly learn from preference data, thereby reducing the dependency on extensive fine-tuning of pre-trained models. By leveraging flow-based models, PFM transforms less preferred data into preferred outcomes, and effectively aligns model outputs with human preferences without relying on explicit or implicit reward function estimation, thus avoiding common issues like overfitting in reward models. We provide theoretical insights…
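The abstract's central idea, a flow that carries less preferred outputs toward preferred ones, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes 1-D outputs, independently sampled less preferred and preferred points, a straight-line interpolation path, and a hypothetical linear vector field `v(t, y) = w0 + w1*y + w2*t` trained by plain SGD. All function names and hyperparameters below are illustrative.

```python
import random

random.seed(0)

# Toy 1-D "outputs" (assumption): less preferred samples cluster near -2,
# preferred samples cluster near +2.
def sample_less_preferred():
    return random.gauss(-2.0, 0.3)

def sample_preferred():
    return random.gauss(2.0, 0.3)

def features(t, y):
    # Features of the hypothetical linear vector field v(t, y) = w0 + w1*y + w2*t.
    return (1.0, y, t)

def velocity(w, t, y):
    return sum(wi * fi for wi, fi in zip(w, features(t, y)))

def train(steps=20000, lr=0.01):
    """Fit the vector field by regressing onto straight-path velocities."""
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        y0, y1 = sample_less_preferred(), sample_preferred()
        t = random.random()
        y_t = (1.0 - t) * y0 + t * y1   # point on the straight path at time t
        target = y1 - y0                # velocity of that straight path
        phi = features(t, y_t)
        err = sum(wi * fi for wi, fi in zip(w, phi)) - target
        # One SGD step on the squared error 0.5 * err**2.
        w = [wi - lr * err * fi for wi, fi in zip(w, phi)]
    return w

def transport(w, y, n_steps=20):
    """Euler-integrate dy/dt = v(t, y) from t = 0 to t = 1."""
    dt = 1.0 / n_steps
    for k in range(n_steps):
        y += dt * velocity(w, k * dt, y)
    return y

weights = train()
inputs = [sample_less_preferred() for _ in range(200)]
outputs = [transport(weights, y) for y in inputs]
mean_in = sum(inputs) / len(inputs)
mean_out = sum(outputs) / len(outputs)
print(f"mean before: {mean_in:.2f}, mean after: {mean_out:.2f}")
```

In this sketch the pre-trained model is never touched: only the small vector field is trained, and it is applied to samples after they are generated. That is the property that would make such an approach usable with black-box generators, although a realistic setup would need a far more expressive vector field and outputs in a suitable continuous space.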
Taxonomy
Topics: Data Stream Mining Techniques · Data Mining Algorithms and Applications · Time Series Analysis and Forecasting
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections
