Self-supervised Attribute-aware Dynamic Preference Ranking Alignment
Hongyu Yang, Qi Zhao, Zhenhua Hu, Rui Li

TL;DR
This paper introduces SeAdpra, a self-supervised, attribute-aware method for dynamic preference ranking that improves list-level alignment in response generation without relying on costly human annotations.
Contribution
It proposes a novel self-supervised approach using Attribute-Perceptual Distance Factors for fine-grained preference learning and introduces scalable evaluation metrics and a challenging dataset.
Findings
SeAdpra outperforms existing methods on multiple datasets.
It achieves better alignment with human preferences.
The approach demonstrates strong generalizability across domains.
Abstract
Reinforcement Learning from Human Feedback and its variants excel in aligning with human intentions to generate helpful, harmless, and honest responses. However, most of them rely on costly human-annotated pairwise comparisons for supervised alignment, which is not suitable for list-level scenarios, such as community question answering. Additionally, human preferences are influenced by multiple intrinsic factors in responses, leading to decision-making inconsistencies. Therefore, we propose Self-supervised Attribute-aware dynamic preference ranking, called SeAdpra. It quantifies preference differences between responses based on Attribute-Perceptual Distance Factors (APDF) and dynamically determines the list-wise alignment order. Furthermore, it achieves fine-grained preference difference learning and enables precise alignment with the…
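The idea of deriving a list-wise order from response attributes and then aligning a model to that order can be illustrated with a minimal sketch. This is not the paper's APDF formulation, which the abstract does not spell out. It assumes, purely for illustration, that each response has a few numeric attribute scores, that their mean stands in for a preference signal, and that alignment is measured with a standard Plackett-Luce list-wise loss. The names `rank_by_attributes` and `listwise_nll` are hypothetical.

```python
import math
from typing import List, Sequence


def rank_by_attributes(attr_scores: Sequence[Sequence[float]]) -> List[int]:
    """Return response indices ordered from most to least preferred.

    Assumption: the mean of each response's attribute scores is used as a
    simple stand-in for an attribute-based preference signal.
    """
    means = [sum(row) / len(row) for row in attr_scores]
    return sorted(range(len(means)), key=lambda i: means[i], reverse=True)


def listwise_nll(model_scores: Sequence[float], order: Sequence[int]) -> float:
    """Plackett-Luce negative log-likelihood of `order` under `model_scores`.

    At each position, the chosen response competes (via softmax) against all
    responses that have not been placed yet.
    """
    nll = 0.0
    for k in range(len(order) - 1):
        remaining = [model_scores[i] for i in order[k:]]
        m = max(remaining)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in remaining))
        nll -= model_scores[order[k]] - log_z
    return nll


# Three candidate answers, each with two hypothetical attribute scores.
attrs = [[0.2, 0.4], [0.9, 0.7], [0.5, 0.5]]
order = rank_by_attributes(attrs)

# The loss is lower when the model's scores agree with the derived order.
agreeing = listwise_nll([0.0, 2.0, 1.0], order)
disagreeing = listwise_nll([2.0, 0.0, 1.0], order)
```

In this sketch no human pairwise labels are needed: the target order comes entirely from the attribute scores, and the loss penalizes a model whose scores rank the list differently.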
Taxonomy
Topics: Data Management and Algorithms · Rough Sets and Fuzzy Logic
