Optimizing Alignment with Less: Leveraging Data Augmentation for Personalized Evaluation
Javad Seraj, Mohammad Mahdi Mohajeri, Mohammad Javad Dousti, Majid Nili Ahmadabadi

TL;DR
This paper introduces a data augmentation method that improves personalized evaluation by open LLMs when data is limited, reporting a 7% improvement in Pearson correlation with reference judges and a 30% improvement over the baseline in mathematical reasoning evaluation.
Contribution
It presents a novel data augmentation technique that enhances open LLMs' ability to align with human preferences in limited data scenarios.
Findings
7% improvement in Pearson correlation with reference judges
30% improvement over baseline in mathematical reasoning evaluation
Effective data selection enhances personalized evaluation accuracy
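The first finding is stated in terms of Pearson correlation between an LLM judge's scores and a reference judge's scores. As a minimal sketch of how such agreement could be measured, the snippet below computes the Pearson coefficient from two score lists. The function name `pearson` and the sample scores are illustrative assumptions, not data or code from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores on a 1-5 scale (made up for illustration only).
reference_scores = [1, 2, 3, 4, 5]  # reference (human) judge
judge_scores = [2, 2, 3, 5, 5]      # LLM judge on the same items

r = pearson(reference_scores, judge_scores)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the LLM judge ranks and spaces items much like the reference judge; comparing this value before and after alignment is one way to read an improvement of the kind listed above.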
Abstract
Automatic evaluation by large language models (LLMs) is a prominent topic today; however, judgment and evaluation tasks are often subjective and influenced by various factors, making adaptation challenging. While many studies demonstrate the capabilities of state-of-the-art proprietary LLMs in comparison to human evaluators, these models often struggle to adapt to reference evaluators over time, a requirement for achieving personalized judgment. Additionally, numerous works have attempted to apply open LLMs as judges or evaluators, but these efforts frequently overlook the limitations of working with scarce data. Personalized judgment is inherently associated with limited-data scenarios, which are common in many real-world problems. Our work aims to present a data augmentation technique to select a more effective sample from limited data in order to align an open LLM with human preference. Our…
Taxonomy
Topics: demographic modeling and climate adaptation
Methods: Balanced Selection · ALIGN
