GEM: Generative Entropy-Guided Preference Modeling for Few-shot Alignment of LLMs
Yiyang Zhao, Huiyu Bai, Xuejiao Zhao

TL;DR
GEM introduces an entropy-guided, self-optimization approach for aligning large language models with human preferences in low-resource and domain-specific settings, reducing reliance on large annotated datasets.
Contribution
The paper proposes a novel generative entropy-guided preference modeling framework that trains LLMs to internalize preference signals without extensive supervision.
Findings
Reports significant performance improvements on benchmarks when only few-shot preference data is available.
Achieves effective alignment on domain-specific tasks such as medical dialogue and mathematical reasoning.
Demonstrates the viability of entropy-based self-evaluation for LLM alignment.
Abstract
Alignment of large language models (LLMs) with human preferences typically relies on supervised reward models or external judges that demand abundant annotations. However, in fields that rely on professional knowledge, such as medicine and law, such large-scale preference labels are often unachievable. In this paper, we propose a generative entropy-guided preference modeling approach named GEM for LLM alignment in low-resource and domain-specific scenarios. Instead of training a discriminative reward model on preference data, we directly train the LLM to internalize a closed-loop optimization architecture that can extract and exploit the multi-dimensional, fine-grained cognitive signals implicit in human preferences. Specifically, our Cognitive Filtering module, based on entropy theory in decision making, first leverages Chain-of-Thought (CoT) prompting to generate diverse candidate…
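The abstract above is cut off before it explains how entropy is used to score the CoT candidates, so the following is only an illustrative sketch of one common form of entropy-based self-evaluation, not GEM's actual procedure. It assumes each candidate comes with per-token probability distributions and ranks candidates by mean token-level Shannon entropy, lowest (most confident) first. The names `token_entropy`, `mean_entropy`, and `rank_by_entropy`, and the toy distributions, are hypothetical.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(token_distributions):
    """Average per-token entropy over one generated sequence."""
    if not token_distributions:
        raise ValueError("sequence has no tokens")
    total = sum(token_entropy(dist) for dist in token_distributions)
    return total / len(token_distributions)


def rank_by_entropy(candidates):
    """Return candidate names sorted from lowest to highest mean entropy.

    Assumption for illustration: lower entropy is treated as a proxy for a
    more confident candidate. The source does not state this rule.
    """
    return sorted(candidates, key=lambda name: mean_entropy(candidates[name]))


# Toy data: each candidate is a list of per-token probability distributions.
candidates = {
    "confident": [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]],
    "uncertain": [[0.4, 0.3, 0.3], [0.34, 0.33, 0.33]],
}
ranking = rank_by_entropy(candidates)
print(ranking)
```

In practice the distributions would come from the model's own softmax outputs during generation; how GEM turns such signals into preference pairs is not recoverable from the truncated abstract.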
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
