MAPLE: A Framework for Active Preference Learning Guided by Large Language Models

Saaduddin Mahmud; Mason Nakamura; Shlomo Zilberstein

arXiv:2412.07207·cs.LG·December 23, 2024

TL;DR

MAPLE is a novel framework that uses large language models to guide Bayesian active preference learning, reducing human effort and improving learning efficiency through natural language feedback and active query selection.

Contribution

The paper introduces an LLM-guided Bayesian active preference learning framework that models a distribution over preference functions and selects queries for efficiency and interpretability.

Findings

01 MAPLE accelerates preference learning compared to baseline methods.

02 It reduces human supervision effort in preference elicitation.

03 MAPLE achieves high-quality preference inference on real-world benchmarks.

Abstract

The advent of large language models (LLMs) has sparked significant interest in using natural language for preference learning. However, existing methods often suffer from high computational burdens, taxing human supervision, and lack of interpretability. To address these issues, we introduce MAPLE, a framework for large language model-guided Bayesian active preference learning. MAPLE leverages LLMs to model the distribution over preference functions, conditioning it on both natural language feedback and conventional preference learning feedback, such as pairwise trajectory rankings. MAPLE also employs active learning to systematically reduce uncertainty in this distribution and incorporates a language-conditioned active query selection mechanism to identify informative and easy-to-answer queries, thus reducing human burden. We evaluate MAPLE's sample efficiency and preference inference…
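The Bayesian active preference learning loop that the abstract describes can be illustrated with a small sketch. This is not MAPLE's implementation: it omits the LLM and the natural language feedback entirely, and it assumes a linear preference function over hand-made trajectory features, a Bradley-Terry likelihood for pairwise rankings, a weighted set of sampled weight vectors as the posterior, and expected information gain as the query selection criterion. All function names (`bt_prob`, `update_posterior`, `information_gain`, `select_query`) are hypothetical and chosen for illustration only.

```python
import math


def bt_prob(w, fa, fb):
    """Bradley-Terry probability that trajectory a is preferred to b under weights w.

    Assumption: the preference function is linear in the trajectory features.
    """
    diff = sum(wi * (x - y) for wi, x, y in zip(w, fa, fb))
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)  # numerically stable branch for negative diff
    return e / (1.0 + e)


def update_posterior(particles, weights, fa, fb, a_preferred):
    """Bayes update of the particle weights after one pairwise answer."""
    updated = []
    for w, p in zip(particles, weights):
        like = bt_prob(w, fa, fb)
        updated.append(p * (like if a_preferred else 1.0 - like))
    total = sum(updated)
    return [v / total for v in updated]


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) answer."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def information_gain(particles, weights, fa, fb):
    """Expected reduction in answer entropy from asking about the pair (fa, fb)."""
    probs = [bt_prob(w, fa, fb) for w in particles]
    mean_p = sum(p * q for p, q in zip(weights, probs))
    expected_h = sum(p * binary_entropy(q) for p, q in zip(weights, probs))
    return binary_entropy(mean_p) - expected_h


def select_query(particles, weights, candidates):
    """Pick the candidate pair with the highest expected information gain."""
    return max(
        candidates,
        key=lambda pair: information_gain(particles, weights, pair[0], pair[1]),
    )


# Toy usage: two hypotheses about the preference weights, two candidate queries.
particles = [[1.0, 0.0], [-1.0, 0.0]]
weights = [0.5, 0.5]
candidates = [
    ([0.0, 0.0], [0.0, 0.0]),  # identical trajectories: the answer tells us nothing
    ([1.0, 0.0], [0.0, 0.0]),  # the two hypotheses disagree on this pair
]
query = select_query(particles, weights, candidates)
posterior = update_posterior(particles, weights, query[0], query[1], a_preferred=True)
```

In this toy setup the selector prefers the pair on which the two hypotheses disagree, and a single answer shifts posterior mass toward the hypothesis consistent with it. The abstract additionally describes conditioning on natural language feedback and favoring easy-to-answer queries; neither is modeled here.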

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Speech and dialogue systems