Efficient Model-agnostic Alignment via Bayesian Persuasion

Fengshuo Bai; Mingzhi Wang; Zhaowei Zhang; Boyuan Chen; Yinda Xu; Ying Wen; Yaodong Yang

arXiv:2405.18718·cs.CL·May 30, 2024

Open Access

TL;DR

This paper introduces a lightweight, model-agnostic Bayesian Persuasion framework for aligning large language models efficiently using smaller models, reducing computational costs while improving performance across tasks.

Contribution

The paper formalizes alignment as the optimization of a small model's signaling strategy, and supports the persuasion approach with both theoretical analysis and empirical gains in LLM performance.

Findings

01

Acting as the small Advisor model, GPT-2 significantly improves the performance of larger Receiver models on reasoning and code tasks.

02

The framework achieves an average 16.1% boost in mathematical reasoning.

03

Theoretical analysis confirms the effectiveness of the signaling strategy.

Abstract

With recent advancements in large language models (LLMs), alignment has emerged as an effective technique for keeping LLMs consistent with human intent. Current methods primarily involve direct training through Supervised Fine-tuning (SFT) or Reinforcement Learning from Human Feedback (RLHF), both of which require substantial computational resources and extensive ground-truth data. This paper explores an efficient method for aligning black-box large models using smaller models, introducing a model-agnostic and lightweight Bayesian Persuasion Alignment framework. We formalize this problem as an optimization of the signaling strategy from the small model's perspective. In the persuasion process, the small model (Advisor) observes the information item (i.e., state) and persuades large models (Receiver) to elicit improved responses. The Receiver then generates a response based on the input,…
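
The persuasion loop described in the abstract (the Advisor observes a state, sends a signal, and the Receiver updates its belief before acting) can be illustrated with a textbook two-state Bayesian persuasion toy. This is a minimal sketch under stated assumptions, not the paper's implementation: the state names, signals, actions, utilities, prior, and every function name below are hypothetical and chosen only to show how a signaling strategy changes what the Receiver does.

```python
from fractions import Fraction

# Hypothetical toy setup (not from the paper): an information item is either
# "relevant" or "irrelevant"; the Receiver either uses or ignores it.
PRIOR = {"relevant": Fraction(3, 10), "irrelevant": Fraction(7, 10)}
ACTIONS = ["use", "ignore"]  # ties are broken in favor of the first action

# Receiver is rewarded for matching its action to the true state.
RECEIVER_UTILITY = {
    ("relevant", "use"): 1, ("relevant", "ignore"): 0,
    ("irrelevant", "use"): 0, ("irrelevant", "ignore"): 1,
}
# Advisor (assumed here) simply wants the item to be used.
ADVISOR_UTILITY = {(s, a): int(a == "use") for s in PRIOR for a in ACTIONS}

# Signaling strategy: P(signal | state), committed to by the Advisor.
SCHEME = {
    "relevant": {"recommend": Fraction(1)},
    "irrelevant": {"recommend": Fraction(3, 7), "skip": Fraction(4, 7)},
}


def posterior(prior, scheme, signal):
    """Receiver's Bayesian belief over states after observing `signal`."""
    joint = {s: prior[s] * scheme[s].get(signal, 0) for s in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError(f"signal {signal!r} has zero probability")
    return {s: p / total for s, p in joint.items()}


def best_response(belief, receiver_utility, actions):
    """Action maximizing the Receiver's expected utility under `belief`."""
    def expected(action):
        return sum(belief[s] * receiver_utility[(s, action)] for s in belief)
    return max(actions, key=expected)


def advisor_value(prior, scheme, receiver_utility, advisor_utility, actions):
    """Advisor's expected utility when the Receiver best-responds to each signal."""
    signals = {sig for dist in scheme.values() for sig in dist}
    value = Fraction(0)
    for sig in signals:
        if sum(prior[s] * scheme[s].get(sig, 0) for s in prior) == 0:
            continue  # signal is never sent
        action = best_response(posterior(prior, scheme, sig), receiver_utility, actions)
        value += sum(
            prior[s] * scheme[s].get(sig, 0) * advisor_utility[(s, action)]
            for s in prior
        )
    return value


if __name__ == "__main__":
    print(posterior(PRIOR, SCHEME, "recommend"))
    print(advisor_value(PRIOR, SCHEME, RECEIVER_UTILITY, ADVISOR_UTILITY, ACTIONS))
```

In this toy, a Receiver acting on the prior alone would always ignore the item, while the partially informative scheme makes "recommend" just credible enough for the Receiver to act on it. Optimizing the entries of `SCHEME` is the kind of signaling-strategy optimization the abstract refers to, although the paper's Advisor is a language model rather than a lookup table.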

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification

Methods: Attention Is All You Need · Cosine Annealing · Layer Normalization · Weight Decay · Attention Dropout · Linear Layer · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Adam