RLAE: Reinforcement Learning-Assisted Ensemble for LLMs

Yuqian Fu; Yuanheng Zhu; Jiajun Chai; Guojun Yin; Wei Lin; Qichao Zhang; Dongbin Zhao

arXiv:2506.00439 · cs.LG · June 3, 2025

TL;DR

This paper introduces RLAE, a reinforcement learning-based framework that dynamically adjusts the ensemble weights of LLMs based on the input context and intermediate generation states, improving performance and generalization over fixed-weight methods.

Contribution

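RLAE reformulates LLM ensembling as a Markov Decision Process (MDP) and uses RL agents to adapt the ensemble weights to the input context and intermediate generation states, with rewards tied to the quality of the final output. The authors present this as a novel approach to LLM ensembling.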
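To make the role of the ensemble weights concrete, here is a minimal sketch of one way a set of weights could combine the next-token distributions of several models. The source does not spell out the combination rule, so the weighted average used here, the function name `mix_distributions`, and the toy distributions are all assumptions for illustration, not the authors' implementation.

```python
from typing import Dict, List


def mix_distributions(
    dists: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Weighted average of per-model next-token distributions.

    Assumption: each dict maps a token to its probability under one model,
    and the weights are non-negative. Both names are hypothetical.
    """
    if len(dists) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    mixed: Dict[str, float] = {}
    for dist, weight in zip(dists, weights):
        share = weight / total  # normalize so the shares sum to 1
        for token, prob in dist.items():
            mixed[token] = mixed.get(token, 0.0) + share * prob
    return mixed


# Toy example: two models that disagree on the next token.
model_a = {"yes": 0.8, "no": 0.2}
model_b = {"yes": 0.3, "no": 0.7}
equal = mix_distributions([model_a, model_b], [0.5, 0.5])
only_a = mix_distributions([model_a, model_b], [1.0, 0.0])
```

A fixed-weight ensemble would call this with the same weights at every step; in the adaptive setting described above, the weights would instead come from a learned agent and could change with the context.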

Findings

01. RLAE improves accuracy by up to 3.3% over traditional methods.

02. The framework generalizes well across diverse tasks without retraining.

03. RLAE achieves lower latency compared to fixed-weight ensemble methods.

Abstract

Ensembling large language models (LLMs) can effectively combine diverse strengths of different models, offering a promising approach to enhance performance across various tasks. However, existing methods typically rely on fixed weighting strategies that fail to adapt to the dynamic, context-dependent characteristics of LLM capabilities. In this work, we propose Reinforcement Learning-Assisted Ensemble for LLMs (RLAE), a novel framework that reformulates LLM ensemble through the lens of a Markov Decision Process (MDP). Our approach introduces an RL agent that dynamically adjusts ensemble weights by considering both input context and intermediate generation states, with the agent being trained using rewards that directly correspond to the quality of final outputs. We implement RLAE using both single-agent and multi-agent reinforcement learning algorithms ($\text{RLAE}_{\text{PPO}}$ and…
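The MDP view in the abstract can be sketched as a single episode: the state is the prompt plus the text generated so far, the action is a vector of ensemble weights, and the reward arrives only once the final output is available. The skeleton below is a hedged illustration under those assumptions. The names `State`, `rollout`, `policy`, `step`, and `reward_fn` are hypothetical, and the toy callables stand in for a trained agent and real LLMs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass
class State:
    """Input context plus the intermediate generation so far."""
    prompt: str
    generated: List[str] = field(default_factory=list)


def rollout(
    prompt: str,
    policy: Callable[[State], Sequence[float]],      # state -> ensemble weights
    step: Callable[[State, Sequence[float]], str],   # weighted ensemble -> next token
    reward_fn: Callable[[State], float],             # scores the final output only
    max_steps: int,
    eos: str = "<eos>",
) -> Tuple[List[Tuple[List[str], List[float]]], float]:
    """Run one episode and return (trajectory, terminal reward)."""
    state = State(prompt)
    trajectory: List[Tuple[List[str], List[float]]] = []
    for _ in range(max_steps):
        weights = list(policy(state))                 # action chosen for this state
        token = step(state, weights)                  # environment transition
        trajectory.append((list(state.generated), weights))
        state.generated.append(token)
        if token == eos:
            break
    return trajectory, reward_fn(state)


# Toy stand-ins (assumptions): a constant policy and a scripted "ensemble".
script = ["4", "<eos>"]
trajectory, reward = rollout(
    prompt="2 + 2 =",
    policy=lambda s: [0.5, 0.5],
    step=lambda s, w: script[len(s.generated)],
    reward_fn=lambda s: 1.0 if "4" in s.generated else 0.0,
    max_steps=5,
)
```

An RL algorithm would consume the `(state, action)` pairs in `trajectory` together with the terminal reward to update the policy; that update step is omitted here because the source does not describe it in this excerpt.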

Taxonomy

Topics: Digital Rights Management and Security · Semantic Web and Ontologies · Multi-Agent Systems and Negotiation

Methods: Sparse Evolutionary Training