Multiple Weaks Win Single Strong: Large Language Models Ensemble Weak Reinforcement Learning Agents into a Supreme One

Yiwen Song; Qianyue Hao; Qingmin Liao; Jian Yuan; Yong Li

arXiv:2505.15306·cs.LG·May 22, 2025

TL;DR

This paper introduces LLM-Ens, a task-specific semantic ensemble method for reinforcement learning that dynamically selects the best agent based on situation understanding, significantly improving performance on Atari benchmarks.

Contribution

We propose LLM-Ens, a novel semantic ensemble approach using large language models to adaptively select agents in RL tasks based on situation understanding.
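The idea of selecting an agent per situation can be illustrated with a minimal sketch. This page does not spell out how LLM-Ens identifies situations or measures each agent's strengths, so everything below is an assumption made for illustration: `SituationEnsemble`, `classify_situation`, `record`, and `best_agent` are hypothetical names, the classifier is an ordinary function standing in for the LLM, and ranking agents by average recorded reward is only one plausible choice.

```python
from collections import defaultdict


class SituationEnsemble:
    """Route each state to the agent with the best record in that situation.

    Hypothetical sketch: not the authors' implementation.
    """

    def __init__(self, agents, classify_situation):
        # agents: dict mapping agent name -> policy callable (state -> action)
        self.agents = agents
        # classify_situation: callable (state -> situation label);
        # a stand-in for the LLM-driven situation understanding
        self.classify = classify_situation
        # rewards[situation][agent_name] -> list of observed rewards
        self.rewards = defaultdict(lambda: defaultdict(list))

    def record(self, situation, agent_name, reward):
        """Store one observed reward for an agent in a given situation."""
        self.rewards[situation][agent_name].append(reward)

    def best_agent(self, situation):
        """Return the agent name with the highest mean reward in this situation."""
        stats = self.rewards.get(situation)
        if not stats:
            # No data for this situation: fall back to the first agent.
            return next(iter(self.agents))
        return max(stats, key=lambda name: sum(stats[name]) / len(stats[name]))

    def act(self, state):
        """Classify the state, pick the best agent for it, and return its action."""
        situation = self.classify(state)
        return self.agents[self.best_agent(situation)](state)


# Toy usage: two constant policies and a rule-based classifier.
agents = {"a": lambda s: 0, "b": lambda s: 1}
ens = SituationEnsemble(agents, lambda s: "danger" if s < 0 else "safe")
ens.record("danger", "a", 1.0)
ens.record("danger", "b", 5.0)
ens.record("safe", "a", 3.0)
ens.record("safe", "b", 2.0)
```

In this toy setup, states classified as "danger" are handled by agent `b` and states classified as "safe" by agent `a`, so the active agent changes as the situation changes rather than being fixed for the whole episode.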

Findings

01. LLM-Ens surpasses baseline ensemble methods by up to 20.9% on Atari.

02. Dynamic situation-based agent selection improves RL performance.

03. The method is compatible with various RL algorithms and hyperparameters.

Abstract

Model ensemble is a useful approach in reinforcement learning (RL) for training effective agents. Despite the wide success of RL, training effective agents remains difficult because many factors require careful tuning, such as algorithm selection, hyperparameter settings, and even random seed choices, all of which can significantly influence an agent's performance. Model ensemble helps overcome this challenge by combining multiple weak agents into a single, more powerful one. However, existing ensemble methods, such as majority voting and Boltzmann addition, are designed as fixed strategies and lack a semantic understanding of specific tasks, which limits their adaptability and effectiveness. To address this, we propose LLM-Ens, a novel approach that enhances RL model ensemble with task-specific semantic understanding driven by large language models…
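For context, the two fixed baselines named in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's evaluation code: the function names and the `temperature` parameter are hypothetical, agents are assumed to expose per-action Q-values, and Boltzmann addition is taken here in its common form of summing each agent's softmax action distribution and acting greedily on the sum.

```python
import math
from collections import Counter


def majority_vote(actions):
    """Return the action proposed by the most agents.

    Ties go to the action that appears first in `actions`.
    """
    return Counter(actions).most_common(1)[0][0]


def softmax(q_values, temperature=1.0):
    """Boltzmann distribution over actions for one agent's Q-values."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_addition(q_tables, temperature=1.0):
    """Sum the agents' Boltzmann distributions and pick the top action.

    q_tables: one list of Q-values per agent, all of the same length.
    """
    n_actions = len(q_tables[0])
    summed = [0.0] * n_actions
    for q_values in q_tables:
        for action, prob in enumerate(softmax(q_values, temperature)):
            summed[action] += prob
    return max(range(n_actions), key=lambda a: summed[a])


# Toy usage: four agents voting, and three agents with Q-values over 3 actions.
voted = majority_vote([2, 0, 2, 1])
combined = boltzmann_addition([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 1.5, 0.0]])
```

Both rules combine the agents' outputs in the same way at every step and never look at what is happening in the game, which is the sense in which the abstract calls them fixed strategies.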

Taxonomy

Topics: Reinforcement Learning in Robotics