Extending NGU to Multi-Agent RL: A Preliminary Study

Juan Hernandez; Diego Fernández; Manuel Cifuentes; Denis Parra; Rodrigo Toro Icarte

arXiv:2512.01321·cs.AI·December 2, 2025

TL;DR

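This paper extends the Never Give Up (NGU) reinforcement learning algorithm to multi-agent environments. In the simple_tag environment from PettingZoo, combining NGU's intrinsic exploration with a shared replay buffer gives the best performance and stability, with moderately higher returns than a multi-agent DQN baseline.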

Contribution

The paper introduces a multi-agent extension of NGU and evaluates three design choices: a shared replay buffer versus individual buffers, sharing episodic novelty among agents, and heterogeneous values of the beta parameter. It reports which of these configurations work best.
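The difference between a shared replay buffer and individual buffers can be illustrated with a minimal sketch. This is not the authors' implementation: the `ReplayBuffer` class, the `make_buffers` helper, the transition layout, and the agent ids are all hypothetical, and the sketch assumes uniform sampling and that every agent participates in the sharing.

```python
import random
from collections import deque
from typing import Deque, Dict, List, Tuple

# Hypothetical transition layout: (obs, action, reward, next_obs, done).
Transition = Tuple[List[float], int, float, List[float], bool]


class ReplayBuffer:
    """Fixed-capacity FIFO buffer with uniform sampling (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) silently drops the oldest item once full.
        self._storage: Deque[Transition] = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Never ask for more items than are stored.
        n = min(batch_size, len(self._storage))
        return random.sample(list(self._storage), n)

    def __len__(self) -> int:
        return len(self._storage)


def make_buffers(agent_ids: List[str], capacity: int, shared: bool) -> Dict[str, ReplayBuffer]:
    """Map each agent id to a buffer.

    shared=True: every agent points at the same buffer object, so each
    agent can sample transitions collected by the others.
    shared=False: every agent gets its own independent buffer.
    """
    if shared:
        common = ReplayBuffer(capacity)
        return {agent_id: common for agent_id in agent_ids}
    return {agent_id: ReplayBuffer(capacity) for agent_id in agent_ids}
```

With `shared=True`, a transition added by one agent is immediately visible to the others; with `shared=False`, each agent learns only from its own experience.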

Findings

1. NGU with a shared replay buffer yields the best performance and stability.
2. Sharing episodic novelty among agents performs comparably at k = 1 but degrades at higher k.
3. Heterogeneous beta values do not outperform a small common value.

Abstract

The Never Give Up (NGU) algorithm has proven effective in reinforcement learning tasks with sparse rewards by combining episodic novelty and intrinsic motivation. In this work, we extend NGU to multi-agent environments and evaluate its performance in the simple_tag environment from the PettingZoo suite. Compared to a multi-agent DQN baseline, NGU achieves moderately higher returns and more stable learning dynamics. We investigate three design choices: (1) shared replay buffer versus individual replay buffers, (2) sharing episodic novelty among agents using different k thresholds, and (3) using heterogeneous values of the beta parameter. Our results show that NGU with a shared replay buffer yields the best performance and stability, highlighting that the gains come from combining NGU intrinsic exploration with experience sharing. Novelty sharing performs comparably when k = 1 but…
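For readers unfamiliar with how NGU combines episodic novelty with the task reward, the following is a simplified sketch of the single-agent form, in which the training reward is the extrinsic reward plus beta times an intrinsic reward. Several things here are assumptions and not taken from this paper: the function names are hypothetical, the life-long novelty modulator of the original NGU is omitted, and the distance normalizer is the mean over the current neighbors instead of a running average. The `num_neighbors` parameter is the nearest-neighbor count of the episodic memory lookup; it is not necessarily the k threshold used for novelty sharing in this study, which the abstract does not define.

```python
import math
from typing import List, Sequence


def episodic_novelty(
    embedding: Sequence[float],
    memory: List[Sequence[float]],
    num_neighbors: int = 10,
    eps: float = 1e-3,
    c: float = 1e-3,
) -> float:
    """Simplified NGU-style episodic bonus: high for states far from memory.

    Computes 1 / sqrt(sum of kernel similarities to the nearest stored
    embeddings + c), using an inverse kernel eps / (d^2 / norm + eps).
    """
    # Squared Euclidean distances to the nearest stored embeddings.
    sq_dists = sorted(
        sum((a - b) ** 2 for a, b in zip(embedding, stored)) for stored in memory
    )[:num_neighbors]
    if not sq_dists:
        # Empty memory: nothing is similar, so the bonus is maximal.
        return 1.0 / math.sqrt(c)
    # Assumption: normalize by the mean of the current neighbor distances
    # (the original NGU keeps a running average across steps).
    mean_sq = sum(sq_dists) / len(sq_dists)
    norm = mean_sq if mean_sq > 0 else 1.0
    similarity = sum(eps / (d / norm + eps) for d in sq_dists)
    return 1.0 / math.sqrt(similarity + c)


def mixed_reward(extrinsic: float, intrinsic: float, beta: float) -> float:
    """Training reward: extrinsic + beta * intrinsic (beta = 0 disables exploration)."""
    return extrinsic + beta * intrinsic
```

Under this form, a heterogeneous-beta setup would give each agent its own `beta`, while a common-beta setup passes the same small value to every agent.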

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Neural Networks and Reservoir Computing