Swarm Behavior Cloning

Jonas Nüßlein; Maximilian Zorn; Philipp Altmann; Claudia Linnhoff-Popien

arXiv:2412.07617·cs.AI·December 11, 2024

Open Access

TL;DR

This paper introduces a method to improve ensemble behavior cloning by reducing action differences among policies, leading to enhanced robustness and performance across various environments.

Contribution

It proposes a novel approach to align ensemble policies in behavior cloning, addressing action discrepancy issues and maintaining diversity for better decision-making.

Findings

01 Reduced action differences among ensemble policies.

02 Improved mean episode returns across eight environments.

03 Enhanced robustness and decision-making diversity.

Abstract

In sequential decision-making environments, the primary approaches for training agents are Reinforcement Learning (RL) and Imitation Learning (IL). Unlike RL, which relies on modeling a reward function, IL leverages expert demonstrations, where an expert policy $\pi_{e}$ (e.g., a human) provides the desired behavior. Formally, a dataset $D$ of state-action pairs is provided: $D = \{(s, a = \pi_{e}(s))\}$. A common technique within IL is Behavior Cloning (BC), where a policy $\pi(s) = a$ is learned through supervised learning on $D$. Further improvements can be achieved by using an ensemble of $N$ individually trained BC policies, denoted as $E = \{\pi_{i}(s)\}_{1 \leq i \leq N}$. The ensemble's action $a$ for a given state $s$ is the aggregated output of the $N$ actions: $a = \frac{1}{N} \sum_{i} \pi_{i}(s)$. This paper addresses the issue of increasing action differences -- the observation that…
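The ensemble setup described in the abstract can be illustrated with a minimal sketch. It covers only plain ensemble Behavior Cloning with mean aggregation, not the paper's proposed method. Everything beyond the formulas above is an assumption made for illustration: the synthetic `tanh` expert, the linear least-squares policies, the bootstrap resampling used to make the policies differ, and the helper `mean_action_difference`, which is a hypothetical pairwise-distance measure rather than the paper's definition.

```python
import numpy as np


def make_dataset(expert, n_samples, state_dim, rng):
    """Build D = {(s, a = pi_e(s))} by querying a (synthetic) expert policy."""
    states = rng.normal(size=(n_samples, state_dim))
    actions = np.stack([expert(s) for s in states])
    return states, actions


def fit_linear_policy(states, actions):
    """Behavior Cloning as supervised learning: least-squares fit of a linear policy.

    Assumption: a linear model stands in for whatever function approximator is used.
    """
    weights, *_ = np.linalg.lstsq(states, actions, rcond=None)
    return lambda s: s @ weights


def train_ensemble(states, actions, n_policies, rng):
    """Train N policies individually, each on a bootstrap resample of D (assumption)."""
    policies = []
    for _ in range(n_policies):
        idx = rng.integers(0, len(states), size=len(states))
        policies.append(fit_linear_policy(states[idx], actions[idx]))
    return policies


def ensemble_action(policies, state):
    """Aggregate the N actions: a = (1/N) * sum_i pi_i(s)."""
    return np.mean([pi(state) for pi in policies], axis=0)


def mean_action_difference(policies, state):
    """Hypothetical measure: mean pairwise Euclidean distance between the N actions."""
    acts = np.stack([pi(state) for pi in policies])
    n = len(acts)
    dists = [np.linalg.norm(acts[i] - acts[j]) for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mixing = rng.normal(size=(3, 2))            # synthetic expert parameters
    expert = lambda s: np.tanh(s @ mixing)      # stand-in for pi_e
    states, actions = make_dataset(expert, 200, 3, rng)
    policies = train_ensemble(states, actions, n_policies=5, rng=rng)
    query = np.ones(3)
    print("ensemble action:", ensemble_action(policies, query))
    print("mean action difference:", mean_action_difference(policies, query))
```

Averaging is the aggregation rule stated in the abstract; the action-difference helper is included only to make the notion of disagreement among ensemble members concrete for a single state.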

Taxonomy

Topics: Advanced Chemical Sensor Technologies