The Sandbox Environment for Generalizable Agent Research (SEGAR)
R Devon Hjelm, Bogdan Mazoure, Florian Golemo, Samira Ebrahimi Kahou, Pedro Braga, Felipe Frujeri, Mihai Jalobeanu, Andrey Kolobov

TL;DR
SEGAR is a customizable, easy-to-use benchmark environment designed to facilitate research on generalization in reinforcement learning by allowing precise control over task distributions and measuring generalization performance.
Contribution
The paper introduces SEGAR, a flexible environment that improves the design, implementation, and evaluation of generalization benchmarks in RL research.
Findings
SEGAR enables easy specification of task distributions.
SEGAR facilitates measuring generalization performance.
Experiments demonstrate SEGAR's utility in various research questions.
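The first two findings can be illustrated with a minimal sketch. It does not use SEGAR's actual API: the `friction` factor, the `sample_tasks` and `toy_return` helpers, and the chosen ranges are all hypothetical, and the "agent" is a stand-in function rather than a trained policy. The sketch only shows the general idea of defining train and test task distributions over an underlying factor and then reporting the difference in average return between them.

```python
import random
from statistics import mean


def sample_tasks(low, high, n, seed):
    """Draw n values of a single hypothetical task factor ('friction')
    uniformly from [low, high]. The range defines the task distribution."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n)]


def toy_return(friction):
    """Stand-in for an agent's episode return (assumption for illustration).
    A real study would roll out a learned policy in the environment."""
    return 1.0 - abs(friction - 0.3)


def generalization_gap(train_tasks, test_tasks, evaluate):
    """Mean return on training tasks minus mean return on held-out tasks."""
    train_score = mean(evaluate(t) for t in train_tasks)
    test_score = mean(evaluate(t) for t in test_tasks)
    return train_score - test_score


# Train and test distributions differ only in the range of the factor.
train_tasks = sample_tasks(0.2, 0.4, 100, seed=0)
test_tasks = sample_tasks(0.6, 0.8, 100, seed=1)

gap = generalization_gap(train_tasks, test_tasks, toy_return)
print(f"generalization gap: {gap:.3f}")
```

Because the two distributions are specified explicitly, the size of the shift between them is known, which is what makes the resulting gap interpretable.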
Abstract
A broad challenge of research on generalization for sequential decision-making tasks in interactive environments is designing benchmarks that clearly landmark progress. While there has been notable headway, current benchmarks either do not provide suitable exposure to or intuitive control of the underlying factors, are not easy to implement, customize, or extend, or are computationally expensive to run. We built the Sandbox Environment for Generalizable Agent Research (SEGAR) with all of these things in mind. SEGAR improves the ease and accountability of generalization research in RL, as generalization objectives can be easily designed by specifying task distributions, which in turn allows the researcher to measure the nature of the generalization objective. We present an overview of SEGAR and how it contributes to these goals, as well as experiments that demonstrate a few types of…
Taxonomy
Topics: Data Visualization and Analytics · Multi-Agent Systems and Negotiation · Reinforcement Learning in Robotics
