Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators

Alexander Herzog; Kanishka Rao; Karol Hausman; Yao Lu; Paul Wohlhart; Mengyuan Yan; Jessica Lin; Montserrat Gonzalez Arenas; Ted Xiao; Daniel Kappler; Daniel Ho; Jarek Rettinghouse; Yevgen Chebotar; Kuang-Huei Lee; Keerthana Gopalakrishnan; Ryan Julian; Adrian Li; Chuyuan Kelly Fu; Bob Wei; Sangeetha Ramesh; Khem Holden; Kim Kleiven; David Rendleman; Sean Kirmani; Jeff Bingham; Jon Weisz; Ying Xu; Wenlong Lu; Matthew Bennice; Cody Fong; David Do; Jessica Lam; Yunfei Bai; Benjie Holson; Michael Quinlan; Noah Brown; Mrinal Kalakrishnan; Julian Ibarz; Peter Pastor; Sergey Levine

arXiv:2305.03270·cs.RO·May 8, 2023·1 cites

Open Access

TL;DR

This paper presents a large-scale deep reinforcement learning system for robotic waste sorting in office buildings, leveraging real-world data, simulation bootstrapping, and auxiliary vision inputs to achieve broad generalization and effective deployment across multiple robots.

Contribution

The paper introduces a scalable system that combines real-world and simulated training with auxiliary vision inputs, validated through experiments spanning 24 months and a fleet of 23 robots.

Findings

01 Successful deployment of deep RL for waste sorting in real office environments.

02 Large-scale empirical validation with 9527 hours of data and 4800 evaluation trials.

03 Demonstrated generalization to novel objects and scalability with more data.

Abstract

We describe a system for deep reinforcement learning of robotic manipulation skills applied to a large-scale real-world task: sorting recyclables and trash in office buildings. Real-world deployment of deep RL policies requires not only effective training algorithms, but the ability to bootstrap real-world training and enable broad generalization. To this end, our system combines scalable deep RL from real-world data with bootstrapping from training in simulation, and incorporates auxiliary inputs from existing computer vision systems as a way to boost generalization to novel objects, while retaining the benefits of end-to-end training. We analyze the tradeoffs of different design decisions in our system, and present a large-scale empirical validation that includes training on real-world data gathered over the course of 24 months of experimentation, across a fleet of 23 robots in three…

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics