Reward Shaping with Subgoals for Social Navigation

Takato Okudo; Seiji Yamada

arXiv:2104.06410·cs.RO·April 15, 2021

Open Access

TL;DR

This paper introduces a reward shaping method using subgoals to accelerate reinforcement learning in social navigation tasks, enabling robots to learn efficient, collision-free navigation behaviors faster in environments with unpredictable humans.

Contribution

The paper proposes a novel reward shaping approach with subgoals that enhances learning efficiency in social navigation reinforcement learning tasks.

Findings

01. Improved learning speed over baseline algorithms.

02. Effective collision avoidance in social navigation scenarios.

03. Faster policy acquisition in environments with humans.

Abstract

Social navigation has been gaining attention with the growth in machine intelligence. Since reinforcement learning can select an action in the prediction phase at a low computational cost, it has been applied to social navigation tasks. However, reinforcement learning takes an enormous number of iterations to acquire a behavior policy in the learning phase. This negatively affects the learning of robot behaviors in the real world. In particular, social navigation involves humans, who are unpredictable moving obstacles in the environment. We propose a reward shaping method with subgoals to accelerate learning. The main part is an aggregation method that uses subgoals to shape a reinforcement learning algorithm. We performed a learning experiment with a social navigation task in which a robot avoided collisions and then reached its goal. The experimental results show that our…
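To make the idea of shaping a reward with subgoals concrete, here is a minimal sketch of generic potential-based reward shaping, in which a potential rises each time the agent reaches the next subgoal in an ordered list. It is an illustration under stated assumptions, not the authors' aggregation method, which the abstract excerpt does not specify. The names `SubgoalPotential` and `shaped_reward`, the per-subgoal increment `eta`, the discount `gamma`, and the grid-cell states are all hypothetical.

```python
from dataclasses import dataclass
from typing import Hashable, Sequence


@dataclass
class SubgoalPotential:
    """Hypothetical potential: eta times the number of subgoals reached so far."""

    subgoals: Sequence[Hashable]  # ordered subgoal states (assumed to be given)
    eta: float = 1.0              # potential gained per achieved subgoal (assumed)
    achieved: int = 0             # how many subgoals have been reached this episode

    def reset(self) -> None:
        """Call at the start of every episode."""
        self.achieved = 0

    def value(self) -> float:
        """Current potential."""
        return self.eta * self.achieved

    def update(self, state: Hashable) -> None:
        """Advance the counter if `state` is the next pending subgoal."""
        if self.achieved < len(self.subgoals) and state == self.subgoals[self.achieved]:
            self.achieved += 1


def shaped_reward(reward: float, potential: SubgoalPotential,
                  next_state: Hashable, gamma: float = 0.99) -> float:
    """Return reward + gamma * Phi(after) - Phi(before) for one transition."""
    phi_before = potential.value()
    potential.update(next_state)
    phi_after = potential.value()
    return reward + gamma * phi_after - phi_before


if __name__ == "__main__":
    # Hypothetical grid-world states; the environment reward is 0 on every step.
    potential = SubgoalPotential(subgoals=[(1, 1), (2, 2)], eta=1.0)
    for state in [(0, 1), (1, 1), (1, 2), (2, 2)]:
        print(state, shaped_reward(0.0, potential, state, gamma=0.9))
```

In this sketch the agent receives a positive bonus on the step that reaches a pending subgoal, and a small negative term on later steps when `gamma` is below 1. That is the usual behaviour of the potential-difference form, and the shaped value would replace the raw reward in whatever learning update is used.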

Taxonomy

Topics: Evacuation and Crowd Dynamics · Reinforcement Learning in Robotics · Social Robot Interaction and HRI