Learning Reward Models for Cooperative Trajectory Planning with Inverse Reinforcement Learning and Monte Carlo Tree Search

Karl Kurzer; Matthias Bitzer; J. Marius Zöllner

arXiv:2202.06443 · cs.LG · November 15, 2022

TL;DR

This paper introduces a method that combines inverse reinforcement learning with Monte Carlo Tree Search to learn reward models for cooperative trajectory planning, helping automated vehicles behave in a more human-like way in traffic scenarios.

Contribution

It presents a novel approach that integrates feature-based maximum entropy inverse reinforcement learning with Monte Carlo Tree Search for reward modeling in cooperative planning.

Findings

01 Successfully recovers reward models that mimic expert trajectories.
02 Performs comparably to manually tuned baseline reward models.
03 Enhances predictability of automated vehicle behavior in traffic.

Abstract

Cooperative trajectory planning methods for automated vehicles can solve traffic scenarios that require a high degree of cooperation between traffic participants. However, for cooperative systems to integrate into human-centered traffic, the automated systems must behave human-like so that humans can anticipate the system's decisions. While Reinforcement Learning has made remarkable progress in solving the decision-making part, it is non-trivial to parameterize a reward model that yields predictable actions. This work employs feature-based Maximum Entropy Inverse Reinforcement Learning combined with Monte Carlo Tree Search to learn reward models that maximize the likelihood of recorded multi-agent cooperative expert trajectories. The evaluation demonstrates that the approach can recover a reasonable reward model that mimics the expert and performs similarly to a manually tuned baseline…
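
The core update of feature-based Maximum Entropy Inverse Reinforcement Learning can be illustrated with a minimal sketch. It assumes a linear reward (weights times trajectory features) and a fixed, hand-written set of sampled trajectories; in the paper the samples come from Monte Carlo Tree Search, which is not reproduced here. All function names, feature values, and hyperparameters below are hypothetical and are not taken from the authors' repository.

```python
import math


def reward(theta, features):
    """Linear reward: dot product of weights and trajectory features (assumed form)."""
    return sum(t * f for t, f in zip(theta, features))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def maxent_gradient(theta, expert_features, sampled_features):
    """Log-likelihood gradient: expert features minus expected sampled features.

    The expectation uses the maximum entropy distribution p(tau) ~ exp(reward(tau)),
    approximated over the given samples only.
    """
    probs = softmax([reward(theta, f) for f in sampled_features])
    expected = [
        sum(p * f[i] for p, f in zip(probs, sampled_features))
        for i in range(len(theta))
    ]
    return [e - x for e, x in zip(expert_features, expected)]


def fit(expert_features, sampled_features, lr=0.5, steps=200):
    """Plain gradient ascent on the weights, starting from zero."""
    theta = [0.0] * len(expert_features)
    for _ in range(steps):
        grad = maxent_gradient(theta, expert_features, sampled_features)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical two-dimensional feature vectors for three sampled trajectories
# and the mean feature vector of the expert demonstrations.
sampled = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
expert = [0.8, 0.2]

theta = fit(expert, sampled)
residual = maxent_gradient(theta, expert, sampled)
print("weights:", theta)
print("remaining gradient:", residual)
```

When the remaining gradient approaches zero, the expected features under the learned reward match the expert's features, which is the feature-matching condition that maximizes the likelihood of the demonstrations in this simplified setting. A fuller version would presumably resample trajectories under the current reward at each iteration rather than reuse a fixed set.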

Code & Models

Repositories

ProSeCo-Planning/ros_proseco_planning/tree/main/python/proseco/inverse_reinforcement_learning (Official)

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Traffic control and management · Human-Automation Interaction and Safety