Robust RL with LLM-Driven Data Synthesis and Policy Adaptation for Autonomous Driving

Sihao Wu; Jiaxu Liu; Xiangyu Yin; Guangliang Cheng; Xingyu Zhao; Meng Fang; Xinping Yi; Xiaowei Huang

arXiv:2410.12568 · cs.RO · October 22, 2024

Open Access

TL;DR

This paper presents RAPID, a framework that leverages Large Language Models to synthesize data and distill robust, efficient RL policies for autonomous driving, improving adaptability and performance in real-time environments.

Contribution

Introducing RAPID, a novel method that combines LLM-driven data synthesis with RL distillation and online adaptation for robust autonomous driving policies.

Findings

1. RAPID achieves faster inference by distilling LLM knowledge into RL policies.
2. The framework enhances robustness and adaptability in autonomous driving tasks.
3. Extensive experiments validate RAPID's effectiveness in real-world scenarios.

Abstract

The integration of Large Language Models (LLMs) into autonomous driving systems demonstrates strong common sense and reasoning abilities, effectively addressing the pitfalls of purely data-driven methods. Current LLM-based agents require lengthy inference times and face challenges in interacting with real-time autonomous driving environments. A key open question is whether we can effectively leverage the knowledge from LLMs to train an efficient and robust Reinforcement Learning (RL) agent. This paper introduces RAPID, a novel Robust Adaptive Policy Infusion and Distillation framework, which trains specialized mix-of-policy RL agents using data synthesized by an LLM-based driving agent and online adaptation. RAPID features three key designs: 1) utilization of offline data collected…

Taxonomy

Topics: Simulation Techniques and Applications