CLIP-RLDrive: Human-Aligned Autonomous Driving via CLIP-Based Reward Shaping in Reinforcement Learning

Erfan Doroudian; Hamid Taghavifar

arXiv:2412.16201·cs.RO·December 24, 2024

TL;DR

This paper introduces CLIP-RLDrive, a reinforcement learning framework for autonomous driving that uses CLIP-based reward shaping to align vehicle decisions with human preferences in complex urban scenarios.

Contribution

It proposes a reward shaping method that uses CLIP to improve RL decision-making in autonomous vehicles, addressing the difficulty of designing reward functions by hand.

Findings

01 Enhanced decision alignment with human preferences

02 Improved performance in complex urban driving scenarios

03 Effective use of CLIP for reward modeling

Abstract

This paper presents CLIP-RLDrive, a new reinforcement learning (RL)-based framework for improving the decision-making of autonomous vehicles (AVs) in complex urban driving scenarios, particularly at unsignalized intersections. To achieve this goal, the decisions of AVs are aligned with human-like preferences through Contrastive Language-Image Pretraining (CLIP)-based reward shaping. One of the primary difficulties in an RL scheme is designing a suitable reward model, which is often hard to do manually because of the complexity of the interactions and the driving scenarios. To deal with this issue, this paper leverages Vision-Language Models (VLMs), particularly CLIP, to build an additional reward model based on visual and textual cues.
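The idea of an additional reward built from visual and textual cues can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the softmax-over-prompts scoring, the temperature, the shaping weight, and the toy embedding vectors are all assumptions made for illustration. In practice the embeddings would come from a CLIP image encoder and text encoder rather than being written by hand.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_style_reward(
    image_emb: Sequence[float],
    prompt_embs: Sequence[Sequence[float]],
    desired_index: int = 0,
    temperature: float = 0.07,  # hypothetical value, chosen for illustration
) -> float:
    """Probability that the frame matches the desired prompt.

    Computes a softmax over scaled cosine similarities between one image
    embedding and several text-prompt embeddings, then returns the
    probability assigned to the prompt at `desired_index`.
    """
    logits = [cosine_similarity(image_emb, p) / temperature for p in prompt_embs]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    return exps[desired_index] / sum(exps)


def shaped_reward(
    env_reward: float,
    image_emb: Sequence[float],
    prompt_embs: Sequence[Sequence[float]],
    weight: float = 0.5,  # hypothetical shaping weight
) -> float:
    """Environment reward plus a weighted CLIP-style alignment bonus."""
    return env_reward + weight * clip_style_reward(image_emb, prompt_embs)


# Toy embeddings standing in for encoder outputs (assumed, not real CLIP vectors).
frame_emb = [0.9, 0.1, 0.0]          # current camera frame
prompt_embs = [
    [1.0, 0.0, 0.0],                 # e.g. a prompt describing desired behavior
    [0.0, 1.0, 0.0],                 # e.g. a prompt describing undesired behavior
]

reward = shaped_reward(env_reward=1.0, image_emb=frame_emb, prompt_embs=prompt_embs)
print(f"shaped reward: {reward:.3f}")
```

Because the bonus is a probability in [0, 1], the shaping term here is bounded by `weight`, which keeps it from overwhelming the environment reward; whether the paper bounds or scales its reward this way is not stated in the abstract.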

Taxonomy

Topics: Transportation and Mobility Innovations · Autonomous Vehicle Technology and Safety

Methods: Contrastive Language-Image Pre-training