Neural-Network-Driven Reward Prediction as a Heuristic: Advancing Q-Learning for Mobile Robot Path Planning

Yiming Ji, Kaijie Yun, Yang Liu, Zongwu Xie, and Hong Liu

arXiv:2412.12650 · cs.RO · December 18, 2024

Open Access

TL;DR

This paper introduces NDR-QL, a Q-learning approach that uses neural network outputs as heuristic information to accelerate convergence and improve path planning efficiency for mobile robots.

Contribution

The paper proposes NDR-QL, a neural-network-based heuristic method that improves Q-learning convergence speed and path quality in mobile robot navigation.

Findings

01. The NDR model achieves up to 5% higher prediction accuracy.

02. NDR-QL speeds up Q-learning convergence by 90%.

03. NDR-QL outperforms previous Q-learning improvements in path quality.

Abstract

Q-learning is a widely used reinforcement learning technique for solving path planning problems. It primarily involves the interaction between an agent and its environment, enabling the agent to learn an optimal strategy that maximizes cumulative rewards. Although many studies have reported the effectiveness of Q-learning, it still faces slow convergence issues in practical applications. To address this issue, we propose the NDR-QL method, which utilizes neural network outputs as heuristic information to accelerate the convergence process of Q-learning. Specifically, we improved the dual-output neural network model by introducing a start-end channel separation mechanism and enhancing the feature fusion process. After training, the proposed NDR model can output a narrowly focused optimal probability distribution, referred to as the guideline, and a broadly distributed suboptimal…
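The general idea of steering Q-learning with heuristic information can be sketched on a toy grid. The visible part of the abstract does not say how NDR-QL feeds the network output into Q-learning, so this sketch assumes one simple option: initializing the Q-table from a per-cell score instead of zeros. The function `toy_heuristic` is a hypothetical stand-in for a trained model's output (it is just a normalized Manhattan-distance score), and the grid layout, rewards, and hyperparameters are illustrative choices, not values from the paper.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def toy_heuristic(size, goal):
    """Hypothetical stand-in for a network's per-cell score: higher near the goal."""
    gr, gc = goal
    max_dist = 2 * (size - 1)
    return [[1.0 - (abs(r - gr) + abs(c - gc)) / max_dist for c in range(size)]
            for r in range(size)]


def step(state, action, size, obstacles):
    """Deterministic grid move; blocked moves leave the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < size and 0 <= c < size) or (r, c) in obstacles:
        return state
    return (r, c)


def init_q(size, obstacles, heuristic=None):
    """Assumption: seed Q(s, a) with the heuristic score of the cell that a leads to."""
    q = {}
    for r in range(size):
        for c in range(size):
            for a, act in enumerate(ACTIONS):
                nxt = step((r, c), act, size, obstacles)
                q[(r, c), a] = heuristic[nxt[0]][nxt[1]] if heuristic else 0.0
    return q


def greedy_action(q, state):
    return max(range(len(ACTIONS)), key=lambda a: q[state, a])


def train(size, start, goal, obstacles, heuristic=None, episodes=300,
          alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    """Standard tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = init_q(size, obstacles, heuristic)
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q, state)
            nxt = step(state, ACTIONS[a], size, obstacles)
            reward = 10.0 if nxt == goal else -0.1
            # The goal is terminal, so it contributes no bootstrapped value.
            best_next = 0.0 if nxt == goal else max(
                q[nxt, b] for b in range(len(ACTIONS)))
            q[state, a] += alpha * (reward + gamma * best_next - q[state, a])
            state = nxt
            if state == goal:
                break
    return q


def greedy_path(q, size, start, goal, obstacles, max_steps=100):
    """Follow the greedy policy from start until the goal or the step limit."""
    path, state = [start], start
    while state != goal and len(path) <= max_steps:
        state = step(state, ACTIONS[greedy_action(q, state)], size, obstacles)
        path.append(state)
    return path
```

Calling `train(...)` once with `heuristic=toy_heuristic(size, goal)` and once with `heuristic=None` on the same grid gives a rough feel for how an informative prior changes early episodes. It is not a reproduction of the paper's experiments or of its reported speed-up.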

Taxonomy

Topics: Robotic Path Planning Algorithms · Robotics and Automated Systems

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Q-Learning