Sample-Efficient, Exploration-Based Policy Optimisation for Routing Problems

Nasrin Sultana; Jeffrey Chan; Tabinda Sarwar; A. K. Qin

arXiv:2205.15656 · cs.LG · June 1, 2022 · 1 cites

Open Access

TL;DR

This paper introduces a sample-efficient, exploration-based reinforcement learning method using entropy maximization and off-policy techniques to improve solution quality and speed in routing problems like TSP and VRP.

Contribution

It proposes a novel entropy-based, off-policy reinforcement learning approach that enhances sample efficiency and generalizes across various routing problems.

Findings

1. Outperforms state-of-the-art methods in solution quality.
2. Reduces computation time significantly.
3. Generalizes to different routing problem sizes.

Abstract

Model-free deep reinforcement learning algorithms have been applied to a range of combinatorial optimisation problems (COPs) (Bello et al., 2016; Kool et al., 2018; Nazari et al., 2018). However, these approaches face two key challenges when applied to combinatorial problems: insufficient exploration, and the need for many training examples from the search space to achieve reasonable performance. Combinatorial optimisation can be complex, characterised by large search spaces with many optima. A new method is therefore needed that finds good solutions more sample-efficiently. This paper presents a new reinforcement learning approach based on entropy. In addition, we design an off-policy reinforcement learning technique that maximises the expected return and improves sample efficiency to achieve faster learning…
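As a rough illustration of what an entropy-based objective means for a routing policy, the sketch below adds an entropy bonus to the negative tour length. It is a generic maximum-entropy formulation, not the authors' algorithm: the function names, the weight `alpha`, and the per-step distributions are all hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    # Zero-probability entries contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_augmented_return(tour_length, step_distributions, alpha=0.1):
    """Negative tour length plus a weighted entropy bonus.

    tour_length: total length of the constructed route (lower is better).
    step_distributions: one probability vector per decoding step, i.e. the
        policy's distribution over the next node to visit (assumed input).
    alpha: hypothetical weight trading off reward against exploration.
    """
    reward = -tour_length
    bonus = alpha * sum(entropy(p) for p in step_distributions)
    return reward + bonus


# A policy that stays uncertain at a step earns a small bonus,
# while a fully deterministic step earns none.
uniform_step = [0.5, 0.5]
greedy_step = [1.0, 0.0]
value = entropy_augmented_return(10.0, [uniform_step, greedy_step], alpha=0.1)
```

With a larger `alpha` the policy is rewarded more for keeping its choices spread out, which is one common way to encourage exploration; how the paper weights or schedules this term is not stated in the abstract.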

Taxonomy

Topics: Vehicle Routing Optimization Methods · Robotic Path Planning Algorithms · Smart Parking Systems Research