Reinforcement Learning Methods for Neighborhood Selection in Local Search

Yannick Molinghen; Augustin Delecluse; Renaud De Landtsheer; Stefano Michelini

arXiv:2601.07948 · cs.LG · January 14, 2026

Open Access

TL;DR

This paper evaluates reinforcement learning strategies for neighborhood selection in local search algorithms across various combinatorial problems, highlighting their potential and practical limitations.

Contribution

It provides a comprehensive comparison of RL-based neighborhood selection methods, emphasizing the importance of reward design and analyzing their performance and computational costs.

Findings

01

$\epsilon$-greedy performs consistently well.

02

Deep RL methods are computationally intensive and less practical.

03

Reward function design critically affects learning stability.
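To make the first finding concrete, here is a minimal sketch of an $\epsilon$-greedy bandit that picks which neighborhood to explore next. It is an illustration under stated assumptions, not the authors' implementation: the class name `EpsilonGreedySelector`, the default `epsilon=0.1`, and the running-mean value estimate are all hypothetical choices made for this example.

```python
import random


class EpsilonGreedySelector:
    """Epsilon-greedy bandit over a fixed set of neighborhoods (arms).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """

    def __init__(self, n_neighborhoods, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_neighborhoods    # times each arm was tried
        self.values = [0.0] * n_neighborhoods  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        # With probability epsilon, explore a uniformly random neighborhood;
        # otherwise exploit the neighborhood with the best mean reward so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a local search loop, `select()` would be called once per iteration to choose a neighborhood, and `update()` would be called with the reward observed after exploring it.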

Abstract

Reinforcement learning has recently gained traction as a means to improve combinatorial optimization methods, yet its effectiveness within local search metaheuristics specifically remains comparatively underexamined. In this study, we evaluate a range of reinforcement learning-based neighborhood selection strategies -- multi-armed bandits (upper confidence bound, $\epsilon$-greedy) and deep reinforcement learning methods (proximal policy optimization, double deep $Q$-network) -- and compare them against multiple baselines across three different problems: the traveling salesman problem, the pickup and delivery problem with time windows, and the car sequencing problem. We show how search-specific characteristics, particularly large variations in cost due to constraint violation penalties, necessitate carefully designed reward functions to provide stable and informative learning signals.…
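The abstract's point about cost swings from constraint violation penalties can be illustrated with a small sketch. The function below is one possible way to keep the learning signal bounded, assuming the reward is the relative cost improvement clipped to a fixed range. The name `bounded_reward`, the `clip` parameter, and the normalization by the previous cost are assumptions for this example; the reward functions actually studied in the paper are not shown in this excerpt.

```python
def bounded_reward(cost_before, cost_after, clip=1.0):
    """Relative cost improvement, clipped to [-clip, clip].

    Hypothetical reward shaping for illustration only. Clipping keeps a
    single large penalty jump from dominating the learner's estimates.
    """
    # Guard against division by zero when the previous cost is (near) zero.
    scale = max(abs(cost_before), 1e-9)
    relative_gain = (cost_before - cost_after) / scale
    return max(-clip, min(clip, relative_gain))
```

For instance, a move that takes the cost from 100 to 90 yields a reward of about 0.1, while a move that triggers a large penalty and raises the cost from 100 to 1,000,000 is capped at -1.0 rather than producing a reward near -10,000.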

Taxonomy

Topics: Metaheuristic Optimization Algorithms Research · Vehicle Routing Optimization Methods · Reinforcement Learning in Robotics