Reinforcement Learning Methods for Neighborhood Selection in Local Search
Yannick Molinghen, Augustin Delecluse, Renaud De Landtsheer, Stefano Michelini

TL;DR
This paper evaluates reinforcement learning strategies for neighborhood selection in local search algorithms across various combinatorial problems, highlighting their potential and practical limitations.
Contribution
It provides a comprehensive comparison of RL-based neighborhood selection methods, emphasizing the importance of reward design and analyzing their performance and computational costs.
Findings
ε-greedy performs consistently well.
Deep RL methods are computationally intensive and less practical.
Reward function design critically affects learning stability.
Abstract
Reinforcement learning has recently gained traction as a means to improve combinatorial optimization methods, yet its effectiveness within local search metaheuristics specifically remains comparatively underexamined. In this study, we evaluate a range of reinforcement learning-based neighborhood selection strategies -- multi-armed bandits (upper confidence bound, ε-greedy) and deep reinforcement learning methods (proximal policy optimization, double deep Q-network) -- and compare them against multiple baselines across three different problems: the traveling salesman problem, the pickup and delivery problem with time windows, and the car sequencing problem. We show how search-specific characteristics, particularly large variations in cost due to constraint violation penalties, necessitate carefully designed reward functions to provide stable and informative learning signals.…
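To make the bandit setting concrete, here is a minimal sketch of ε-greedy neighborhood selection, where each arm stands for one neighborhood of the local search. It is an illustration under stated assumptions, not the authors' implementation: the class `EpsilonGreedySelector` and the function `bounded_reward` are hypothetical names, and the reward shown (relative improvement clipped to [0, 1]) is only one plausible way to keep large penalty-driven cost swings from dominating the learning signal. The reward design actually used in the paper is not given in this summary.

```python
import random


class EpsilonGreedySelector:
    """Epsilon-greedy bandit over a fixed set of neighborhoods (arms).

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # times each neighborhood was tried
        self.values = [0.0] * n_arms  # running mean reward per neighborhood
        self.rng = random.Random(seed)

    def select(self):
        # Explore: pick a random neighborhood with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        # Exploit: pick the neighborhood with the best mean reward so far
        # (ties resolve to the lowest index).
        return self.values.index(max(self.values))

    def update(self, arm, reward):
        # Incremental update of the running mean for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def bounded_reward(old_cost, new_cost):
    """Assumed reward: relative cost improvement, clipped to [0, 1].

    Clipping keeps a single large constraint-violation penalty from
    producing an extreme reward value.
    """
    if old_cost <= 0:
        return 0.0
    return max(0.0, min(1.0, (old_cost - new_cost) / old_cost))
```

In a search loop, one would call `select()` to choose a neighborhood, apply it to the current solution, and then pass `bounded_reward(old_cost, new_cost)` to `update()`.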
Taxonomy
Topics: Metaheuristic Optimization Algorithms Research · Vehicle Routing Optimization Methods · Reinforcement Learning in Robotics
