Reinforcement Learning Driven Heuristic Optimization
Qingpeng Cai, Will Hang, Azalia Mirhoseini, George Tucker, Jingtao Wang, Wei Wei

TL;DR
This paper introduces RLHO, a reinforcement learning framework that enhances heuristic algorithms for combinatorial optimization by generating better initial solutions, leading to improved performance on problems like bin packing.
Contribution
The paper presents an RL-based method that learns to generate initial solutions for heuristic algorithms, using the heuristic's resulting performance as the learning signal. The framework is instantiated with Proximal Policy Optimization (PPO) and Simulated Annealing (SA).
Findings
RLHO outperforms baselines on bin packing
RL can learn to improve heuristic initial solutions
Combining RL with heuristics enhances optimization performance
Abstract
Heuristic algorithms such as simulated annealing, Concorde, and METIS are effective and widely used approaches to find solutions to combinatorial optimization problems. However, they are limited by the high sample complexity required to reach a reasonable solution from a cold-start. In this paper, we introduce a novel framework to generate better initial solutions for heuristic algorithms using reinforcement learning (RL), named RLHO. We augment the ability of heuristic algorithms to greedily improve upon an existing initial solution generated by RL, and demonstrate novel results where RL is able to leverage the performance of heuristics as a learning signal to generate better initialization. We apply this framework to Proximal Policy Optimization (PPO) and Simulated Annealing (SA). We conduct a series of experiments on the well-known NP-complete bin packing problem, and show that the…
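The loop described in the abstract (a learned policy proposes an initial solution, a heuristic refines it, and the refined result becomes the learning signal) might be sketched on a toy bin packing instance as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the single-item move set, the linear cooling schedule, and the reward definition (negative bin count after annealing) are all hypothetical, and the PPO policy itself is omitted; `init` stands in for whatever the policy would output.

```python
import math
import random


def bins_used(assignment):
    """Number of distinct bins in an item -> bin assignment."""
    return len(set(assignment))


def is_feasible(assignment, sizes, capacity):
    """Check that no bin's total load exceeds the capacity."""
    loads = {}
    for item, b in enumerate(assignment):
        loads[b] = loads.get(b, 0) + sizes[item]
    return all(load <= capacity for load in loads.values())


def simulated_annealing(init, sizes, capacity, steps=2000, t0=1.0, seed=0):
    """Refine a feasible assignment by moving single items between bins.

    Assumed design: linear cooling and a bin-count objective.
    """
    rng = random.Random(seed)
    current = list(init)
    best = list(current)
    n = len(sizes)
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # avoid division by zero
        item = rng.randrange(n)
        target = rng.randrange(n)  # candidate bin label in 0..n-1
        candidate = list(current)
        candidate[item] = target
        if not is_feasible(candidate, sizes, capacity):
            continue  # reject moves that overflow a bin
        delta = bins_used(candidate) - bins_used(current)
        # Accept improvements and sideways moves; accept worse moves
        # with a probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if bins_used(current) < bins_used(best):
                best = list(current)
    return best


def reward_for_initialization(init, sizes, capacity):
    """Hypothetical learning signal: negative bin count after annealing."""
    final = simulated_annealing(init, sizes, capacity)
    return -bins_used(final), final


# Usage: a cold start that places every item in its own bin.
sizes = [4, 3, 3, 2, 2, 1, 1]
capacity = 8
init = list(range(len(sizes)))
reward, final = reward_for_initialization(init, sizes, capacity)
print("bins after annealing:", bins_used(final), "reward:", reward)
```

In a full training loop, an RL policy would propose `init`, receive `reward`, and update its parameters so that later initializations lead the heuristic to better final solutions.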
Taxonomy
Topics: Scheduling and Optimization Algorithms · Optimization and Packing Problems · Reinforcement Learning in Robotics
