Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization
Quentin Cappart, Thierry Moisan, Louis-Martin Rousseau, Isabeau Prémont-Schwarz, Andre Cire

TL;DR
This paper introduces a hybrid approach that combines deep reinforcement learning (DRL) and constraint programming (CP) through a shared dynamic programming formulation, and applies it to the travelling salesman problem with time windows (TSPTW) and to portfolio optimization.
Contribution
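It presents a hybrid framework that integrates DRL and CP via a dynamic programming formulation. The framework targets two limitations of existing DRL approaches: they mainly focus on the standard travelling salesman problem, and they return approximate solutions with no systematic way to improve them or to prove optimality.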
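To make the idea of one dynamic programming model serving both search and learned guidance more concrete, here is a minimal sketch for the TSP with time windows. It is not the authors' implementation. The names `solve_tsptw` and `nearest_first` are hypothetical, a nearest-neighbour score stands in for a trained DRL model, and a plain time-window feasibility check plus cost bounding stands in for a CP solver's propagation. The sketch also assumes non-negative travel times and ignores any closing time at the depot.

```python
from typing import Callable, List, Optional, Tuple

Dist = List[List[float]]
Window = Tuple[float, float]


def nearest_first(dist: Dist, current: int, candidate: int) -> float:
    """Placeholder for a learned value-ordering heuristic (assumption):
    higher score means the candidate is tried earlier."""
    return -dist[current][candidate]


def solve_tsptw(
    dist: Dist,
    windows: List[Window],
    score: Callable[[Dist, int, int], float] = nearest_first,
) -> Tuple[float, Optional[List[int]]]:
    """Exhaustive depth-first search over the DP state
    (current city, visited set, current time), starting at city 0."""
    n = len(dist)
    best_cost = float("inf")
    best_tour: Optional[List[int]] = None

    def search(current: int, visited: int, time: float,
               cost: float, tour: List[int]) -> None:
        nonlocal best_cost, best_tour
        # Bound: with non-negative distances a partial tour that already
        # costs at least the incumbent cannot improve on it.
        if cost >= best_cost:
            return
        if len(tour) == n:
            total = cost + dist[current][0]  # close the tour at the depot
            if total < best_cost:
                best_cost, best_tour = total, tour + [0]
            return
        candidates = [j for j in range(n) if not visited & (1 << j)]
        # The heuristic only orders the branches; it never removes one,
        # so the search stays complete.
        candidates.sort(key=lambda j: score(dist, current, j), reverse=True)
        for j in candidates:
            # DP transition: travel to j, wait if the window is not open yet.
            arrival = max(time + dist[current][j], windows[j][0])
            if arrival > windows[j][1]:
                continue  # infeasible: the window has already closed
            search(j, visited | (1 << j), arrival,
                   cost + dist[current][j], tour + [j])

    search(0, 1, windows[0][0], 0.0, [0])
    return best_cost, best_tour
```

Because the heuristic only reorders branches, a good score function finds strong tours early and tightens the bound, while the exhaustive search can still prove optimality on small instances. On a tiny instance the function can be called as `solve_tsptw(dist, windows)`, where `dist` is a square travel-time matrix and `windows` holds one `(earliest, latest)` pair per city.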
Findings
Outperforms standalone RL and CP solutions.
Efficiently solves TSP with time windows and portfolio optimization.
Competitive with industrial solvers.
Abstract
Combinatorial optimization has found applications in numerous fields, from aerospace to transportation planning and economics. The goal is to find an optimal solution among a finite set of possibilities. The well-known challenge one faces with combinatorial optimization is the state-space explosion problem: the number of possibilities grows exponentially with the problem size, which makes solving intractable for large problems. In recent years, deep reinforcement learning (DRL) has shown its promise for designing good heuristics dedicated to solving NP-hard combinatorial optimization problems. However, current approaches have two shortcomings: (1) they mainly focus on the standard travelling salesman problem and cannot be easily extended to other problems, and (2) they only provide an approximate solution, with no systematic way to improve it or to prove optimality. In another…
Taxonomy
Topics: Constraint Satisfaction and Optimization
