Combinatorial Optimization with Policy Adaptation using Latent Space Search
Felix Chalumeau, Shikha Surana, Clement Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Thomas D. Barrett

TL;DR
This paper introduces COMPASS, an RL-based method that learns a diverse set of policies parameterized by a latent space and searches that space at inference time. It outperforms existing methods on standard combinatorial optimization benchmarks and generalizes well to new problem distributions.
Contribution
The paper presents COMPASS, a novel reinforcement learning approach that models a distribution of policies in a latent space for improved combinatorial optimization performance.
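To make the idea of searching over a latent space of policies concrete, here is a minimal toy sketch in Python. It is not the COMPASS architecture or training procedure: the scoring rule in `rollout` is a hand-written, untrained stand-in for a learned latent-conditioned policy, and the names `rollout`, `latent_search`, and `num_latents` are hypothetical. It only illustrates the inference-time pattern of conditioning one constructive heuristic on different latent vectors and keeping the best resulting tour on a small TSP instance.

```python
import math
import random


def tour_length(cities, tour):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def rollout(cities, z):
    """Greedy tour construction with a toy latent-conditioned scoring rule.

    Assumption: this replaces a learned policy. A candidate city's score is
    the negative distance from the current city plus a latent-dependent bias
    on its coordinates, so z = (0, 0) recovers plain nearest-neighbor.
    """
    tour = [0]
    unvisited = set(range(1, len(cities)))
    while unvisited:
        current = cities[tour[-1]]

        def score(j):
            x, y = cities[j]
            return -math.dist(current, (x, y)) + z[0] * x + z[1] * y

        nxt = max(unvisited, key=score)
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def latent_search(cities, num_latents=64, seed=0):
    """Sample latent vectors, roll out each one, and keep the best tour.

    The zero vector is always included, so the result is never worse than
    the nearest-neighbor baseline produced by rollout(cities, (0, 0)).
    """
    rng = random.Random(seed)
    candidates = [(0.0, 0.0)] + [
        (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        for _ in range(num_latents)
    ]
    best_z = min(candidates, key=lambda z: tour_length(cities, rollout(cities, z)))
    return best_z, rollout(cities, best_z)


if __name__ == "__main__":
    rng = random.Random(42)
    cities = [(rng.random(), rng.random()) for _ in range(20)]
    best_z, best_tour = latent_search(cities)
    baseline = tour_length(cities, rollout(cities, (0.0, 0.0)))
    print("nearest-neighbor:", baseline)
    print("best over latents:", tour_length(cities, best_tour))
```

Uniform random sampling of latent vectors is used here only for simplicity. A learned method would train the conditioned policy so that different regions of the latent space specialize, and could use a more directed search over that space.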
Findings
Outperforms state-of-the-art on 11 benchmark tasks
Generalizes better to procedurally transformed instances
Effective across TSP, VRP, and Job-Shop Scheduling
Abstract
Combinatorial Optimization underpins many real-world applications and yet, designing performant algorithms to solve these complex, typically NP-hard, problems remains a significant research challenge. Reinforcement Learning (RL) provides a versatile framework for designing heuristics across a broad spectrum of problem domains. However, despite notable progress, RL has not yet supplanted industrial solvers as the go-to solution. Current approaches emphasize pre-training heuristics that construct solutions but often rely on search procedures with limited variance, such as stochastically sampling numerous solutions from a single policy or employing computationally expensive fine-tuning of the policy on individual problem instances. Building on the intuition that performant search at inference time should be anticipated during pre-training, we propose COMPASS, a novel RL approach that…
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Vehicle Routing Optimization Methods · Metaheuristic Optimization Algorithms Research
Methods: Sparse Evolutionary Training
