Neural Combinatorial Optimization with Reinforcement Learning
Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, Samy Bengio

TL;DR
This paper introduces a reinforcement learning framework in which a recurrent neural network learns to solve combinatorial optimization problems such as TSP and Knapsack, reaching near-optimal solutions on 2D Euclidean TSP instances with up to 100 nodes with little hand-engineered heuristic design.
Contribution
It demonstrates that a neural network trained with a policy gradient method, using only negative tour length as the reward, can solve hard combinatorial problems with little problem-specific engineering, and that the same approach carries over from TSP to Knapsack.
Findings
Achieves near-optimal solutions for TSP on 2D Euclidean graphs with up to 100 nodes.
Obtains optimal solutions for Knapsack with up to 200 items.
Requires minimal engineering and heuristic design.
Abstract
This paper presents a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning. We focus on the traveling salesman problem (TSP) and train a recurrent network that, given a set of city coordinates, predicts a distribution over different city permutations. Using negative tour length as the reward signal, we optimize the parameters of the recurrent network using a policy gradient method. We compare learning the network parameters on a set of training graphs against learning them on individual test graphs. Despite the computational expense, without much engineering and heuristic designing, Neural Combinatorial Optimization achieves close to optimal results on 2D Euclidean graphs with up to 100 nodes. Applied to the KnapSack, another NP-hard problem, the same method obtains optimal solutions for instances with up to 200 items.
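The training signal described in the abstract can be illustrated with a minimal sketch: sample a tour, score it with negative tour length, and nudge the policy toward tours that beat a baseline. The code below is not the paper's model. It assumes a simple table of pairwise logits in place of the recurrent network, a batch-mean baseline, and hypothetical names (`tour_length`, `sample_tour`, `train`) and hyperparameters chosen only for illustration.

```python
import numpy as np

def tour_length(coords, perm):
    """Length of the closed tour that visits coords in the order given by perm."""
    ordered = coords[perm]
    diffs = ordered - np.roll(ordered, -1, axis=0)
    return float(np.linalg.norm(diffs, axis=1).sum())

def sample_tour(logits, rng):
    """Sample a permutation city by city, masking cities already visited.

    Assumption: logits[i, j] scores moving from city i to city j. This table
    stands in for the recurrent network described in the abstract.
    Returns the permutation and the gradient of its log-probability
    with respect to the logits.
    """
    n = logits.shape[0]
    visited = np.zeros(n, dtype=bool)
    perm = [0]                      # always start at city 0
    visited[0] = True
    grad = np.zeros_like(logits)
    for _ in range(n - 1):
        cur = perm[-1]
        masked = np.where(visited, -np.inf, logits[cur])
        probs = np.exp(masked - masked.max())
        probs /= probs.sum()
        nxt = int(rng.choice(n, p=probs))
        # Softmax identity: d log p(nxt) / d logits[cur] = onehot(nxt) - probs
        grad[cur] -= probs
        grad[cur, nxt] += 1.0
        perm.append(nxt)
        visited[nxt] = True
    return np.array(perm), grad

def train(coords, steps=300, batch=32, lr=0.1, seed=0):
    """REINFORCE on a single graph with reward = negative tour length."""
    rng = np.random.default_rng(seed)
    n = len(coords)
    logits = np.zeros((n, n))
    history = []
    for _ in range(steps):
        samples = [sample_tour(logits, rng) for _ in range(batch)]
        rewards = np.array([-tour_length(coords, p) for p, _ in samples])
        baseline = rewards.mean()   # assumed baseline: batch mean reward
        update = sum((r - baseline) * g for r, (_, g) in zip(rewards, samples))
        logits += lr * update / batch
        history.append(-baseline)   # mean sampled tour length at this step
    return logits, history
```

Calling `train(np.random.default_rng(1).random((10, 2)))` runs the loop on ten random cities in the unit square; `history` then records the mean sampled tour length per step. Because the logits are fit to one graph, this corresponds loosely to the abstract's setting of learning parameters on an individual test graph rather than on a set of training graphs.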
Taxonomy
Topics: Metaheuristic Optimization Algorithms Research
