Learning the Travelling Salesperson Problem Requires Rethinking Generalization

Chaitanya K. Joshi, Quentin Cappart, Louis-Martin Rousseau, Thomas Laurent

arXiv:2006.07054·cs.LG·May 26, 2022

TL;DR

This paper investigates how neural network solvers for the Travelling Salesperson Problem (TSP) can generalize to instances larger than those seen in training, by re-examining each stage of the end-to-end learning pipeline.

Contribution

It provides a comprehensive analysis of the factors affecting zero-shot generalization in neural TSP solvers and proposes guidelines for designing more scalable models.

Findings

01. Extrapolation to larger instances requires rethinking network design and training paradigms.

02. Certain model architectures and learning algorithms significantly improve generalization.

03. The study offers new directions for future research in neural combinatorial optimization.

Abstract

End-to-end training of neural network solvers for graph combinatorial optimization problems such as the Travelling Salesperson Problem (TSP) has seen a surge of interest recently, but remains intractable and inefficient beyond graphs with a few hundred nodes. While state-of-the-art learning-driven approaches for TSP perform close to classical solvers when trained on trivially small sizes, they are unable to generalize the learnt policy to larger instances at practical scales. This work presents an end-to-end neural combinatorial optimization pipeline that unifies several recent papers in order to identify the inductive biases, model architectures and learning algorithms that promote generalization to instances larger than those seen in training. Our controlled experiments provide the first principled investigation into such zero-shot generalization, revealing that extrapolating…
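
The zero-shot evaluation idea in the abstract can be illustrated with a minimal sketch: fix a tour-construction policy, apply it unchanged to instances of increasing size, and measure how far its tours fall behind a stronger reference. Everything below is an assumption made for illustration and is not the paper's code: a greedy nearest-neighbour heuristic stands in for the trained neural policy, a plain 2-opt local search stands in for a classical solver, and the function names, the instance sizes (20, 50, 100) and the uniform unit-square sampling are all hypothetical choices.

```python
import math
import random


def random_instance(n, rng):
    """Sample n cities uniformly in the unit square (assumed distribution)."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def tour_length(points, tour):
    """Total Euclidean length of a closed tour over `points`."""
    n = len(tour)
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n)
    )


def nearest_neighbour_tour(points):
    """Stand-in for a learned policy: greedy nearest-neighbour construction."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(points, tour):
    """Stand-in for a classical reference: 2-opt until no move improves."""
    tour = tour[:]
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share city tour[0]; nothing to swap
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                delta = (
                    math.dist(a, c) + math.dist(b, d)
                    - math.dist(a, b) - math.dist(c, d)
                )
                if delta < -1e-12:
                    # Reversing the segment replaces edges (a,b),(c,d) by (a,c),(b,d).
                    tour[i + 1 : j + 1] = reversed(tour[i + 1 : j + 1])
                    improved = True
    return tour


def mean_gap(n, num_instances=5, seed=0):
    """Average relative gap of the policy tour to the reference tour at size n."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(num_instances):
        pts = random_instance(n, rng)
        policy_tour = nearest_neighbour_tour(pts)
        policy_len = tour_length(pts, policy_tour)
        ref_len = tour_length(pts, two_opt(pts, policy_tour))
        gaps.append(policy_len / ref_len - 1.0)
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # The same fixed policy is applied to every size; nothing is re-tuned.
    for n in (20, 50, 100):
        print(n, round(mean_gap(n), 4))
```

In a real study the placeholder policy would be a model trained only on small instances, and the reference would be a strong classical solver; the sketch only shows the shape of the measurement, a gap reported per instance size.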
