INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
Han Fang, Zhihao Song, Paul Weng, Yutong Ban

TL;DR
INViT is a new deep learning architecture that improves the generalization of routing problem solvers across different distributions and scales by using invariant views and a nested design.
Contribution
The paper introduces INViT, a novel transformer-based architecture with invariant nested views to enhance generalization in routing problem solvers.
Findings
INViT outperforms existing methods on TSP and CVRP with various distributions.
The architecture demonstrates strong scalability across different problem sizes.
Enhanced generalization is achieved through invariant views and data augmentation.
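As a rough illustration of why data augmentation suits routing problems: rotating or reflecting all node coordinates leaves every pairwise distance, and therefore every tour length, unchanged, so each transformed instance is a free extra training example with the same optimal tour. The summary above does not say which augmentations INViT uses, so the sketch below is only an assumption about one common choice; `tour_length` and `augment` are hypothetical helper names.

```python
import math
import random


def tour_length(coords, tour):
    """Length of the closed tour that visits coords in the given order."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % n]])
        for i in range(n)
    )


def augment(coords, angle, reflect=False):
    """Optionally reflect across the x-axis, then rotate about the origin.

    Assumption: this is one generic distance-preserving augmentation,
    not necessarily the one used by the paper.
    """
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in coords:
        if reflect:
            y = -y
        out.append((c * x - s * y, s * x + c * y))
    return out


random.seed(0)
coords = [(random.random(), random.random()) for _ in range(8)]
tour = list(range(8))

base = tour_length(coords, tour)
aug = tour_length(augment(coords, math.pi / 3, reflect=True), tour)

# The two lengths agree up to floating-point rounding.
print(abs(base - aug) < 1e-9)
```

Because the tour cost is identical across such transforms, a solver can also be evaluated on several transformed copies of one instance and the best resulting tour kept.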
Abstract
Recently, deep reinforcement learning has shown promising results for learning fast heuristics to solve routing problems. However, most of these solvers struggle to generalize to unseen distributions or to distributions with different scales. To address this issue, we propose a novel architecture, called Invariant Nested View Transformer (INViT), which enforces a nested design together with invariant views inside the encoders to promote the generalizability of the learned solver. It applies a modified policy gradient algorithm enhanced with data augmentations. We demonstrate that the proposed INViT achieves dominant generalization performance on both TSP and CVRP with various distributions and different problem scales.
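The abstract does not spell out how the nested, invariant views are built. One plausible reading, sketched here purely as an assumption, is that the solver looks at several k-nearest-neighbor subsets of the current node (each smaller subset contained in the larger ones) and rescales each subset into a unit box, so the encoder input does not change when the whole instance is shifted or uniformly scaled. The neighborhood sizes, the min-max normalization, and all function names below are hypothetical.

```python
import math


def knn_view(coords, center, candidates, k):
    """Indices of the k candidates nearest to the center node."""
    return sorted(
        candidates, key=lambda j: math.dist(coords[center], coords[j])
    )[:k]


def normalize(points):
    """Shift points to their bounding-box corner and divide by the longer side."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    side = max(max(xs) - x0, max(ys) - y0) or 1.0  # guard against a zero-size box
    return [((x - x0) / side, (y - y0) / side) for x, y in points]


def nested_views(coords, center, candidates, ks=(3, 5)):
    """Build one normalized local view per neighborhood size in ks."""
    views = []
    for k in ks:
        idx = knn_view(coords, center, candidates, k)
        pts = normalize([coords[center]] + [coords[j] for j in idx])
        views.append((idx, pts))
    return views


coords = [(0.1, 0.2), (0.4, 0.9), (0.5, 0.5), (0.8, 0.1),
          (0.3, 0.3), (0.9, 0.9), (0.6, 0.4)]
views = nested_views(coords, 0, range(1, 7))

# The same instance, scaled by 100 and shifted: the views should not change.
moved = [(10 + 100 * x, -5 + 100 * y) for x, y in coords]
moved_views = nested_views(moved, 0, range(1, 7))
```

In this sketch the smaller view is always a prefix of the larger one, which is what makes the views nested, and the normalized coordinates of `views` and `moved_views` agree up to rounding.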
Taxonomy
Topics: Advanced Optical Network Technologies · VLSI and FPGA Design Techniques · Vehicle Routing Optimization Methods
Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Adam · Residual Connection · Layer Normalization · Dense Connections · Position-Wise Feed-Forward Layer
