Efficient Decoder Scaling Strategy for Neural Routing Solvers

Qing Luo; Fu Luo; Ke Li; Zhenkun Wang

arXiv:2603.00430 · cs.LG · March 3, 2026

TL;DR

This paper systematically investigates how scaling decoder depth versus decoder width affects the performance of neural routing solvers, and finds that increasing depth is more beneficial than increasing width for vehicle routing problems.

Contribution

The paper presents a systematic analysis of decoder scaling strategies across 12 model configurations (1M to ~150M parameters), demonstrating the advantages of depth scaling over width scaling in neural routing models.

Findings

01. Scaling depth improves performance more than scaling width.

02. Parameter count alone does not predict model effectiveness.

03. Design principles for efficient model scaling are validated.

Abstract

Construction-based neural routing solvers, typically composed of an encoder and a decoder, have emerged as a promising approach for solving vehicle routing problems. While recent studies suggest that shifting parameters from the encoder to the decoder enhances performance, most works restrict the decoder size to 1-3M parameters, leaving the effects of scaling largely unexplored. To address this gap, we conduct a systematic study comparing two distinct strategies: scaling depth versus scaling width. We synthesize these strategies to construct a suite of 12 model configurations, spanning a parameter range from 1M to ~150M, and extensively evaluate their scaling behaviors across three critical dimensions: parameter efficiency, data efficiency, and compute efficiency. Our empirical results reveal that parameter count is insufficient to accurately predict the model performance, highlighting…
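
To illustrate why parameter count alone can hide architectural differences, the sketch below counts weights for a decoder built from standard Transformer layers. The layer layout (four attention projections plus a feed-forward block with a 4x hidden expansion) and the two example configurations are assumptions made for illustration; they are not taken from the paper, and the function name `decoder_params` is hypothetical.

```python
def decoder_params(depth: int, width: int, ffn_mult: int = 4) -> int:
    """Approximate weight count for `depth` standard Transformer layers.

    Assumed per-layer layout (not the paper's exact architecture):
      - attention: Q, K, V, and output projections, 4 * width^2 weights
      - feed-forward: two matrices, 2 * ffn_mult * width^2 weights
    Biases, normalization layers, and embeddings are ignored.
    """
    attention = 4 * width * width
    feed_forward = 2 * ffn_mult * width * width
    return depth * (attention + feed_forward)


# Two hypothetical configurations chosen to land on the same budget.
deep_narrow = decoder_params(depth=24, width=256)
shallow_wide = decoder_params(depth=6, width=512)

print(f"deep/narrow : {deep_narrow:,} weights")
print(f"shallow/wide: {shallow_wide:,} weights")
```

Under these assumptions both configurations have the same number of weights, because the count grows linearly with depth and quadratically with width. A comparison of depth scaling against width scaling therefore has to look at the shape of the model, not only its size.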

Taxonomy

Topics: VLSI and FPGA Design Techniques · Advanced Neural Network Applications · Vehicle Routing Optimization Methods