ARS: Automatic Routing Solver with Large Language Models

Kai Li; Fei Liu; Zhenkun Wang; Xialiang Tong; Xiongwei Han; Mingxuan Yuan; Qingfu Zhang

arXiv:2502.15359·cs.AI·May 20, 2025

TL;DR

This paper introduces ARS, an LLM-based automatic routing solver that generates constraint-aware heuristics, significantly improving the efficiency and effectiveness of solving complex real-world vehicle routing problems.

Contribution

The paper presents RoutBench, a comprehensive VRP benchmark, and ARS, an LLM-powered solver that automatically creates heuristics for complex VRPs, addressing a wide range of real-world constraints.

Findings

1. ARS solves 91.67% of VRPs in benchmarks
2. ARS outperforms existing LLM-based methods and traditional solvers
3. Achieves at least 30% improvement across benchmarks

Abstract

Real-world Vehicle Routing Problems (VRPs) are characterized by a variety of practical constraints, making manual solver design both knowledge-intensive and time-consuming. Although there is increasing interest in automating the design of routing algorithms, existing research has explored only a limited array of VRP variants and fails to adequately address the complex and prevalent constraints encountered in real-world situations. To fill this gap, this paper introduces RoutBench, a benchmark of 1,000 VRP variants derived from 24 attributes, for evaluating the effectiveness of automatic routing solvers in addressing complex constraints. Along with RoutBench, we present the Automatic Routing Solver (ARS), which employs Large Language Model (LLM) agents to enhance a backbone algorithm framework by automatically generating constraint-aware heuristic code, based on problem descriptions and…
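The division of labor described in the abstract, a fixed backbone search plus generated constraint-aware code, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the names `make_capacity_checker`, `improving_move`, and `relocate_search` are hypothetical, the checker is hand-written rather than LLM-generated, and a simple relocate local search stands in for whatever backbone ARS actually uses.

```python
from typing import Callable, List, Optional, Sequence

Route = List[int]  # customer indices; the depot (index 0) is implied at both ends


def make_capacity_checker(demand: Sequence[float], capacity: float) -> Callable[[Route], bool]:
    """Hypothetical stand-in for a generated constraint checker (capacity only)."""
    def check(route: Route) -> bool:
        return sum(demand[c] for c in route) <= capacity
    return check


def route_length(route: Route, dist: Sequence[Sequence[float]]) -> float:
    """Length of depot -> customers -> depot."""
    path = [0] + route + [0]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


def total_length(routes: List[Route], dist: Sequence[Sequence[float]]) -> float:
    return sum(route_length(r, dist) for r in routes)


def improving_move(routes: List[Route], dist, check) -> Optional[List[Route]]:
    """Return the first feasible relocate move that shortens the plan, else None."""
    current = total_length(routes, dist)
    for i, src in enumerate(routes):
        for pos in range(len(src)):
            for j, dst in enumerate(routes):
                if i == j:
                    continue
                for ins in range(len(dst) + 1):
                    cand = [list(r) for r in routes]
                    cand[j].insert(ins, cand[i].pop(pos))
                    # The backbone only asks the checker "is this route feasible?"
                    if (check(cand[i]) and check(cand[j])
                            and total_length(cand, dist) < current - 1e-9):
                        return cand
    return None


def relocate_search(routes: List[Route], dist, check) -> List[Route]:
    """Constraint-agnostic backbone: apply improving relocate moves until none remain."""
    routes = [list(r) for r in routes]
    while (better := improving_move(routes, dist, check)) is not None:
        routes = better
    return routes
```

Because the backbone touches constraints only through `check`, supporting a different variant in this sketch means swapping in a different checker while the search loop stays unchanged.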

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 4·Confidence 3

Strengths

- The numerical experiments convincingly show that combining RAG-based constraint code generation with existing heuristics outperforms the baseline approach of prompting LLMs directly.
- The proposed approach of translating natural-language constraints into executable programs has strong potential to simplify the process of developing constraint-specific heuristic algorithms.

Weaknesses

- The ablation study shows that leveraging the constraint database significantly improves constraint satisfaction. However, in practical applications, problem constraints are not always well-studied or included in such databases. Thus, the generality of the proposed method may be limited when dealing with novel or previously unseen constraints.
- Although the framework aims to extend existing local search heuristics to handle diverse constraints, the paper does not analyze the scalability of th

Reviewer 02·Rating 4·Confidence 4

Strengths

• The paper releases RoutBench: 1,000 VRP variants (each with NL description, data, and validation code), which represents a broad benchmark contribution.
• The ablations show each component of the framework matters, indicating the effectiveness of the design.

Weaknesses

• The evaluation is based on the correctness/coverage of the per-instance validation code. It is not reliable enough to ensure that the generated program works for a class of VRP. If a checker under-specifies edge cases, SR can be overstated.
• Best-Known Solutions (BKS) for RoutBench are produced by ARS itself under strict stops. It seems that this method cannot ensure the actual (near)optimal solution, and thus leads to a benchmark circularity risk.
• The superior performance partly reflect

Reviewer 03·Rating 4·Confidence 4

Strengths

1. The paper's primary strength is the innovative design of the ARS framework, which intelligently separates the general solver backbone from the LLM-generated, problem-specific heuristic components. This is a clever and effective way to combine the reasoning power of LLMs with the proven search capabilities of metaheuristics.
2. The introduction of RoutBench is a major contribution in its own right. It provides a large-scale, diverse, and well-structured testbed for evaluating the generalizati

Weaknesses

1. While the paper proposes the ARS framework, its originality is limited. The framework's RAG component utilizes existing technology, the "checker" and "scorer" steps are based on established ideas from heuristic VRP solvers, and the subsequent heuristic algorithm is also a pre-existing method. Overall, the lack of substantial novel content is the paper's most significant weakness.
2. The paper relies on a single-point-based search framework. It is unclear how the LLM-generated Constraint-Awar

Code & Models

Repositories

Ahalikai/ARS-Routbench·Official
