PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization

André Hottung; Mridul Mahajan; Kevin Tierney

arXiv:2402.14048·cs.LG·October 7, 2025·1 cites

TL;DR

PolyNet is a novel reinforcement learning approach that learns multiple complementary solution strategies with a single decoder, enhancing exploration and solution quality in combinatorial optimization problems.

Contribution

PolyNet introduces a single-decoder method that implicitly learns diverse strategies without handcrafted rules, improving exploration in combinatorial optimization.

Findings

01. PolyNet outperforms methods with explicit diversity enforcement.

02. It effectively explores the solution space across four problems.

03. PolyNet achieves better solutions than existing approaches.

Abstract

Reinforcement learning-based methods for constructing solutions to combinatorial optimization problems are rapidly approaching the performance of human-designed algorithms. To further narrow the gap, learning-based approaches must efficiently explore the solution space during the search process. Recent approaches artificially increase exploration by enforcing diverse solution generation through handcrafted rules; however, these rules can impair solution quality and are difficult to design for more complex problems. In this paper, we introduce PolyNet, an approach for improving exploration of the solution space by learning complementary solution strategies. In contrast to other works, PolyNet uses only a single decoder and a training schema that does not enforce diverse solution generation through handcrafted rules. We evaluate PolyNet on four combinatorial optimization problems and…
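The abstract gives the idea only at a high level, so the following is a minimal sketch rather than the authors' implementation. It combines two hints from the reviewer comments on this page: additional bit vectors fed into the decoder through linear layers, an activation, and a residual connection, and a training process similar to Poppy. The function names (`bit_vectors`, `conditioned_residual`, `winner_mask`), the layer shapes, and the rule that only the best rollout per instance receives a training signal are all assumptions made for illustration.

```python
import numpy as np


def bit_vectors(k, n_bits):
    """Return k distinct binary vectors of length n_bits (needs k <= 2**n_bits)."""
    if k > 2 ** n_bits:
        raise ValueError("k exceeds the number of distinct bit vectors")
    ids = np.arange(k)
    # Bit j of each integer id becomes entry j of its vector.
    return ((ids[:, None] >> np.arange(n_bits)) & 1).astype(np.float64)


def conditioned_residual(h, v, w1, b1, w2, b2):
    """Hypothetical conditioning layer: h + relu([h; v] W1 + b1) W2 + b2.

    h: (k, d) decoder states, v: (k, n_bits) bit vectors,
    w1: (d + n_bits, m), b1: (m,), w2: (m, d), b2: (d,).
    """
    x = np.concatenate([h, v], axis=-1)
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU activation
    return h + hidden @ w2 + b2            # residual connection


def winner_mask(costs):
    """Mark, for each instance (row), the rollout (column) with the lowest cost."""
    mask = np.zeros(costs.shape, dtype=bool)
    mask[np.arange(costs.shape[0]), costs.argmin(axis=1)] = True
    return mask
```

In this sketch, each of the `k` rollouts for an instance is conditioned on a different bit vector, and `winner_mask` selects the rollouts that would contribute to the loss under the assumed winner-takes-all rule. Because no rule forces the rollouts apart, any diversity has to come from the conditioning itself.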

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

- The use of a single-decoder model that can learn multiple strategies simplifies the training pipeline and reduces computational overhead compared to multi-decoder systems.
- The model's training schema, which emphasizes inherent diversity, makes it adaptable to various CO problems beyond simple routing tasks.
- The authors provide useful insights into the impact of their design choices, such as the effectiveness of not forcing diverse first moves.

Weaknesses

- While starting from pre-trained models boosts training efficiency, this reliance could limit applicability in cases where such pre-training is not feasible or available.
- While PolyNet demonstrates strong results in routing tasks, its effectiveness in other CO problem domains (e.g., scheduling or knapsack) has not been deeply explored.
- The paper does not discuss how PolyNet handles instances where input data is noisy or incomplete, a common occurrence in practical applications.
- The imp

Reviewer 02 · Rating 3 · Confidence 5

Strengths

1. The paper is well-structured, presenting its content in a clear manner.
2. A wide range of experiments are conducted to assess solution diversity.

Weaknesses

1. The method of inserting additional vectors in the decoder is very similar to the idea of COMPASS [1], lacking sufficient innovation to significantly advance the field.
2. The description of bit vectors is minimal, and there is no detailed analysis of how varying vector representations might impact performance.
3. The fairness of comparing COMPASS with PolyNet+EAS in an experimental context is questionable.
4. In [1], there is a big difference between COMPASS and EAS runtimes, but in this pape

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. The authors have clearly articulated the framework and implementation of the method, with smooth writing.
2. The tables and figures are presented clearly, and the experiments are comprehensive.
3. Testing PolyNet on various combinatorial optimization problems (TSP, CVRP, CVRPTW, FFSP) demonstrates the good generality of the method.
4. (As claimed by the authors) their method performs exceptionally well on instances of scale 100, 200, and 300 (Tables 1-3).

Weaknesses

PolyNet can enhance the diversity and optimality of solution sets, but the authors seem to lack an in-depth discussion on the principles behind the additional layers of PolyNet. This leaves me somewhat puzzled. The training process described in the paper adopts a method similar to Poppy, and the added linear layers, activation functions, and residual connections are common components in the Transformer architecture. The motivation for significant improvements by merely concatenating an additiona

Videos

PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization · SlidesLive

Taxonomy

Topics: Metaheuristic Optimization Algorithms Research · Scheduling and Timetabling Solutions · BIM and Construction Integration