CADO: From Imitation to Cost Minimization for Heatmap-based Solvers in Combinatorial Optimization

Hyungseok Song; Deunsol Yoon; Kanghoon Lee; Han-Seul Jeong; Soonyoung Lee; Woohyung Lim

arXiv:2602.08210 · cs.LG · February 10, 2026

TL;DR

This paper introduces CADO, a reinforcement learning framework that directly optimizes solution costs in heatmap-based combinatorial optimization, overcoming limitations of traditional imitation-based training methods.

Contribution

CADO is an RL fine-tuning approach that uses label-centered rewards and hybrid adaptation, and it achieves state-of-the-art results among heatmap-based solvers.

Findings

1. CADO outperforms existing methods across multiple benchmarks.
2. Objective alignment significantly improves solution quality.
3. Reinforcement learning effectively addresses the limitations of imitation learning.

Abstract

Heatmap-based solvers have emerged as a promising paradigm for Combinatorial Optimization (CO). However, we argue that the dominant Supervised Learning (SL) training paradigm suffers from a fundamental objective mismatch: minimizing imitation loss (e.g., cross-entropy) does not guarantee solution cost minimization. We dissect this mismatch into two deficiencies: Decoder-Blindness (being oblivious to the non-differentiable decoding process) and Cost-Blindness (prioritizing structural imitation over solution quality). We empirically demonstrate that these intrinsic flaws impose a hard performance ceiling. To overcome this limitation, we propose CADO (Cost-Aware Diffusion models for Optimization), a streamlined Reinforcement Learning fine-tuning framework that formulates the diffusion denoising process as an MDP to directly optimize the post-decoded solution cost. We introduce…
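The objective mismatch described in the abstract can be made concrete with a toy example. The sketch below is not CADO and not the authors' code: the 4-city TSP instance, the greedy decoder, and the heatmap values are all assumptions chosen for illustration. It compares two hypothetical heatmaps against the optimal-tour label and shows that the one with the lower cross-entropy can still decode to the more expensive tour, because a single misranked edge changes the output of the non-differentiable decoder.

```python
import math

def tour_cost(tour, coords):
    """Length of the closed tour that visits cities in the given order."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))

def tour_to_label(tour):
    """0/1 adjacency matrix (symmetric) of the edges used by a tour."""
    n = len(tour)
    label = [[0] * n for _ in range(n)]
    for i in range(n):
        a, b = tour[i], tour[(i + 1) % n]
        label[a][b] = label[b][a] = 1
    return label

def make_heatmap(label, p_edge, p_other):
    """Heatmap with score p_edge on label edges and p_other elsewhere."""
    n = len(label)
    return [[0.0 if i == j else (p_edge if label[i][j] else p_other)
             for j in range(n)] for i in range(n)]

def cross_entropy(heatmap, label, eps=1e-9):
    """Mean binary cross-entropy over off-diagonal entries (imitation loss)."""
    n = len(heatmap)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            p = min(max(heatmap[i][j], eps), 1.0 - eps)
            y = label[i][j]
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count

def greedy_decode(heatmap, start=0):
    """Assumed decoder: always move to the unvisited city with the highest score.
    The argmax makes this step non-differentiable."""
    n = len(heatmap)
    tour, visited = [start], {start}
    while len(tour) < n:
        cur = tour[-1]
        nxt = max((j for j in range(n) if j not in visited),
                  key=lambda j: heatmap[cur][j])
        tour.append(nxt)
        visited.add(nxt)
    return tour

# Toy instance (assumption): four cities on the corners of a unit square.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
label = tour_to_label([0, 1, 2, 3])  # optimal tour follows the square's sides

# Heatmap B: confident and almost identical to the label, but one directed
# entry (0 -> 2, a diagonal) is scored slightly above the true edges.
heat_b = make_heatmap(label, 0.9, 0.1)
heat_b[0][2] = 0.95

# Heatmap C: much less confident everywhere, but the edge ranking is correct.
heat_c = make_heatmap(label, 0.6, 0.4)

tour_b, tour_c = greedy_decode(heat_b), greedy_decode(heat_c)
print("B: loss", cross_entropy(heat_b, label), "cost", tour_cost(tour_b, coords))
print("C: loss", cross_entropy(heat_c, label), "cost", tour_cost(tour_c, coords))
```

In this toy setting, heatmap B scores better on the imitation loss yet yields the longer tour, while heatmap C scores worse on the loss and decodes to the optimal one. A cost-aware objective of the kind the abstract argues for would score each heatmap by the cost of its decoded tour (for example, a reward of `-tour_cost(...)`) rather than by its similarity to the label.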

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques