Self-Supervised Transformers as Iterative Solution Improvers for Constraint Satisfaction
Yudong W. Xu, Wenhao Li, Scott Sanner, Elias B. Khalil

TL;DR
This paper introduces ConsFormer, a self-supervised Transformer framework that iteratively refines candidate solutions to CSPs without requiring labeled feasible solutions or large training budgets. It is shown to be effective on Sudoku, Graph Coloring, Nurse Rostering, and MAXCUT.
Contribution
The paper proposes a self-supervised Transformer approach for CSPs that improves solutions iteratively. Training is guided by differentiable approximations to the discrete constraints, which avoids the need for feasible solutions as labels or for complex reward signals.
Findings
Effective on Sudoku, Graph Coloring, Nurse Rostering, and MAXCUT
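Handles out-of-distribution CSPs by running additional refinement iterations
Avoids the main limitations of prior approaches: the need for feasible solutions as labels (supervised learning) and for large training budgets with expert-designed reward signals (reinforcement learning)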
Abstract
We present a Transformer-based framework for Constraint Satisfaction Problems (CSPs). CSPs find use in many applications and thus accelerating their solution with machine learning is of wide interest. Most existing approaches rely on supervised learning from feasible solutions or reinforcement learning, paradigms that require either feasible solutions to these NP-Complete CSPs or large training budgets and a complex expert-designed reward signal. To address these challenges, we propose ConsFormer, a self-supervised framework that leverages a Transformer as a solution refiner. ConsFormer constructs a solution to a CSP iteratively in a process that mimics local search. Instead of using feasible solutions as labeled data, we devise differentiable approximations to the discrete constraints of a CSP to guide model training. Our model is trained to improve random assignments for a single step…
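The two ideas in the abstract, a differentiable stand-in for discrete constraints and repeated single-step refinement, can be illustrated with a small sketch on graph coloring. This is not the paper's implementation: the relaxation below (expected number of same-colored edges under independent per-node color distributions) is one common choice assumed here for illustration, and `refine_step` is a hypothetical greedy stand-in for one pass of a trained refiner, not a Transformer. All function names are invented for this example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_coloring_penalty(logits, edges):
    """Assumed relaxation: expected number of violated edges when each
    node picks a color independently from softmax(logits[node]).
    It is smooth in the logits, so it could serve as a training loss."""
    probs = [softmax(row) for row in logits]
    return sum(
        sum(pu * pv for pu, pv in zip(probs[u], probs[v]))
        for u, v in edges
    )

def hard_violations(colors, edges):
    """Discrete constraint check: edges whose endpoints share a color."""
    return sum(1 for u, v in edges if colors[u] == colors[v])

def refine_step(colors, edges, num_colors):
    """Hypothetical stand-in for one refinement pass: recolor the most
    conflicted node with the color that conflicts least with its neighbors."""
    conflicts = [0] * len(colors)
    for u, v in edges:
        if colors[u] == colors[v]:
            conflicts[u] += 1
            conflicts[v] += 1
    node = max(range(len(colors)), key=lambda i: conflicts[i])
    if conflicts[node] == 0:
        return colors  # already feasible

    def cost(c):
        return sum(
            1 for u, v in edges
            if (u == node and colors[v] == c) or (v == node and colors[u] == c)
        )

    updated = list(colors)
    updated[node] = min(range(num_colors), key=cost)
    return updated

def iterate(colors, edges, num_colors, max_steps=20):
    """Apply the single-step improver repeatedly, local-search style."""
    for _ in range(max_steps):
        if hard_violations(colors, edges) == 0:
            break
        colors = refine_step(colors, edges, num_colors)
    return colors

if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    start = [0, 0, 0]  # every edge violated
    final = iterate(start, triangle, num_colors=3)
    print(hard_violations(start, triangle), hard_violations(final, triangle))
```

In this sketch the soft penalty plays the role of a training signal, while `hard_violations` is what is checked when the improver is applied repeatedly; raising `max_steps` corresponds to spending more iterations on harder instances.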
Taxonomy
Topics: BIM and Construction Integration
Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Layer Normalization · Byte Pair Encoding · Dense Connections · Residual Connection · Label Smoothing · Multi-Head Attention · Position-Wise Feed-Forward Layer
