Target-Aware Data Augmentation for SAT Prediction

Eshed Gal; Uri Ascher; Eldad Haber

arXiv:2605.06931·cs.LG·May 11, 2026

TL;DR

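This paper introduces a solver-free, target-aware data augmentation method for SAT problems and a specialized GNN architecture. Data generation is reported to be orders of magnitude faster than solver-based labeling, and the synthetic data improves model performance when used to augment solver-labeled datasets.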

Contribution

The paper presents a synthetic data generation framework whose instances are aligned with the structural properties of a target benchmark, and a GNN that exploits the optimization structure underlying SAT.

Findings

01 Orders-of-magnitude faster data generation

02 Synthetic data effectively augments solver-labeled datasets

03 Model exploits underlying optimization structure

Abstract

Learning-based approaches to NP-hard problems have shown increasing promise, but their progress is fundamentally constrained by the high cost of generating labeled training data. In domains such as Boolean satisfiability (SAT), standard pipelines rely on solver-in-the-loop labeling, which scales poorly with problem size and limits the amount of usable supervision. This bottleneck hinders the broader goal of leveraging machine learning to capture structure in hard combinatorial problems. In this work, we propose a target-aware, solver-free data generation framework for SAT that produces correctly labeled SAT and UNSAT instances by construction, eliminating the need for expensive solver calls. Our method aligns generated instances with the structural properties of a target benchmark, making synthetic data effective for downstream learning. We further develop a linear-programming-aware…
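
To make the idea of "labeled by construction" concrete, here is a minimal Python sketch. It is not the authors' method: it uses two textbook constructions as stand-ins, a planted assignment for SAT instances and an embedded complete clause set for UNSAT instances, and it makes no attempt at the target-benchmark alignment the abstract describes. The function names `planted_sat`, `embedded_unsat`, and `satisfies` are hypothetical, and clauses use DIMACS-style signed integers.

```python
import itertools
import random


def satisfies(clauses, assignment):
    """True if every clause has at least one literal made true by assignment."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )


def planted_sat(num_vars, num_clauses, k=3, seed=0):
    """Assumed stand-in: SAT instance built around a hidden (planted) assignment."""
    rng = random.Random(seed)
    assignment = {v: rng.choice([True, False]) for v in range(1, num_vars + 1)}
    clauses = []
    while len(clauses) < num_clauses:
        variables = rng.sample(range(1, num_vars + 1), k)
        clause = [v if rng.random() < 0.5 else -v for v in variables]
        # Keep only clauses the planted assignment satisfies, so the
        # SAT label is guaranteed without calling a solver.
        if satisfies([clause], assignment):
            clauses.append(clause)
    return clauses, assignment


def embedded_unsat(num_vars, num_clauses, k=3, seed=0):
    """Assumed stand-in: UNSAT instance containing all 2^k sign patterns on k variables."""
    rng = random.Random(seed)
    core_vars = rng.sample(range(1, num_vars + 1), k)
    # Any assignment falsifies exactly one of these 2^k clauses,
    # so the formula is unsatisfiable regardless of the filler clauses.
    core = [
        [v if positive else -v for v, positive in zip(core_vars, signs)]
        for signs in itertools.product([True, False], repeat=k)
    ]
    filler = []
    for _ in range(max(0, num_clauses - len(core))):
        variables = rng.sample(range(1, num_vars + 1), k)
        filler.append([v if rng.random() < 0.5 else -v for v in variables])
    clauses = core + filler
    rng.shuffle(clauses)
    return clauses
```

Both generators run in time linear in the number of clauses, which is where the speed advantage over solver-in-the-loop labeling comes from. The constructions here are deliberately naive: planted instances are statistically biased toward the hidden assignment, and a complete embedded core is easy to spot, so a practical generator would need more care to resemble a target benchmark.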
