Target-Aware Data Augmentation for SAT Prediction
Eshed Gal, Uri Ascher, Eldad Haber

TL;DR
This paper introduces a solver-free, target-aware data augmentation method for SAT problems and a specialized GNN architecture. The authors report orders-of-magnitude faster data generation and improved model performance.
Contribution
It presents a novel synthetic data generation framework aligned with target benchmarks and a GNN that leverages optimization structure, advancing learning on NP-hard problems.
Findings
Orders-of-magnitude faster data generation
Synthetic data effectively augments solver-labeled datasets
Model exploits underlying optimization structure
Abstract
Learning-based approaches to NP-hard problems have shown increasing promise, but their progress is fundamentally constrained by the high cost of generating labeled training data. In domains such as Boolean satisfiability (SAT), standard pipelines rely on solver-in-the-loop labeling, which scales poorly with problem size and limits the amount of usable supervision. This bottleneck hinders the broader goal of leveraging machine learning to capture structure in hard combinatorial problems. In this work, we propose a target-aware, solver-free data generation framework for SAT that produces correctly labeled SAT and UNSAT instances by construction, eliminating the need for expensive solver calls. Our method aligns generated instances with the structural properties of a target benchmark, making synthetic data effective for downstream learning. We further develop a linear-programming-aware…
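To make "correctly labeled by construction" concrete, here is a minimal Python sketch of two textbook constructions. It is not the authors' generator and does no target-aware alignment. The function names (`planted_sat_instance`, `unsat_instance`, `satisfies`) and all parameters are hypothetical, chosen for illustration. A SAT instance is built around a hidden ("planted") assignment, so its label is known without a solver. An UNSAT instance embeds all 2^k sign patterns over k chosen variables, which no assignment can satisfy.

```python
import itertools
import random


def satisfies(clauses, assignment):
    """Return True if every clause has at least one literal true under assignment."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )


def planted_sat_instance(num_vars, num_clauses, k=3, seed=0):
    """Build a k-CNF formula that is SAT by construction (planted solution)."""
    rng = random.Random(seed)
    # Hidden assignment that every accepted clause must satisfy.
    assignment = {v: rng.choice([True, False]) for v in range(1, num_vars + 1)}
    clauses = []
    while len(clauses) < num_clauses:
        variables = rng.sample(range(1, num_vars + 1), k)
        clause = [v if rng.random() < 0.5 else -v for v in variables]
        # Rejection step: keep only clauses the planted assignment satisfies.
        if any((lit > 0) == assignment[abs(lit)] for lit in clause):
            clauses.append(clause)
    return clauses, assignment


def unsat_instance(num_vars, num_clauses, k=3, seed=0):
    """Build a k-CNF formula that is UNSAT by construction (embedded core)."""
    rng = random.Random(seed)
    core_vars = rng.sample(range(1, num_vars + 1), k)
    # All 2^k sign patterns over the same k variables: every assignment
    # falsifies exactly one of these clauses, so the core is unsatisfiable.
    core = [
        [v if sign else -v for v, sign in zip(core_vars, signs)]
        for signs in itertools.product([True, False], repeat=k)
    ]
    filler = []
    for _ in range(max(0, num_clauses - len(core))):
        variables = rng.sample(range(1, num_vars + 1), k)
        filler.append([v if rng.random() < 0.5 else -v for v in variables])
    clauses = core + filler
    rng.shuffle(clauses)
    return clauses
```

Both labels come for free, at the cost of realism: planted instances are statistically biased toward the hidden assignment, and a full 2^k core is easy to detect. Closing that gap between synthetic and benchmark structure is the problem the abstract's target-aware alignment addresses.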
