Arbitrarily Large Labelled Random Satisfiability Formulas for Machine Learning Training

Dimitris Achlioptas; Amrit Daswaney; Periklis A. Papakonstantinou

arXiv:2211.15368·cs.AI·June 6, 2023

Open Access

TL;DR

This paper introduces a method to generate arbitrarily large, correctly labeled random SAT formulas for training machine learning models, enabling better generalization to real-world problem sizes.

Contribution

It presents a probabilistic method to generate large labeled SAT formulas without solving them, facilitating scalable training for machine learning models.

Findings

01. State-of-the-art models perform no better than random on large formulas.

02. A new classifier achieves 99% accuracy on large datasets.

03. Learning based on solver computation prefixes offers a novel approach.

Abstract

Applying deep learning to solve real-life instances of hard combinatorial problems has tremendous potential. Research in this direction has focused on the Boolean satisfiability (SAT) problem, both because of its theoretical centrality and practical importance. A major roadblock faced, though, is that training sets are restricted to random formulas of size several orders of magnitude smaller than formulas of practical interest, raising serious concerns about generalization. This is because labeling random formulas of increasing size rapidly becomes intractable. By exploiting the probabilistic method in a fundamental way, we remove this roadblock entirely: we show how to generate correctly labeled random formulas of any desired size, without having to solve the underlying decision problem. Moreover, the difficulty of the classification task for the formulas produced by our generator is…
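To make the labeling roadblock described in the abstract concrete, here is a minimal sketch of the naive pipeline: draw a uniformly random 3-CNF formula, then label it by exhaustive search. This is not the paper's generator, which avoids solving altogether; the function names `random_ksat` and `brute_force_label`, the DIMACS-style literal encoding, and the clause-to-variable ratio of 4.26 are all assumptions chosen for illustration.

```python
import itertools
import random


def random_ksat(num_vars, num_clauses, k=3, seed=None):
    """Return a uniformly random k-CNF formula as a list of clauses.

    Each clause is a tuple of k nonzero ints in DIMACS style:
    +v stands for variable v, -v for its negation. (Hypothetical helper.)
    """
    rng = random.Random(seed)
    formula = []
    for _ in range(num_clauses):
        # Pick k distinct variables, then negate each with probability 1/2.
        variables = rng.sample(range(1, num_vars + 1), k)
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula


def brute_force_label(formula, num_vars):
    """Return True (SAT) or False (UNSAT) by trying all 2**num_vars assignments.

    The exponential loop is the point of the sketch: this kind of labeling
    stops being feasible long before formulas reach practical sizes.
    """
    for bits in itertools.product((False, True), repeat=num_vars):
        # A literal is satisfied when its variable's value matches its sign.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in formula):
            return True
    return False


if __name__ == "__main__":
    n = 12  # small enough that 2**n assignments can be enumerated
    # Assumed ratio near the empirically observed hard region for random 3-SAT.
    formula = random_ksat(n, num_clauses=int(4.26 * n), seed=0)
    print(brute_force_label(formula, n))
```

Real pipelines would label with a complete SAT solver rather than enumeration, but the worst-case cost still grows rapidly with formula size, which is the limitation the abstract says the paper removes.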

Taxonomy

Topics: Machine Learning in Materials Science · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification