Learning How to Cube

Ferhat Erata; Sam Kouteili; Thanos Typaldos; Timos Antonopoulos; Robert B. Jones; Byron Cook; Ruzica Piskac

arXiv:2605.16632·cs.LG·May 19, 2026

Learning How to Cube

Ferhat Erata, Sam Kouteili, Thanos Typaldos, Timos Antonopoulos, Robert B. Jones, Byron Cook, Ruzica Piskac

PDF

TL;DR

This paper demonstrates that transformer-based models can learn effective cubing heuristics for SAT solving, outperforming some symbolic methods through a novel neuro-symbolic training framework.

Contribution

Introduces a neuro-symbolic post-training framework with MCTS-based data curation and preference optimization for SAT cubing heuristics in transformers.

Findings

01

A 4B-parameter model achieves pass@5 score of 53 on SAT benchmarks.

02

Supervised fine-tuning improves performance from 46 to 51, DPO adds 2 more benchmarks.

03

Transformers can effectively learn cubing decisions traditionally dominated by symbolic methods.

Abstract

Despite the effectiveness of Cube-and-Conquer (C&C) for solving challenging Boolean Satisfiability (SAT) problems, no prior work has shown that transformer-based models can learn effective cubing heuristics. We introduce a neuro-symbolic post-training framework for this task. We design an MCTS-based data curation pipeline that uses symbolic heuristics to explore splitting decisions over SAT competition formulas, producing preference data grounded in solver statistics and augmented with reasoning traces from a teacher model. Our two-stage post-training, supervised fine-tuning (SFT) followed by direct preference optimization (DPO), enables a 4B-parameter model to achieve a pass@5 score of 53 on 100 SAT competition benchmarks, surpassing frontier LLMs such as Claude-Sonnet-4 (50) and matching the best symbolic heuristic (53). Ablations show that SFT alone improves pass@5 from 46 to 51,…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.