CircuitBuilder: From Polynomials to Circuits via Reinforcement Learning
Weikun K. Zhang, Rohan Pandey, Bhaumik Mehta, Kaijie Jin, Naomi Morato, Archit Ganapule, Michael Ruofan Zeng, Jarod Alper

TL;DR
This paper introduces a reinforcement learning framework for synthesizing arithmetic circuits that compute polynomials, and compares two RL approaches, SAC and PPO+MCTS, on circuit discovery tasks.
Contribution
It formulates polynomial circuit synthesis as a single-player reinforcement learning game with a fixed budget of operations, and compares PPO+MCTS and SAC alongside an AlphaZero-style training loop.
Findings
SAC outperforms other methods on two-variable targets.
PPO+MCTS scales to three variables and improves on complex instances.
Polynomial circuit synthesis appears to be a compact, verifiable setting for studying self-improving search policies.
Abstract
Motivated by auto-proof generation and Valiant's VP vs. VNP conjecture, we study the problem of discovering efficient arithmetic circuits to compute polynomials, using addition and multiplication gates. We formulate this problem as a single-player game, where an RL agent attempts to build the circuit within a fixed number of operations. We implement an AlphaZero-style training loop and compare two approaches: Proximal Policy Optimization with Monte Carlo Tree Search (PPO+MCTS) and Soft Actor-Critic (SAC). SAC achieves the highest success rates on two-variable targets, while PPO+MCTS scales to three variables and demonstrates steady improvement on harder instances. These results suggest that polynomial circuit synthesis is a compact, verifiable setting for studying self-improving search policies.
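The single-player game described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `CircuitGame`, the sparse-dictionary polynomial encoding, the starting nodes (the constant 1 plus each variable), and the 0/1 reward are all assumptions made for the example.

```python
from collections import defaultdict

# Polynomials are dicts mapping exponent tuples to integer coefficients,
# e.g. x^2 + 2xy + y^2 in two variables is {(2,0): 1, (1,1): 2, (0,2): 1}.

def poly_add(p, q):
    out = defaultdict(int)
    for mono, c in list(p.items()) + list(q.items()):
        out[mono] += c
    return {m: c for m, c in out.items() if c != 0}

def poly_mul(p, q):
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            out[mono] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

class CircuitGame:
    """Hypothetical environment: build a circuit for `target` in at most `max_ops` gates."""

    def __init__(self, target, num_vars=2, max_ops=4):
        self.target = target
        self.max_ops = max_ops
        # Assumed starting nodes: the constant 1, then one node per variable.
        self.nodes = [{(0,) * num_vars: 1}]
        for i in range(num_vars):
            mono = tuple(1 if k == i else 0 for k in range(num_vars))
            self.nodes.append({mono: 1})
        self.steps = 0

    def legal_actions(self):
        # An action picks a gate type and two existing nodes (order-insensitive).
        n = len(self.nodes)
        return [(op, i, j) for op in ("add", "mul")
                for i in range(n) for j in range(i, n)]

    def step(self, action):
        op, i, j = action
        gate = poly_add if op == "add" else poly_mul
        self.nodes.append(gate(self.nodes[i], self.nodes[j]))
        self.steps += 1
        solved = self.nodes[-1] == self.target
        done = solved or self.steps >= self.max_ops
        # Assumed sparse reward: 1 when the new node equals the target, else 0.
        return (1.0 if solved else 0.0), done

# Usage: (x + y)^2 needs two gates, one addition and one multiplication.
game = CircuitGame({(2, 0): 1, (1, 1): 2, (0, 2): 1}, num_vars=2, max_ops=4)
game.step(("add", 1, 2))                  # node 3 = x + y
reward, done = game.step(("mul", 3, 3))   # node 4 = (x + y)^2
```

Because every node is an exact polynomial, success is checked by plain equality with the target, which is what makes the setting verifiable. An agent such as PPO+MCTS or SAC would choose among `legal_actions()` at each step.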
Taxonomy
Topics: Complexity and Algorithms in Graphs · Artificial Intelligence in Games · Formal Methods in Verification
