RocqSmith: Can Automatic Optimization Forge Better Proof Agents?

Andrei Kozyrev; Nikita Khramov; Denis Lochmelis; Valerio Morelli; Gleb Solovev; Anton Podkopaev

arXiv:2602.05762·cs.AI·February 6, 2026

TL;DR

This paper investigates whether automatic AI agent optimization methods can improve proof agents for formal verification in Rocq. Several optimizers yield measurable gains, and simple few-shot bootstrapping is the most consistently effective, but none matches a carefully engineered state-of-the-art proof agent.

Contribution

The paper evaluates several automatic agent optimizers on a Rocq proof-generation agent, showing that simple methods are effective and that a gap to expert-crafted agents remains.

Findings

01

Simple few-shot bootstrapping is most consistently effective.

02

Automatic methods improve performance, but none matches a carefully engineered, expert-designed agent.

03

Various optimizers yield measurable improvements.

Abstract

This work studies the applicability of automatic AI agent optimization methods to real-world agents in formal verification settings, focusing on automated theorem proving in Rocq as a representative and challenging domain. We evaluate how different automatic agent optimizers perform when applied to the task of optimizing a Rocq proof-generation agent, and assess whether parts of the fine-grained tuning of agentic systems, such as prompt design, contextual knowledge, and control strategies, can be automated. Our results show that while several optimizers yield measurable improvements, simple few-shot bootstrapping is the most consistently effective; however, none of the studied methods matches the performance of a carefully engineered state-of-the-art proof agent.
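The abstract names few-shot bootstrapping without describing its mechanics. As a rough illustration only, and not the paper's implementation, the general idea can be sketched as follows: run the agent on training goals, keep the attempts that a checker accepts, and reuse them as demonstrations in later prompts. Every name here (`bootstrap_demos`, `build_prompt`, `toy_generate`, `toy_check`) is hypothetical, and the toy generator and checker stand in for an LLM call and a Rocq proof checker.

```python
from typing import Callable, List, Tuple

# A demonstration pairs a goal with a proof the checker accepted.
Demo = Tuple[str, str]


def bootstrap_demos(
    train_goals: List[str],
    generate: Callable[[str, List[Demo]], str],
    check: Callable[[str, str], bool],
    max_demos: int = 4,
) -> List[Demo]:
    """Collect demonstrations from the agent's own accepted attempts."""
    demos: List[Demo] = []
    for goal in train_goals:
        if len(demos) >= max_demos:
            break
        # Demos gathered so far are passed along for later attempts.
        proof = generate(goal, demos)
        if check(goal, proof):
            demos.append((goal, proof))
    return demos


def build_prompt(goal: str, demos: List[Demo]) -> str:
    """Format the accepted demos as few-shot examples before the new goal."""
    parts = [f"Theorem: {g}\nProof: {p}" for g, p in demos]
    parts.append(f"Theorem: {goal}\nProof:")
    return "\n\n".join(parts)


# Toy stand-ins (assumptions): a real setup would call a model and a checker.
def toy_generate(goal: str, demos: List[Demo]) -> str:
    return "reflexivity." if "=" in goal else "auto."


def toy_check(goal: str, proof: str) -> bool:
    return proof == "reflexivity."


demos = bootstrap_demos(["1 + 1 = 2", "True", "0 = 0"], toy_generate, toy_check)
prompt = build_prompt("2 = 2", demos)
```

In this toy run only the two equality goals pass the checker, so the prompt for the new goal carries two demonstrations. The sketch leaves out anything specific to the optimizers the paper actually studies.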

Taxonomy

Topics: Logic, programming, and type systems · Formal Methods in Verification · Constraint Satisfaction and Optimization