Towards Automated Functional Equation Proving: A Benchmark Dataset and A   Domain-Specific In-Context Agent

Mahdi Buali; Robert Hoehndorf

arXiv:2407.14521·cs.AI·July 23, 2024

Towards Automated Functional Equation Proving: A Benchmark Dataset and A Domain-Specific In-Context Agent

Mahdi Buali, Robert Hoehndorf

PDF

Open Access

TL;DR

This paper presents FEAS, an agent that improves automated functional equation proving in Lean by using domain-specific heuristics and a new benchmark dataset, demonstrating superior performance over baselines.

Contribution

Introduction of FEAS, a domain-specific in-context learning agent for functional equation proving, and the FunEq dataset for benchmarking ATP methods in this domain.

Findings

01

FEAS outperforms baseline methods on the FunEq dataset.

02

Incorporating domain-specific heuristics improves proof generation accuracy.

03

FEAS effectively formalizes high-level proof strategies into Lean.

Abstract

Automated Theorem Proving (ATP) faces challenges due to its complexity and computational demands. Recent work has explored using Large Language Models (LLMs) for ATP action selection, but these methods can be resource-intensive. This study introduces FEAS, an agent that enhances the COPRA in-context learning framework within Lean. FEAS refines prompt generation, response parsing, and incorporates domain-specific heuristics for functional equations. It introduces FunEq, a curated dataset of functional equation problems with varying difficulty. FEAS outperforms baselines on FunEq, particularly with the integration of domain-specific heuristics. The results demonstrate FEAS's effectiveness in generating and formalizing high-level proof strategies into Lean proofs, showcasing the potential of tailored approaches for specific ATP challenges.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsReservoir Engineering and Simulation Methods