Library Learning Doesn't: The Curious Case of the Single-Use "Library"

Ian Berlot-Attwell; Frank Rudzicz; Xujie Si

arXiv:2410.20274·cs.LG·October 29, 2024

TL;DR

This paper investigates whether current LLM library learning systems for mathematical reasoning actually learn reusable tools. It finds that they rarely do, and that the reported performance gains stem mainly from self-correction and self-consistency rather than from reuse.

Contribution

The study critically evaluates two library learning systems, LEGO-Prover and TroVE, and shows that learned tools are rarely reused on miniF2F and MATH. Follow-up ablations suggest that the accuracy gains come primarily from self-correction and self-consistency.

Findings

1. Function reuse is extremely infrequent in the evaluated systems.
2. Self-correction and self-consistency drive most performance improvements.
3. Current library learning methods may not effectively learn reusable tools.
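For readers unfamiliar with the term, self-consistency in its generic form means sampling several candidate answers and keeping the most frequent one. The following is a minimal sketch of that idea only. It is not the selection procedure used by LEGO-Prover or TroVE, and the function name `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent non-None answer, or None if every sample failed.

    Generic self-consistency sketch (an assumption for illustration),
    not the exact procedure of any system discussed here.
    """
    # Drop failed samples, e.g. programs that crashed and produced no answer.
    valid = [a for a in answers if a is not None]
    if not valid:
        return None
    # most_common(1) returns a one-element list of (answer, count) pairs.
    return Counter(valid).most_common(1)[0][0]


# Hypothetical answers from five sampled solutions to one problem.
samples = ["12", "15", "12", None, "12"]
final_answer = majority_vote(samples)
print(final_answer)
```

Because voting alone can raise accuracy without any tool being reused, an ablation that keeps the voting step but removes the library is one way to separate the two effects.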

Abstract

Advances in Large Language Models (LLMs) have spurred a wave of LLM library learning systems for mathematical reasoning. These systems aim to learn a reusable library of tools, such as formal Isabelle lemmas or Python programs, that are tailored to a family of tasks. Many of these systems are inspired by the human structuring of knowledge into reusable and extendable concepts, but do current methods actually learn reusable libraries of tools? We study two library learning systems for mathematics, both of which reported increased accuracy: LEGO-Prover and TroVE. We find that function reuse is extremely infrequent on miniF2F and MATH. Our follow-up ablation experiments suggest that, rather than reuse, self-correction and self-consistency are the primary drivers of the observed performance gains. Our code and data are available at https://github.com/ikb-a/curious-case
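To make the notion of "function reuse" concrete, here is one possible way to count it for Python-based tools. This is only a sketch under stated assumptions: the helper names `called_names` and `reuse_counts`, the example library, and the rule that a function counts as reused when two or more solutions call it are all hypothetical and may differ from the measurement in the paper.

```python
import ast
from collections import Counter


def called_names(source):
    """Collect names of functions called directly (e.g. `f(x)`) in a Python snippet."""
    tree = ast.parse(source)
    return {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }


def reuse_counts(library_names, solutions):
    """Count, for each library function, how many solutions call it at least once."""
    counts = Counter({name: 0 for name in library_names})
    for source in solutions:
        for name in called_names(source) & set(library_names):
            counts[name] += 1
    return counts


# Hypothetical learned library and three hypothetical generated solutions.
library = ["gcd_of_list", "is_prime"]
solutions = [
    "print(is_prime(7))",
    "x = sum([1, 2, 3])\nprint(x)",
    "print(is_prime(11) and is_prime(13))",
]

counts = reuse_counts(library, solutions)
# Assumption: a function is "reused" if two or more solutions call it.
reused = [name for name, n in counts.items() if n >= 2]
print(dict(counts), reused)
```

A function that is only ever called by the solution that created it would get a count of one here, which is the "single-use" pattern the title refers to.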

Code & Models

Repositories

ikb-a/curious-case (official)
