babble: Learning Better Abstractions with E-Graphs and Anti-Unification
David Cao, Rose Kunkel, Chandrakana Nandi, Max Willsey, Zachary Tatlock, Nadia Polikarpova

TL;DR
BABBLE combines e-graphs with anti-unification to discover reusable library functions more efficiently, which speeds up library learning and makes it more robust to complex, syntactically varied inputs.
Contribution
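Introduces library learning modulo theory (LLMT), a library learning algorithm that takes a domain-specific equational theory as an additional input and uses e-graphs and anti-unification to make program compression more scalable and more robust to syntactic variation.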
Findings
BABBLE achieves orders of magnitude faster compression than previous methods.
BABBLE learns reusable functions on larger, more complex inputs than prior library learning work scales to.
The approach effectively handles syntactic variations in program inputs.
Abstract
Library learning compresses a given corpus of programs by extracting common structure from the corpus into reusable library functions. Prior work on library learning suffers from two limitations that prevent it from scaling to larger, more complex inputs. First, it explores too many candidate library functions that are not useful for compression. Second, it is not robust to syntactic variation in the input. We propose library learning modulo theory (LLMT), a new library learning algorithm that additionally takes as input an equational theory for a given problem domain. LLMT uses e-graphs and equality saturation to compactly represent the space of programs equivalent modulo the theory, and uses a novel e-graph anti-unification technique to find common patterns in the corpus more directly and efficiently. We implemented LLMT in a tool named BABBLE. Our evaluation shows that BABBLE…
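To make the two ideas in the abstract concrete, here is a minimal Python sketch of anti-unification over plain terms, followed by a brute-force stand-in for "modulo theory". It is an illustration under stated assumptions, not BABBLE's implementation: terms are nested tuples of the form (operator, child, ...), the only equation in the theory is commutativity of "+", and the equivalent programs are enumerated explicitly, whereas the paper represents them compactly in an e-graph. All function names (anti_unify, commute_variants, concrete_size, best_pattern) are hypothetical.

```python
from itertools import count, product


def anti_unify(a, b, subst=None, fresh=None):
    """Return the most specific pattern that matches both terms a and b."""
    if subst is None:
        subst = {}
    if fresh is None:
        fresh = count()
    # Identical subterms are kept unchanged.
    if a == b:
        return a
    # Same operator and arity: keep the operator, recurse into the children.
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        children = tuple(anti_unify(x, y, subst, fresh)
                         for x, y in zip(a[1:], b[1:]))
        return (a[0],) + children
    # Mismatch: abstract it into a pattern variable, reusing the same
    # variable whenever the same pair of subterms appears again.
    if (a, b) not in subst:
        subst[(a, b)] = f"?x{next(fresh)}"
    return subst[(a, b)]


def commute_variants(t):
    """Enumerate terms equal to t under commutativity of '+' (assumed theory)."""
    if not isinstance(t, tuple):
        return [t]
    op, *args = t
    results = []
    for combo in product(*(commute_variants(c) for c in args)):
        results.append((op,) + combo)
        if op == "+" and len(combo) == 2:
            results.append((op, combo[1], combo[0]))
    return results


def concrete_size(p):
    """Count the nodes of a pattern that are not pattern variables."""
    if isinstance(p, tuple):
        return 1 + sum(concrete_size(c) for c in p[1:])
    return 0 if isinstance(p, str) and p.startswith("?") else 1


def best_pattern(t1, t2):
    """Anti-unify every pair of equivalent variants, keep the most concrete."""
    candidates = (anti_unify(x, y)
                  for x in commute_variants(t1)
                  for y in commute_variants(t2))
    return max(candidates, key=concrete_size)


# Shared structure with a repeated mismatch becomes one reused variable.
p1 = anti_unify(("+", ("*", "a", 2), ("*", "a", 2)),
                ("+", ("*", "b", 2), ("*", "b", 2)))

# Syntactic variation: purely syntactic anti-unification loses the constant 1,
# while searching modulo commutativity recovers it.
syntactic = anti_unify(("+", "a", 1), ("+", 1, "b"))
modulo = best_pattern(("+", "a", 1), ("+", 1, "b"))
```

Here p1 is the pattern ("+", ("*", "?x0", 2), ("*", "?x0", 2)), which could be extracted as a one-argument library function. The second pair shows why syntactic variation matters: the purely syntactic result abstracts both arguments of "+", while the search over commuted variants keeps the shared constant. Explicit enumeration grows exponentially with term size, which is the cost that a compact e-graph representation is meant to avoid.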
