Data-Efficient Learning with Neural Programs
Alaia Solko-Breslin, Seewon Choi, Ziyang Li, Neelay Velingker, Rajeev Alur, Mayur Naik, Eric Wong

TL;DR
This paper introduces ISED, an algorithm for learning neural programs that efficiently estimates gradients of black-box components using only input-output data, enabling effective training of composite models involving LLMs and neurosymbolic systems.
Contribution
The paper presents ISED, a novel gradient estimation algorithm for neural programs that works with black-box components and improves data efficiency over existing methods.
Findings
ISED achieves comparable accuracy to state-of-the-art neurosymbolic frameworks.
ISED is more data- and sample-efficient than prior gradient approximation methods.
The evaluation covers new benchmarks that call LLMs such as GPT-4, as well as benchmarks from the neurosymbolic learning literature.
Abstract
Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, the learning for neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that only relies on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymbolic learning literature. Our…
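To make the idea of relying only on input-output samples of a black-box component concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation of ISED: the function names, the two-digit addition task, and the aggregation rule (summing the joint probability of each sampled input under the output it produces) are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def black_box_program(a, b):
    # Hypothetical black box. It could equally be an API call to an LLM;
    # the learner only observes its output and never differentiates it.
    return a + b


def sample_output_distribution(probs_a, probs_b, num_samples=100, seed=0):
    """Estimate a distribution over program outputs from sampled inputs.

    probs_a and probs_b stand in for the DNN's predicted distributions
    over two input symbols (assumed here to be small integers).
    """
    rng = random.Random(seed)
    symbols_a = list(range(len(probs_a)))
    symbols_b = list(range(len(probs_b)))
    scores = defaultdict(float)
    for _ in range(num_samples):
        # Sample one symbol per input from the predicted distributions.
        a = rng.choices(symbols_a, weights=probs_a)[0]
        b = rng.choices(symbols_b, weights=probs_b)[0]
        # Run the black box on the sampled symbols.
        y = black_box_program(a, b)
        # Credit the observed output with the joint probability of the sample.
        scores[y] += probs_a[a] * probs_b[b]
    # Normalize so the scores form a distribution over observed outputs.
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}
```

In an automatic-differentiation framework, `probs_a` and `probs_b` would be tensors produced by the DNN, so the score assigned to the labeled output would be differentiable with respect to the DNN parameters even though `black_box_program` is not.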
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Algorithms
Methods: Attention Is All You Need · Softmax · Focus · Layer Normalization · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Multi-Head Attention
