Algorithmic Language Models with Neurally Compiled Libraries

Lucas Saldyt; Subbarao Kambhampati

arXiv:2407.04899·cs.AI·May 27, 2025

TL;DR

This paper proposes augmenting large language models with differentiable libraries of fundamental operations and algorithms, enabling more robust reasoning and planning capabilities by directly compiling algorithms into the model architecture.

Contribution

It introduces a method for integrating differentiable algorithms into LLMs, enhancing their ability to perform algorithmic tasks without learning from scratch.

Findings

01. Feasibility demonstrated with small transformers on simple tasks

02. Augmentation improves reasoning capabilities

03. Direct compilation of algorithms into models

Abstract

Important tasks such as reasoning and planning are fundamentally algorithmic, meaning that solving them robustly requires acquiring true reasoning or planning algorithms, rather than shortcuts. Large Language Models lack true algorithmic ability primarily because of the limitations of neural network optimization algorithms, their optimization data and optimization objective, but also due to architectural inexpressivity. To solve this, our paper proposes augmenting LLMs with a library of fundamental operations and sophisticated differentiable programs, so that common algorithms do not need to be learned from scratch. We add memory, registers, basic operations, and adaptive recurrence to a transformer architecture built on LLaMA3. Then, we define a method for directly compiling algorithms into a differentiable starting library, which is used natively and propagates gradients for…
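To make the idea of a differentiable library concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the three-operation `LIBRARY`, the softmax selection rule, and the names `soft_apply`, `grad_wrt_logits`, and `train` are all assumptions introduced for illustration. The sketch only shows the general principle that a soft, weighted choice over fixed operations lets gradients flow to the selection weights, so the operations themselves do not have to be learned from scratch.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical starting library of fixed (non-learned) operations.
LIBRARY = [
    ("add", lambda a, b: a + b),
    ("sub", lambda a, b: a - b),
    ("mul", lambda a, b: a * b),
]

def soft_apply(logits, a, b):
    """Softmax-weighted mixture of every library operation applied to (a, b)."""
    weights = softmax(logits)
    return sum(w * op(a, b) for w, (_, op) in zip(weights, LIBRARY))

def grad_wrt_logits(logits, a, b):
    """Analytic gradient of soft_apply with respect to each selection logit.

    For out = sum_i w_i * o_i with w = softmax(logits),
    d(out)/d(logit_i) = w_i * (o_i - out).
    """
    weights = softmax(logits)
    outs = [op(a, b) for _, op in LIBRARY]
    mixed = sum(w * o for w, o in zip(weights, outs))
    return [w * (o - mixed) for w, o in zip(weights, outs)]

def train(logits, a, b, target, lr=0.05, steps=500):
    """Gradient descent on squared error; only the selection logits change."""
    logits = list(logits)
    for _ in range(steps):
        err = soft_apply(logits, a, b) - target
        grads = grad_wrt_logits(logits, a, b)
        logits = [l - lr * 2.0 * err * g for l, g in zip(logits, grads)]
    return logits

if __name__ == "__main__":
    # Assumed toy task: find which operation maps (3, 4) to 12.
    learned = train([0.0, 0.0, 0.0], a=3.0, b=4.0, target=12.0)
    best = max(range(len(LIBRARY)), key=lambda i: learned[i])
    print("selected operation:", LIBRARY[best][0])
    print("soft output:", soft_apply(learned, 3.0, 4.0))
```

In this toy setting the only trainable parameters are the three selection logits, which is the sense in which a precompiled library can be "used natively" while still propagating gradients. A real system would select operations per step and per register, which this sketch does not attempt.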

Peer Reviews

Decision · Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 3

Strengths

Addresses an important, well-motivated problem. The background section is comprehensive and an interesting read. The paper proposes an interesting approach: using a differentiable interpreter and preparing a library of programs to choose from by compiling symbolic programs into differentiable versions.

Weaknesses

Poor introduction: most of the introduction, apart from the very last paragraph, is dedicated to motivating the work. The very last paragraph has one sentence that describes the methodology. Section 3, Methodology: The authors should make clear what their contributions are. I am left with the impression that the majority of this section (apart from 3.4) consists of ideas from a previous paper that are just restated here. If this is the case, it should be stated more clearly. In itself, 3.4 is very br

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- Reasoning, arithmetic, and algorithmic abilities are still a weak spot of LLMs. 'Algorithmic Language Models with Neurally Compiled Libraries' suggests an interesting and promising approach to improving the capabilities of LLMs.
- The authors analyse the impact of recursion depth on trainability on a Fibonacci dataset.
- They create a library of differentiable modules and augment an LLM with them.

Weaknesses

- Neurally Compiled Library is primarily experimental work. Therefore, I believe the work would greatly benefit from extending its evaluation to more, preferably public, datasets. More detail about the experiments performed (for example, what dataset was used for arithmetic testing on page 8) would also make it more interesting.
- It would be helpful to have baselines with and without differentiable modules (i.e., Figure 3, Table 2).
- Model background section would benefit from either added citations or clear

Reviewer 03 · Rating 3 · Confidence 5

Strengths

The authors' proposal is definitely original, the paper outlines it in a mostly clear manner, and there is reason to believe that such augmentations, once refined and properly scaled, could indeed prove to be invaluable in making foundation models capable of algorithmic reasoning.

Weaknesses

Ultimately, the authors' proposal does not seem to work well enough given the evaluations they present, and by their own admission their paper is more of an initial proof of concept (and a limited one at that) than a practical demonstration of the soundness of their approach. In this regard, I cannot provide any more suggestions for improvement than the authors already do in section 6; the paper in its present state is ultimately more suited to be a workshop publication than a main confere

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Lib