General Intelligence Requires Reward-based Pretraining

Seungwook Han; Jyothish Pari; Samuel J. Gershman; Pulkit Agrawal

arXiv:2502.19402·cs.LG·August 27, 2025

Open Access

TL;DR

This paper argues that achieving artificial general intelligence requires disentangling reasoning from knowledge in language models, proposing reward-based pretraining, curriculum learning, and generalizable reasoning functions to improve transferability and robustness.

Contribution

It introduces a novel framework that separates reasoning from knowledge in LLMs, using RL pretraining, synthetic curricula, and small context windows for better generalization.

Findings

01

RL pretraining improves reasoning transferability.

02

Synthetic task curricula facilitate learning a reasoning prior.

03

Small context windows reduce reliance on spurious correlations.

Abstract

Large Language Models (LLMs) have demonstrated impressive real-world utility, exemplifying artificial useful intelligence (AUI). However, their ability to reason adaptively and robustly -- the hallmarks of artificial general intelligence (AGI) -- remains fragile. While LLMs seemingly succeed in commonsense reasoning, programming, and mathematics, they struggle to generalize algorithmic understanding across novel contexts. Our experiments with algorithmic tasks in esoteric programming languages reveal that LLMs' reasoning overfits to the training data and is limited in its transferability. We hypothesize that the core issue underlying such limited transferability is the coupling of reasoning and knowledge in LLMs. To transition from AUI to AGI, we propose disentangling knowledge and reasoning through three key directions: (1) pretraining to reason using RL from scratch as an alternative…

Taxonomy

Topics: Multimodal Machine Learning Applications · Machine Learning in Materials Science · Natural Language Processing Techniques