Temporal Sampling for Forgotten Reasoning in LLMs

Yuetai Li; Zhangchen Xu; Fengqing Jiang; Bhaskar Ramasubramanian; Luyao Niu; Bill Yuchen Lin; Xiang Yue; Radha Poovendran

arXiv:2505.20196 · cs.AI · May 27, 2025

TL;DR

This paper identifies temporal forgetting, a phenomenon in which fine-tuned LLMs forget how to solve problems they answered correctly earlier in training, and proposes Temporal Sampling, a decoding strategy that recovers those solutions by drawing outputs from multiple checkpoints along the training trajectory.

Contribution

The paper introduces Temporal Sampling, a decoding method that recovers forgotten reasoning abilities in LLMs by drawing outputs from multiple training checkpoints, improving performance without retraining.
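The idea of drawing outputs from several checkpoints can be illustrated with a minimal sketch. It assumes a round-robin policy that starts from the latest checkpoint; the summary above does not specify how samples are allocated, so that policy is an assumption. The names `Sampler` and `temporal_sample` are hypothetical, and each checkpoint is modeled as a plain callable rather than a real model.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in: a "checkpoint" is any callable that maps a prompt
# to one sampled answer. A real setup would wrap a loaded model here.
Sampler = Callable[[str], str]


def temporal_sample(checkpoints: Sequence[Sampler], prompt: str, k: int) -> List[str]:
    """Draw k answers, cycling over checkpoints from latest to earliest.

    `checkpoints` is given in training order (earliest first). The
    round-robin allocation is an assumed policy for illustration only.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    ordered = list(reversed(checkpoints))  # latest checkpoint first
    return [ordered[i % len(ordered)](prompt) for i in range(k)]


# Usage with constant stand-in samplers.
early = lambda prompt: "early"
middle = lambda prompt: "middle"
final = lambda prompt: "final"
answers = temporal_sample([early, middle, final], "2 + 2 = ?", k=5)
```

With three checkpoints and `k=5`, the five samples come from the final, middle, early, final, and middle checkpoints in that order, so the sampling budget stays the same while its source is spread across training time.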

Findings

01

Temporal Sampling improves reasoning performance across several benchmarks, with gains of 4 to 19 points in Pass@k and consistent gains in Majority@k.

02

The method works for both full fine-tuning and LoRA-adapted models.

03

Temporal diversity in training checkpoints enhances LLM reasoning capabilities.

Abstract

Fine-tuning large language models (LLMs) is intended to improve their reasoning capabilities, yet we uncover a counterintuitive effect: models often forget how to solve problems they previously answered correctly during training. We term this phenomenon temporal forgetting and show that it is widespread across model sizes, fine-tuning methods (both Reinforcement Learning and Supervised Fine-Tuning), and multiple reasoning benchmarks. To address this problem, we introduce Temporal Sampling, a simple decoding strategy that draws outputs from multiple checkpoints along the training trajectory. This approach recovers forgotten solutions without retraining or ensembling and leads to substantial improvements in reasoning performance, with gains of 4 to 19 points in Pass@k and consistent gains in Majority@k across several benchmarks. We further extend our method to LoRA-adapted models, demonstrating…
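The quantities named in the abstract can be made concrete with a small sketch. It assumes per-checkpoint correctness is available as a boolean grid, and it uses simple empirical definitions: a problem counts as forgotten if the final checkpoint fails it but some earlier checkpoint solved it, Pass@k is true if any of the k samples matches the reference, and Majority@k is true if the most frequent sample matches. The paper may define its forgetting measure and estimators differently; all function names below are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def forgotten_fraction(correct: Sequence[Sequence[bool]]) -> float:
    """Fraction of problems the final checkpoint fails but an earlier one solved.

    correct[c][p] is True if checkpoint c solved problem p; rows are in
    training order, so the last row is the final checkpoint.
    """
    if not correct or not correct[-1]:
        raise ValueError("need at least one checkpoint and one problem")
    *earlier, final = correct
    forgotten = sum(
        1
        for p in range(len(final))
        if not final[p] and any(row[p] for row in earlier)
    )
    return forgotten / len(final)


def pass_at_k(samples: List[str], reference: str) -> bool:
    """Empirical Pass@k: at least one of the k samples is correct."""
    return reference in samples


def majority_at_k(samples: List[str], reference: str) -> bool:
    """Empirical Majority@k: the most frequent sample is correct."""
    top_answer, _ = Counter(samples).most_common(1)[0]
    return top_answer == reference


# Two checkpoints, three problems: problem 2 was solved early, then lost.
grid = [[True, False, True], [True, True, False]]
share = forgotten_fraction(grid)
```

In this toy grid one of three problems is forgotten, which is the kind of case where pooling samples from the earlier checkpoint can turn a Pass@k miss into a hit.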

Code & Models

Repositories

uw-nsl/temporal_forgetting (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Artificial Intelligence in Law

Methods: Adapter