TEMPO: Temporal Enforcement via Mode-Separated Policy Optimization for Trustworthy LLM Backtesting

Zeyu Zhang; Bradly C. Stadie

arXiv:2605.18843·cs.LG·May 20, 2026

TL;DR

TEMPO is a training method that enforces temporal discipline in large language models: the model learns instance-specific temporal reasoning strategies, which reduces post-cutoff knowledge leakage and improves task accuracy.

Contribution

The paper proposes TEMPO, which combines a two-mode reward with a GRPO-based training pipeline to reduce knowledge leakage and improve temporal compliance in LLM backtesting.
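For readers unfamiliar with GRPO, the sketch below shows the group-relative advantage step that GRPO-style training is generally built on: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group. This is a generic illustration, not TEMPO's implementation. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the example reward values are all assumptions made here; this summary does not specify how the two-mode reward is computed.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group (GRPO-style).

    `rewards` holds one scalar reward per response sampled for the SAME prompt.
    Hypothetical helper: the name, `eps`, and the choice of population
    standard deviation are assumptions for illustration only.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four sampled responses to one prompt.
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
```

Responses scoring above their group's mean receive a positive advantage and those below receive a negative one, so the policy update depends on relative quality within the group rather than on a learned value function.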

Findings

01. Leakage reduced from 2-13% to 0.6-3.7% across tasks.

02. Task performance improved by 6-13% with TEMPO.

03. Training converges monotonically to leak-free solutions.

Abstract

Backtesting large language models on historical events requires reasoning exclusively from information available before a specified cutoff date. Yet models routinely leak post-cutoff knowledge from pre-training into their reasoning, inflating apparent accuracy and undermining evaluation validity. Prompt-based constraints fail when suppressed content is causally related to the prediction, and knowledge unlearning cannot address this problem because temporal compliance is instance-specific: the same fact may be legitimate evidence for one cutoff date and a violation for another. Rather than erasing knowledge, the model must learn temporal discipline: selecting evidence conditioned on each instance's cutoff date. We propose TEMPO (Temporal Enforcement via Mode-separated Policy Optimization), which trains this discipline via two contributions: (1) a two-mode reward where a leakage mode…
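The abstract's point that temporal compliance is instance-specific can be made concrete with a minimal sketch: the same fact is admissible under one cutoff date and a leak under another, so no fixed set of "forbidden" facts exists to unlearn. The `Fact` class, the `known_on` field, the `is_admissible` helper, and the example fact and dates are all hypothetical and introduced here purely for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    """Hypothetical container for a piece of evidence used in reasoning."""
    text: str
    known_on: date  # assumed: the date the fact became publicly available


def is_admissible(fact: Fact, cutoff: date) -> bool:
    """A fact is usable only if it was available strictly before the cutoff."""
    return fact.known_on < cutoff


# One made-up fact, evaluated against two different backtesting instances.
fact = Fact("Company X announced its Q3 results", date(2023, 10, 26))

leaks_for_early_cutoff = not is_admissible(fact, date(2023, 9, 1))
valid_for_late_cutoff = is_admissible(fact, date(2024, 1, 1))
```

Because admissibility depends on the pair (fact, cutoff) rather than on the fact alone, the check has to be applied per instance, which is the behavior the abstract calls temporal discipline.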
