Time-Reversal Provides Unsupervised Feedback to LLMs
Yerram Varun, Rahul Madhavan, Sravanti Addepalli, Arun Suggala, Karthikeyan Shanmugam, Prateek Jain

TL;DR
This paper introduces Time Reversed Language Models (TRLMs), which score and generate queries conditioned on responses, i.e. in the reverse direction of time. They provide unsupervised feedback that improves LLM tasks such as re-ranking, citation, retrieval, and safety filtering.
Contribution
The paper presents TRLMs trained in reverse token order, demonstrating their ability to complement forward models and enhance various LLM applications with unsupervised feedback.
Findings
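Up to 5% improvement on the AlpacaEval leaderboard when TRLM scores are used to re-rank forward generations.
TRLM scoring (query given response) outperforms forward scoring (response given query) in response ranking.
Significant reduction in the false-negative rate of safety filters.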
Abstract
Large Language Models (LLMs) are typically trained to predict in the forward direction of time. However, recent works have shown that prompting these models to look back and critique their own generations can produce useful feedback. Motivated by this, we explore the question of whether LLMs can be empowered to think (predict and score) backwards to provide unsupervised feedback that complements forward LLMs. Towards this, we introduce Time Reversed Language Models (TRLMs), which can score and generate queries when conditioned on responses, effectively functioning in the reverse direction of time. Further, to effectively infer in the response to query direction, we pre-train and fine-tune a language model (TRLM-Ba) in the reverse token order from scratch. We show empirically (and theoretically in a stylized setting) that time-reversed models can indeed complement forward model…
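The re-ranking idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, whitespace splitting stands in for a real tokenizer, and `toy_reverse_score` is a word-overlap placeholder for what would, in a real TRLM, be the log-probability of the query given the response.

```python
from typing import Callable, List


def build_reverse_sequence(query: str, response: str) -> List[str]:
    """Reverse the token order of a (query, response) pair.

    In forward order the sequence is query tokens followed by response tokens.
    After reversal, the reversed response comes first, so a model trained on
    such sequences sees the response as context and predicts the query.
    Whitespace tokenization is an assumption made for illustration only.
    """
    forward_tokens = query.split() + response.split()
    return forward_tokens[::-1]


def toy_reverse_score(query: str, response: str) -> float:
    """Placeholder scorer (NOT a TRLM): fraction of query words found in the response.

    A real reverse scorer would return log P(query | response) from a model
    trained in reverse token order.
    """
    query_words = set(query.lower().split())
    response_words = set(response.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & response_words) / len(query_words)


def rerank_by_reverse_score(
    query: str,
    candidates: List[str],
    reverse_score: Callable[[str, str], float],
) -> List[str]:
    """Order candidate responses from best to worst under a query-given-response score."""
    return sorted(candidates, key=lambda r: reverse_score(query, r), reverse=True)


if __name__ == "__main__":
    query = "what is the capital of france"
    candidates = ["I do not know", "the capital of france is paris"]
    print(build_reverse_sequence("a b", "c d"))
    print(rerank_by_reverse_score(query, candidates, toy_reverse_score))
```

The scorer is passed in as a callable so that the placeholder can be swapped for a model-based score without changing the re-ranking logic.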
Taxonomy
Topics: Advanced Materials Characterization Techniques · Geophysical Methods and Applications · Mineral Processing and Grinding
