Lightweight Latent Reasoning for Narrative Tasks
Alexander Gurung, Nikolay Malkin, Mirella Lapata

TL;DR
LiteReason is a lightweight latent reasoning method for large language models that shortens reasoning traces and lowers computational cost on narrative tasks, while keeping performance close to that of non-latent RL training.
Contribution
The paper presents LiteReason, a novel, efficient latent reasoning method that can be integrated with standard token sampling and reinforcement learning for narrative tasks.
Findings
Outperforms existing latent reasoning baselines.
Reduces reasoning length by 77-92%.
Achieves performance close to that of non-latent RL training.
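To make the reported reduction concrete, the short calculation below applies the 77-92% range to a hypothetical baseline trace. The 1000-token baseline and the helper name `remaining_tokens` are assumptions for illustration only. The text above does not state trace lengths, nor does it confirm that length is measured in tokens.

```python
def remaining_tokens(baseline_tokens: int, reduction_pct: int) -> int:
    """Length left after cutting a trace by reduction_pct percent."""
    return baseline_tokens * (100 - reduction_pct) // 100

# Hypothetical baseline: a 1000-token reasoning trace (not a figure from the paper).
baseline = 1000
shortest = remaining_tokens(baseline, 92)  # 80
longest = remaining_tokens(baseline, 77)   # 230
print(f"A {baseline}-token trace would shrink to {shortest}-{longest} tokens.")
```

Under these assumptions, the reported range corresponds to keeping roughly one tenth to one quarter of the original trace.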
Abstract
Large language models (LLMs) tackle complex tasks by generating long chains of thought or "reasoning traces" that act as latent variables in the generation of an output given a query. A model's ability to generate such traces can be optimized with reinforcement learning (RL) to improve their utility in predicting an answer. This optimization comes at a high computational cost, especially for narrative-related tasks that involve retrieving and processing many tokens. To this end, we propose LiteReason, a latent reasoning method that can be interleaved with standard token sampling and easily combined with RL techniques. LiteReason employs a lightweight Reasoning Projector module, trained to produce continuous latent tokens that help the model 'skip' reasoning steps. During RL, the policy model decides when to activate the projector, switching between latent and discrete reasoning as…
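The interleaving described in the abstract can be pictured with a toy decoding loop. The sketch below is not the authors' implementation. The names `reasoning_projector`, `toy_policy`, and `generate`, the `LATENT` action, the mean-pooled stand-in for a hidden state, and the fixed number of latent vectors per activation are all assumptions made for illustration. It shows only the control flow in which a policy either emits a discrete token or activates a projector that appends continuous vectors to the context.

```python
EMBED_DIM = 4
LATENT = -1  # hypothetical action id meaning "activate the projector"

def embed(token_id):
    # Toy deterministic embedding; a real model would use a learned table.
    return [((token_id + 1) * (i + 1)) % 7 / 7.0 for i in range(EMBED_DIM)]

def mean_pool(vectors):
    # Stand-in for the model's hidden state: the average of the context vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def scaled_identity(scale):
    return [[scale if r == c else 0.0 for c in range(EMBED_DIM)]
            for r in range(EMBED_DIM)]

def reasoning_projector(hidden, weights):
    # Hypothetical projector: one small linear map per latent slot,
    # each producing one continuous "latent token" from the hidden state.
    return [[sum(w * h for w, h in zip(row, hidden)) for row in matrix]
            for matrix in weights]

def toy_policy(hidden, step):
    # Stub policy. In the paper the policy is the language model itself,
    # trained with RL and conditioned on its state; this stub ignores
    # `hidden` and goes latent on the second step only.
    return LATENT if step == 1 else step

def generate(prompt_ids, policy, max_steps, weights):
    context = [embed(t) for t in prompt_ids]  # continuous inputs to the model
    trace = []                                # record of emitted steps
    for _ in range(max_steps):
        hidden = mean_pool(context)
        action = policy(hidden, len(trace))
        if action == LATENT:
            latents = reasoning_projector(hidden, weights)
            context.extend(latents)           # continuous vectors, no sampling
            trace.append(("latent", len(latents)))
        else:
            context.append(embed(action))     # ordinary discrete token step
            trace.append(("token", action))
    return trace, context

weights = [scaled_identity(0.5 * (k + 1)) for k in range(2)]
trace, context = generate([0, 1, 2], toy_policy, max_steps=4, weights=weights)
print(trace)
```

In this toy run the policy activates the projector once, so two continuous vectors are appended where a run of sampled reasoning tokens would otherwise appear. How many discrete steps a latent step replaces in LiteReason is not stated in the text above.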
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
