Reinforced Fast Weights with Next-Sequence Prediction
Hee Seung Hwang, Xindi Wu, Sanghyuk Chun, Olga Russakovsky

TL;DR
This paper introduces REFINE, a reinforcement learning framework that enhances fast weight models for long-context tasks by optimizing sequence-level rewards, outperforming traditional next-token prediction methods.
Contribution
REFINE is a novel reinforcement learning approach that trains fast weight models with a next-sequence prediction objective, improving long-range dependency modeling.
Findings
REFINE outperforms supervised fine-tuning with NTP on multiple long-context tasks.
It is effective during various training stages, including mid-training, post-training, and test-time.
Experiments on LaCT-760M and DeltaNet-1.3B show consistent improvements.
Abstract
Fast weight architectures offer a promising alternative to attention-based transformers for long-context modeling by maintaining constant memory overhead regardless of context length. However, their potential is limited by the next-token prediction (NTP) training paradigm. NTP optimizes single-token predictions and ignores semantic coherence across multiple tokens following a prefix. Consequently, fast weight models, which dynamically update their parameters to store contextual information, learn suboptimal representations that fail to capture long-range dependencies. We introduce REFINE (Reinforced Fast weIghts with Next sEquence prediction), a reinforcement learning framework that trains fast weight models under the next-sequence prediction (NSP) objective. REFINE selects informative token positions based on prediction entropy, generates multi-token rollouts, assigns self-supervised…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
