$\textit{New News}$: System-2 Fine-tuning for Robust Integration of New Knowledge
Core Francisco Park, Zechen Zhang, Hidenori Tanaka

TL;DR
This paper introduces New News, a dataset for evaluating how well large language models can internalize new information through fine-tuning, and proposes System-2 Fine-tuning (Sys2-FT) with self-play protocols to improve knowledge integration.
Contribution
The paper presents a new dataset and a novel fine-tuning method, Sys2-FT, that enhances models' ability to internalize news information compared to naive fine-tuning.
Findings
Self-QA protocol improves in-weight learning of news.
Contextual shadowing effect: training with the news in context ahead of its rephrases or QAs degrades learning of the news.
Preliminary scaling law of Sys2-FT performance emerges.
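The Self-QA idea in the findings above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `build_self_qa_examples`, the `generate` callable, the prompt wording, and the `prompt`/`completion` record format are all hypothetical names chosen for this sketch. It assumes the model sees the news in context only while generating questions and answers, and that the stored fine-tuning examples leave the news out, in line with the contextual shadowing finding.

```python
from typing import Callable, Dict, List


def build_self_qa_examples(
    news: str,
    generate: Callable[[str], str],
    n_questions: int = 3,
) -> List[Dict[str, str]]:
    """Build question/answer fine-tuning records from one news item.

    `generate` is a hypothetical text-in, text-out wrapper around a model;
    it is an assumption of this sketch, not an API from the paper.
    """
    examples: List[Dict[str, str]] = []
    for i in range(n_questions):
        # The news is placed in context only during data generation.
        question = generate(
            f"News: {news}\nWrite question #{i + 1} whose answer depends on this news."
        ).strip()
        answer = generate(f"News: {news}\nQuestion: {question}\nAnswer:").strip()
        # Assumption: the stored record omits the news text itself, so the
        # model must learn the content in its weights rather than read it.
        examples.append({"prompt": question, "completion": answer})
    return examples
```

The resulting records could then be passed to any supervised fine-tuning pipeline that accepts prompt/completion pairs. How many questions to generate per news item is a free parameter here.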
Abstract
Humans and intelligent animals can internalize new information and accurately extract its implications to perform downstream tasks. While large language models (LLMs) can achieve this through in-context learning (ICL) when the information (news) is explicitly given as context, adequately integrating the information into model weights via fine-tuning remains challenging. In this paper, we introduce New News, a dataset composed of hypothetical yet plausible news spanning multiple domains (mathematics, coding, discoveries, leaderboards, events), accompanied by downstream evaluation questions whose correct answers critically depend on understanding and internalizing the news. First, we demonstrate a substantial gap between naive fine-tuning and in-context learning (FT-ICL gap) on our dataset. To address this gap, we explore a suite of self-play data generation protocols --…
Taxonomy
Topics: Semantic Web and Ontologies
