ScienceMeter: Tracking Scientific Knowledge Updates in Language Models

Yike Wang; Shangbin Feng; Yulia Tsvetkov; Hannaneh Hajishirzi

arXiv:2505.24302·cs.CL·July 1, 2025

TL;DR

ScienceMeter is a framework for evaluating how well language models update and maintain scientific knowledge over time, highlighting current limitations and the need for more robust update methods.

Contribution

This paper introduces ScienceMeter, a comprehensive evaluation framework for scientific knowledge updates in language models, including new metrics and a large curated dataset.

Findings

01. Best methods preserve 85.9% of existing knowledge

02. Models acquire 71.7% of new scientific claims

03. Performance on objectives is correlated across domains

Abstract

Large Language Models (LLMs) are increasingly used to support scientific research, but their knowledge of scientific advancements can quickly become outdated. We introduce ScienceMeter, a new framework for evaluating scientific knowledge update methods over scientific knowledge spanning the past, present, and future. ScienceMeter defines three metrics: knowledge preservation, the extent to which models' understanding of previously learned papers is preserved; knowledge acquisition, how well scientific claims from newly introduced papers are acquired; and knowledge projection, the ability of the updated model to anticipate or generalize to related scientific claims that may emerge in the future. Using ScienceMeter, we examine the scientific knowledge of LLMs on claim judgment and generation tasks across a curated dataset of 15,444 scientific papers and 30,888 scientific claims from ten…
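
The three metrics named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give exact formulas, so the definitions here are assumptions. Preservation is taken as the share of past-paper claims the model judged correctly before the update that it still judges correctly afterwards; acquisition and projection are taken as the share of previously misjudged present-paper and future-paper claims that become correct after the update. The names `ClaimResult`, `rate`, and `update_scores`, and the split labels `"past"`, `"present"`, `"future"`, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClaimResult:
    """Outcome of judging one scientific claim (hypothetical record layout)."""
    split: str            # assumed labels: "past", "present", or "future"
    correct_before: bool  # model judged the claim correctly before the update
    correct_after: bool   # model judges the claim correctly after the update


def rate(results: list[ClaimResult], split: str,
         correct_before: bool) -> Optional[float]:
    """Fraction of claims in `split` with the given pre-update status
    that are judged correctly after the update. None if no claim qualifies."""
    pool = [r for r in results
            if r.split == split and r.correct_before == correct_before]
    if not pool:
        return None
    return sum(r.correct_after for r in pool) / len(pool)


def update_scores(results: list[ClaimResult]) -> dict[str, Optional[float]]:
    """Assumed definitions of the three metrics, for illustration only."""
    return {
        # previously known past claims that stay correct
        "preservation": rate(results, "past", correct_before=True),
        # previously unknown claims from new papers that become correct
        "acquisition": rate(results, "present", correct_before=False),
        # previously unknown claims from later papers that become correct
        "projection": rate(results, "future", correct_before=False),
    }
```

Conditioning each metric on the pre-update judgment is one design choice among several; a simpler variant would report plain post-update accuracy on each split.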

Code & Models

Repositories

yikee/sciencemeter (PyTorch, official)

Taxonomy

Topics: Scientific Computing and Data Management