SciZoom: A Large-scale Benchmark for Hierarchical Scientific Summarization across the LLM Era

Han Jang; Junhyeok Lee; Kyu Sung Choi

arXiv:2603.16131·cs.CL·March 18, 2026

Open Access · 1 dataset

TL;DR

SciZoom is a benchmark of 44,946 papers from four ML venues (NeurIPS, ICLR, ICML, EMNLP) spanning 2020 to 2025, designed to evaluate hierarchical summarization methods and to analyze how scientific writing shifted after the widespread adoption of LLMs such as ChatGPT.

Contribution

The paper introduces a large-scale, multi-granularity scientific summarization benchmark stratified into Pre-LLM and Post-LLM eras, enabling research on both summarization and the linguistic evolution of scientific discourse.

Findings

1. Detected significant shifts in phrase patterns and rhetorical styles post-LLM adoption.

2. Provided evidence of more confident but homogenized scientific writing with LLM assistance.

3. Created a publicly available dataset for future research in scientific summarization and discourse analysis.

Abstract

The explosive growth of AI research has created unprecedented information overload, increasing the demand for scientific summarization at multiple levels of granularity beyond traditional abstracts. While LLMs are increasingly adopted for summarization, existing benchmarks remain limited in scale, target only a single granularity, and predate the LLM era. Moreover, since the release of ChatGPT in November 2022, researchers have rapidly adopted LLMs for drafting manuscripts themselves, fundamentally transforming scientific writing, yet no resource exists to analyze how this writing has evolved. To bridge these gaps, we introduce SciZoom, a benchmark comprising 44,946 papers from four top-tier ML venues (NeurIPS, ICLR, ICML, EMNLP) spanning 2020 to 2025, explicitly stratified into Pre-LLM and Post-LLM eras. SciZoom provides three hierarchical summarization targets (Abstract,…
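The Pre-LLM / Post-LLM stratification described in the abstract can be illustrated with a minimal sketch. It assumes the boundary is ChatGPT's public release date (30 November 2022) applied to a per-paper date; the benchmark's actual rule is not stated in the text above and may differ (for example, it could be based on venue year). The `Paper` record, its field names, and the sample entries are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Assumption: the era boundary is ChatGPT's public release date.
# The benchmark's real stratification rule may be different.
CHATGPT_RELEASE = date(2022, 11, 30)


@dataclass(frozen=True)
class Paper:
    # Hypothetical record layout; field names are illustrative only.
    title: str
    venue: str
    submitted: date


def era(paper: Paper, cutoff: date = CHATGPT_RELEASE) -> str:
    """Label a paper Pre-LLM if dated before the cutoff, else Post-LLM."""
    return "Pre-LLM" if paper.submitted < cutoff else "Post-LLM"


def era_counts(papers: list[Paper]) -> Counter:
    """Count how many papers fall into each era."""
    return Counter(era(p) for p in papers)


# Made-up sample records, used only to exercise the functions.
papers = [
    Paper("Example A", "NeurIPS", date(2021, 5, 28)),
    Paper("Example B", "ICLR", date(2022, 9, 28)),
    Paper("Example C", "EMNLP", date(2023, 6, 23)),
]
print(era_counts(papers))
```

Keeping the cutoff as a parameter makes it easy to test how sensitive any pre/post comparison is to the exact boundary chosen.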

Code & Models

Datasets

hanjang/SciZoom
dataset · 35 downloads

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Text Readability and Simplification