Modelling and Classifying the Components of a Literature Review

Francisco Bolaños; Angelo Salatino; Francesco Osborne; Enrico Motta

arXiv:2508.04337 · cs.CL · February 11, 2026

TL;DR

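This paper introduces a new annotation schema for classifying the rhetorical roles of sentences in scientific literature and evaluates a wide range of large language models on this task, showing that fine-tuning on high-quality data yields over 96% F1 and that augmenting the training data with LLM-generated examples improves performance, especially for smaller models.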

Contribution

The paper presents a novel annotation schema for rhetorical roles and a comprehensive benchmark for evaluating LLMs on this classification task.

Findings

01

LLMs achieve over 96% F1 when fine-tuned on high-quality data.

02

Data augmentation with LLM-generated examples improves performance, especially for smaller models.

03

Both large proprietary and open-source models perform well on the task.
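
The F1 figure in the first finding summarises per-class precision and recall. As a minimal sketch of how such a score could be computed for sentence-level rhetorical-role labels, the example below uses macro-averaged F1. This is an assumption: the summary does not say which averaging scheme the paper reports, and the label names and the `macro_f1` helper are illustrative, not taken from the paper.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1 over every label that appears in gold or pred.

    Hypothetical helper for illustration; the averaging scheme is an assumption.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative labels only (not the paper's schema or data)
gold = ["research_gap", "result", "limitation", "result"]
pred = ["research_gap", "result", "result", "result"]
score = macro_f1(gold, pred)
```

In this toy case one class is never predicted, so its F1 is 0 and drags the macro average down even though three of the four sentences are labelled correctly. That sensitivity to rare classes is one reason macro-averaging is a common choice for imbalanced label sets.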

Abstract

Previous work has demonstrated that AI methods for analysing scientific literature benefit significantly from annotating sentences in papers according to their rhetorical roles, such as research gaps, results, limitations, extensions of existing methodologies, and others. Such representations also have the potential to support the development of a new generation of systems capable of producing high-quality literature reviews. However, achieving this goal requires the definition of a relevant annotation schema and effective strategies for large-scale annotation of the literature. This paper addresses these challenges in two ways: 1) it introduces a novel, unambiguous annotation schema that is explicitly designed for reliable automatic processing, and 2) it presents a comprehensive evaluation of a wide range of large language models (LLMs) on the task of classifying rhetorical roles…
