BAGELS: Benchmarking the Automated Generation and Extraction of Limitations from Scholarly Text

Ibrahim Al Azher; Miftahul Jannat Mokarrama; Zhishuai Guo; Sagnik Ray Choudhury; Hamed Alhoori

arXiv:2505.18207·cs.DL·September 23, 2025

TL;DR

This paper introduces a comprehensive system for automatically extracting and generating research limitations from scholarly articles, aiming to improve transparency and reproducibility in scientific reporting.

Contribution

The paper builds a new dataset, proposes a retrieval-augmented generation (RAG) method, and creates an evaluation framework for extracting and generating limitations in scientific texts.

Findings

01 Created a dataset of limitations from ACL, NeurIPS, and PeerJ papers, extracted from the paper text and supplemented with external reviews.

02 Developed a RAG-based approach for generating limitations.

03 Established an evaluation framework for assessing generated limitations.

Abstract

In scientific research, "limitations" refer to the shortcomings, constraints, or weaknesses of a study. A transparent reporting of such limitations can enhance the quality and reproducibility of research and improve public trust in science. However, authors often underreport limitations in their papers and rely on hedging strategies to meet editorial requirements at the expense of readers' clarity and confidence. This tendency, combined with the surge in scientific publications, has created a pressing need for automated approaches to extract and generate limitations from scholarly papers. To address this need, we present a full architecture for computational analysis of research limitations. Specifically, we (1) create a dataset of limitations from ACL, NeurIPS, and PeerJ papers by extracting them from the text and supplementing them with external reviews; (2) we propose methods to…
