Designing large language model prompts to extract scores from messy text: A shared dataset and challenge

Mike Thelwall

arXiv:2601.18271 · cs.DL · January 27, 2026

TL;DR

This paper introduces a shared dataset of 1446 short, messy texts describing research quality scores on the UK 1* to 4* scale, and challenges the community to design LLM prompts that extract these scores more accurately than the 72.6% baseline.

Contribution

It provides a novel dataset and challenge framework for prompt design to extract structured scores from unstructured, messy texts using LLMs.

Findings

01 Baseline accuracy of 72.6% for score extraction

02 Dataset includes 1446 texts with varying score formats

03 Challenge aims to improve prompt-based extraction methods

Abstract

In some areas of computing, natural language processing and information science, progress is made by sharing datasets and challenging the community to design the best algorithm for an associated task. This article introduces a shared dataset of 1446 short texts, each of which describes a research quality score on the UK scale of 1* to 4*. This is a messy collection, with some texts not containing scores and others including invalid scores or strange formats. With this dataset there is also a description of what constitutes a valid score and a "gold standard" of the correct scores for these texts (including missing values). The challenge is to design a prompt for Large Language Models (LLMs) to extract the scores from these texts as accurately as possible. The format for the response should be a number and no other text so there are two aspects to the challenge: ensuring that the LLM…
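The evaluation implied by the abstract (a number-only response compared against a gold standard that includes missing values) might be sketched as follows. This is a minimal illustration, not the paper's scoring procedure: the helper names `parse_response` and `accuracy`, the rule that only a bare number counts as a parsed score, and the use of `None` for a missing value are all assumptions, since the dataset's actual validity rules are not reproduced in this excerpt.

```python
import re
from typing import Optional, Sequence

# Assumed rule (not from the paper): a response is usable only if it is
# a bare number such as "3" or "2.5", with no surrounding text.
_BARE_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_response(response: str) -> Optional[float]:
    """Return the number if the response is a bare number, else None."""
    text = response.strip()
    if _BARE_NUMBER.fullmatch(text):
        return float(text)
    return None  # extra text, or no number at all, is treated as missing


def accuracy(responses: Sequence[str], gold: Sequence[Optional[float]]) -> float:
    """Fraction of responses whose parsed value equals the gold value.

    A gold value of None (hypothetical encoding of "no valid score")
    matches only a response that also parses to None.
    """
    if len(responses) != len(gold):
        raise ValueError("responses and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(parse_response(r) == g for r, g in zip(responses, gold))
    return correct / len(gold)


if __name__ == "__main__":
    # Toy data invented for illustration; not taken from the shared dataset.
    llm_outputs = ["3", "The score is 4", "2.5", "n/a"]
    gold_scores = [3, 4, 2.5, None]
    print(accuracy(llm_outputs, gold_scores))
```

Under these assumptions, a response such as "The score is 4" counts as wrong even though the number in it is right, which reflects the two aspects of the challenge: getting the LLM to answer in the required format and getting it to identify the correct score.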

Taxonomy

Topics: Computational and Text Analysis Methods · Text Readability and Simplification · Topic Modeling