Strategies for Span Labeling with Large Language Models

Danil Semin; Ondřej Dušek; Zdeněk Kasner

arXiv:2601.16946 · cs.CL · January 26, 2026

Open Access

TL;DR

This paper compares span labeling strategies for large language models, introduces LogitMatch, a constrained decoding method that improves content matching, and evaluates all methods across four tasks.

Contribution

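The paper categorizes span labeling strategies into three families (tagging the input text, indexing numerical span positions, and matching span content), proposes LogitMatch, a constrained decoding method that aligns the model's output with valid input spans, and evaluates all methods across four tasks.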
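The three strategy families named in the abstract (tagging, indexing, and content matching) differ mainly in how span positions are recovered from the model's output. The sketch below is a minimal illustration of that difference, not the paper's prompts or parsers: the `<span>` tag format, the (start, end) character-offset pairs, and the helper names `parse_tagged`, `parse_indexed`, and `parse_matched` are all assumptions made for this example.

```python
import re

def parse_tagged(tagged, tag="span"):
    """Tagging: the model repeats the input with inline tags (assumed format).
    Returns (start, end) character offsets in the untagged text."""
    spans, plain_len, pos, start = [], 0, 0, None
    for m in re.finditer(rf"</?{tag}>", tagged):
        plain_len += m.start() - pos  # count only non-tag characters
        pos = m.end()
        if m.group().startswith("</"):
            if start is not None:
                spans.append((start, plain_len))
                start = None
        else:
            start = plain_len
    return spans

def parse_indexed(pairs):
    """Indexing: the model emits numeric positions directly (assumed format)."""
    return [(int(s), int(e)) for s, e in pairs]

def parse_matched(text, quotes):
    """Matching: the model quotes span content, which is searched in the input.
    A quote that does not occur in the input yields None."""
    spans = []
    for q in quotes:
        i = text.find(q)  # first occurrence only; repeats are ambiguous
        spans.append((i, i + len(q)) if i >= 0 else None)
    return spans

text = "Alice visited Paris in May."
print(parse_tagged("<span>Alice</span> visited <span>Paris</span> in May."))
print(parse_indexed([(0, 5), (14, 19)]))
print(parse_matched(text, ["Alice", "Paris", "Pariss"]))
```

The last call shows the kind of failure content matching is prone to: a quoted string that is slightly off ("Pariss") cannot be located in the input, and a string that occurs several times cannot be located unambiguously.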

Findings

01

Tagging is a robust baseline.

02

LogitMatch improves on competitive matching-based methods by eliminating span matching issues.

03

LogitMatch outperforms other methods in some setups.

Abstract

Large language models (LLMs) are increasingly used for text analysis tasks, such as named entity recognition or error detection. Unlike encoder-based models, however, generative architectures lack an explicit mechanism to refer to specific parts of their input. This leads to a variety of ad-hoc prompting strategies for span labeling, often with inconsistent results. In this paper, we categorize these strategies into three families: tagging the input text, indexing numerical positions of spans, and matching span content. To address the limitations of content matching, we introduce LogitMatch, a new constrained decoding method that forces the model's output to align with valid input spans. We evaluate all methods across four diverse tasks. We find that while tagging remains a robust baseline, LogitMatch improves upon competitive matching-based methods by eliminating span matching issues…
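To make the idea of forcing output to align with valid input spans more concrete, here is a toy sketch of generic constrained decoding over a token list. It is not the paper's LogitMatch algorithm, whose details are not given in this summary. The helpers `allowed_next_tokens` and `mask_logits`, the word-level tokens, and the dictionary of logits are all hypothetical choices made for illustration.

```python
import math

def allowed_next_tokens(input_tokens, generated):
    """Tokens that keep `generated` a contiguous run of `input_tokens`.
    (Hypothetical helper; a real decoder would also allow an end-of-span token.)"""
    n = len(generated)
    allowed = set()
    for i in range(len(input_tokens) - n):
        if input_tokens[i:i + n] == generated:
            allowed.add(input_tokens[i + n])
    return allowed

def mask_logits(logits, allowed):
    """Set the score of every disallowed token to -inf so it cannot be chosen."""
    return {tok: (score if tok in allowed else -math.inf)
            for tok, score in logits.items()}

input_tokens = ["the", "cat", "sat", "on", "the", "mat"]
generated = ["the"]  # span content produced so far

allowed = allowed_next_tokens(input_tokens, generated)
# Made-up scores: the unconstrained model would prefer "dog",
# which never follows "the" in the input.
logits = {"cat": 1.0, "dog": 3.0, "mat": 0.5}
masked = mask_logits(logits, allowed)
best = max(masked, key=masked.get)
print(sorted(allowed), best)
```

Under this masking, the decoder can only continue with a token that extends some span actually present in the input, so a quoted span can always be located afterwards.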

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Authorship Attribution and Profiling