Small Language Model Makes an Effective Long Text Extractor

Yelin Chen; Fanjin Zhang; Jie Tang

arXiv:2502.07286 · cs.CL · February 12, 2025

TL;DR

This paper presents SeNER, a lightweight span-based NER method that extracts long entity spans from extended texts using bidirectional arrow attention with LogN-Scaling, achieving state-of-the-art accuracy on three long NER datasets while remaining GPU-memory efficient.

Contribution

Introduces SeNER, a novel span-based NER approach that combines bidirectional arrow attention with LogN-Scaling to reduce redundant computation and improve entity extraction from long texts.

Findings

01. Achieves state-of-the-art accuracy on three long NER datasets.

02. Extracts entities from long texts with efficient use of GPU memory.

03. Outperforms existing span-based and generation-based methods.

Abstract

Named Entity Recognition (NER) is a fundamental problem in natural language processing (NLP). However, the task of extracting longer entity spans (e.g., awards) from extended texts (e.g., homepages) is barely explored. Current NER methods predominantly fall into two categories: span-based methods and generation-based methods. Span-based methods require the enumeration of all possible token-pair spans, followed by classification on each span, resulting in substantial redundant computations and excessive GPU memory usage. In contrast, generation-based methods involve prompting or fine-tuning large language models (LLMs) to adapt to downstream NER tasks. However, these methods struggle with the accurate generation of longer spans and often incur significant time costs for effective fine-tuning. To address these challenges, this paper introduces a lightweight span-based NER method called…
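To make the redundancy of span-based methods concrete, the following minimal sketch enumerates every token-pair span of a sequence and counts them. It is an illustration of the general span-enumeration idea described in the abstract, not SeNER's implementation; the function names `enumerate_spans` and `span_count`, the optional `max_len` cap, and the example tokens are all hypothetical.

```python
def enumerate_spans(tokens, max_len=None):
    """Yield (start, end) index pairs (end inclusive) for every candidate span.

    max_len is a hypothetical cap on span length; None means no cap.
    """
    n = len(tokens)
    for start in range(n):
        # Exclusive upper bound for the end index of spans starting here.
        last = n if max_len is None else min(n, start + max_len)
        for end in range(start, last):
            yield start, end


def span_count(n):
    """Number of uncapped token-pair spans in a sequence of n tokens."""
    return n * (n + 1) // 2


# Made-up example text, used only to exercise the functions.
tokens = "Best Paper Award at an example conference".split()
spans = list(enumerate_spans(tokens))
assert len(spans) == span_count(len(tokens))

# The candidate count grows quadratically with sequence length.
for n in (128, 1024, 4096):
    print(n, span_count(n))
```

Because a sequence of n tokens yields n(n+1)/2 candidate spans, a classifier that scores every span does quadratically more work as texts get longer, which is the cost the abstract attributes to span-based methods.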

Code & Models

Repositories

thudm/scholar-profiling (PyTorch, official)

Videos

Small Language Model Makes an Effective Long Text Extractor · underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Softmax · Attention Is All You Need