Calibrated Seq2seq Models for Efficient and Generalizable Ultra-fine Entity Typing

Yanlin Feng; Adithya Pratapa; David R Mortensen

arXiv:2311.00835 · cs.CL · November 3, 2023 · 1 citation

TL;DR

This paper introduces CASENT, a calibrated seq2seq model for ultra-fine entity typing that improves accuracy, calibration, and efficiency, and demonstrates strong zero-shot and few-shot generalization capabilities across diverse datasets.

Contribution

The paper presents CASENT, a novel seq2seq approach with calibration for ultra-fine entity typing, outperforming prior methods in accuracy, speed, and generalization.

Findings

01. Outperforms the previous state of the art in F1 score and calibration error.

02. Achieves an inference speedup of over 50 times.

03. Excels in zero-shot and few-shot settings, surpassing larger language models.

Abstract

Ultra-fine entity typing plays a crucial role in information extraction by predicting fine-grained semantic types for entity mentions in text. However, this task poses significant challenges due to the massive number of entity types in the output space. The current state-of-the-art approaches, based on standard multi-label classifiers or cross-encoder models, suffer from poor generalization performance or inefficient inference. In this paper, we present CASENT, a seq2seq model designed for ultra-fine entity typing that predicts ultra-fine types with calibrated confidence scores. Our model takes an entity mention as input and employs constrained beam search to generate multiple types autoregressively. The raw sequence probabilities associated with the predicted types are then transformed into confidence scores using a novel calibration method. We conduct extensive experiments on the UFET…
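The pipeline the abstract describes (generate candidate types with constrained beam search, then turn raw sequence probabilities into confidence scores) can be illustrated with a small sketch. Everything here is an assumption made for illustration: `toy_decoder` is a hypothetical stand-in for a real seq2seq decoder, the four-type vocabulary is invented, and the calibration step uses plain Platt scaling with made-up parameters `a` and `b` as a familiar substitute, since the abstract does not spell out the paper's own calibration method.

```python
import math

END = "<end>"


def build_trie(type_names):
    """Store each valid type (a sequence of word tokens) in a nested-dict trie."""
    trie = {}
    for name in type_names:
        node = trie
        for token in name.split() + [END]:
            node = node.setdefault(token, {})
    return trie


def constrained_beam_search(next_log_probs, trie, beam_size):
    """Return up to beam_size (type_name, sequence_log_prob) pairs, best first.

    next_log_probs(prefix) stands in for one decoder step: it returns a dict
    mapping token -> log-probability given the tokens generated so far.
    """
    beams = [((), 0.0, trie)]  # (prefix tokens, cumulative log-prob, trie node)
    finished = []
    while beams:
        candidates = []
        for prefix, score, node in beams:
            step = next_log_probs(prefix)
            # Only continuations present in the trie are ever expanded,
            # so every finished sequence is a valid type name.
            for token, child in node.items():
                new_score = score + step.get(token, -math.inf)
                if token == END:
                    finished.append((" ".join(prefix), new_score))
                else:
                    candidates.append((prefix + (token,), new_score, child))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.sort(key=lambda f: f[1], reverse=True)
    return finished[:beam_size]


def calibrate(log_prob, a=2.0, b=2.5):
    """Platt scaling (a stand-in, not the paper's method): sigmoid(a * log_prob + b).

    a and b are hypothetical; in practice they would be fit on held-out data.
    """
    return 1.0 / (1.0 + math.exp(-(a * log_prob + b)))


def toy_decoder(prefix):
    """Hypothetical next-token distribution for a single entity mention."""
    table = {
        (): {"person": 0.5, "political": 0.4, "city": 0.1},
        ("person",): {END: 1.0},
        ("political",): {"leader": 0.8, "party": 0.2},
        ("political", "leader"): {END: 1.0},
        ("political", "party"): {END: 1.0},
        ("city",): {END: 1.0},
    }
    return {tok: math.log(p) for tok, p in table[prefix].items()}


types = ["person", "political leader", "political party", "city"]
results = constrained_beam_search(toy_decoder, build_trie(types), beam_size=3)
scored = [(name, calibrate(lp)) for name, lp in results]
predicted = [name for name, conf in scored if conf >= 0.5]
print(scored)
print(predicted)
```

The trie keeps generation inside the type vocabulary, so no decoding budget is spent on strings that are not valid types. The threshold of 0.5 on the calibrated score is likewise an illustrative choice, not a value taken from the paper.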

Code & Models

Repositories

yanlinf/casent (PyTorch, official)

Models

🤗 yanlinf/casent-large (model · 3 downloads · 3 likes)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence