Calibrated Seq2seq Models for Efficient and Generalizable Ultra-fine Entity Typing
Yanlin Feng, Adithya Pratapa, David R Mortensen

TL;DR
This paper introduces CASENT, a calibrated seq2seq model for ultra-fine entity typing that improves accuracy, calibration, and efficiency, and demonstrates strong zero-shot and few-shot generalization capabilities across diverse datasets.
Contribution
The paper presents CASENT, a novel seq2seq approach with calibration for ultra-fine entity typing, outperforming prior methods in accuracy, speed, and generalization.
Findings
Outperforms the previous state of the art in both F1 score and calibration error.
Achieves an inference speedup of over 50 times.
Excels in zero-shot and few-shot settings, surpassing larger language models.
Abstract
Ultra-fine entity typing plays a crucial role in information extraction by predicting fine-grained semantic types for entity mentions in text. However, this task poses significant challenges due to the massive number of entity types in the output space. The current state-of-the-art approaches, based on standard multi-label classifiers or cross-encoder models, suffer from poor generalization performance or inefficient inference. In this paper, we present CASENT, a seq2seq model designed for ultra-fine entity typing that predicts ultra-fine types with calibrated confidence scores. Our model takes an entity mention as input and employs constrained beam search to generate multiple types autoregressively. The raw sequence probabilities associated with the predicted types are then transformed into confidence scores using a novel calibration method. We conduct extensive experiments on the UFET…
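The pipeline the abstract describes (constrained beam search over a fixed type vocabulary, then mapping raw sequence probabilities to confidence scores) can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' implementation: the toy scoring function `toy_step_logprob` stands in for a real seq2seq decoder, the type list is made up, and plain Platt-style scaling (a sigmoid over the sequence log-probability) is swapped in for the paper's calibration method, which is described as novel and is not reproduced here.

```python
import math

def build_trie(type_names):
    """Build a nested-dict trie over the token sequences of allowed type names."""
    trie = {}
    for name in type_names:
        node = trie
        for tok in name.split() + ["<eos>"]:
            node = node.setdefault(tok, {})
    return trie

def constrained_beam_search(step_logprob, trie, beam_size=3, max_len=8):
    """Return up to beam_size (type_name, sequence_logprob) pairs.

    Only tokens that continue a valid type name in the trie are expanded,
    so every finished hypothesis is guaranteed to be an allowed type.
    step_logprob(prefix, token) is a hypothetical stand-in for a decoder.
    """
    beams = [((), 0.0, trie)]  # (token prefix, cumulative log-prob, trie node)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score, node in beams:
            for tok, child in node.items():
                new_score = score + step_logprob(prefix, tok)
                if tok == "<eos>":
                    finished.append((" ".join(prefix), new_score))
                else:
                    candidates.append((prefix + (tok,), new_score, child))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]  # prune to the best partial hypotheses
    finished.sort(key=lambda f: f[1], reverse=True)
    return finished[:beam_size]

def platt_confidence(logprob, a=1.0, b=0.0):
    """Platt-style scaling (an assumed stand-in): sigmoid(a * logprob + b)."""
    return 1.0 / (1.0 + math.exp(-(a * logprob + b)))

def toy_step_logprob(prefix, token):
    """Toy decoder: fixed per-token probabilities, ignoring the prefix."""
    probs = {"person": 0.6, "musician": 0.5, "<eos>": 0.9}
    return math.log(probs.get(token, 0.1))

# Hypothetical type vocabulary for the example.
types = ["person", "musician", "jazz musician", "location"]
trie = build_trie(types)

hyps = constrained_beam_search(toy_step_logprob, trie, beam_size=3)
# a and b are arbitrary here; in practice they would be fit on held-out data.
predictions = [(t, platt_confidence(lp, a=2.0, b=3.0)) for t, lp in hyps]
selected = [t for t, conf in predictions if conf >= 0.5]
```

In this toy run the beam keeps three hypotheses, and a confidence threshold of 0.5 decides which of them are emitted as predicted types. The threshold and the scaling parameters `a` and `b` are placeholders chosen only to make the example concrete.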
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
