Recall, Expand and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing

Chengyue Jiang; Wenyang Hui; Yong Jiang; Xiaobin Wang; Pengjun Xie; Kewei Tu

arXiv:2212.09125·cs.CL·December 20, 2022

TL;DR

This paper introduces MCCE, a fast and accurate method for ultra-fine entity typing that significantly reduces inference time by using a recall-expand-filter approach with a novel multi-candidate encoding model, achieving state-of-the-art results.

Contribution

The paper proposes MCCE, a multi-candidate encoding model that enables single-pass scoring of top candidate types, improving speed and accuracy in ultra-fine entity typing.

Findings

01

MCCE achieves state-of-the-art performance on UFET.

02

MCCE is thousands of times faster than traditional cross-encoder methods.

03

MCCE is also effective in both fine-grained and coarse-grained entity typing.

Abstract

Ultra-fine entity typing (UFET) predicts extremely free-formed types (e.g., president, politician) of a given entity mention (e.g., Joe Biden) in context. State-of-the-art (SOTA) methods use the cross-encoder (CE) based architecture. CE concatenates the mention (and its context) with each type and feeds the pairs into a pretrained language model (PLM) to score their relevance. It brings deeper interaction between mention and types to reach better performance but has to perform N (type set size) forward passes to infer types of a single mention. CE is therefore very slow in inference when the type set is large (e.g., N = 10k for UFET). To this end, we propose to perform entity typing in a recall-expand-filter manner. The recall and expand stages prune the large type set and generate K (K is typically less than 256) most relevant type candidates for each mention. At the filter stage, we…
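The recall-expand-filter flow described above can be illustrated with a minimal sketch. Everything in it is a toy assumption rather than the paper's implementation: the function names, the word-overlap recall scorer, the related-type lookup used for the expand stage, and the hard-coded joint scores are all hypothetical. The only points taken from the text are that a cross-encoder needs N forward passes per mention, that the recall and expand stages produce K candidates, and that the multi-candidate model scores those candidates in a single pass.

```python
from typing import Callable, Dict, List

Scorer = Callable[[str, str], float]
JointScorer = Callable[[str, List[str]], List[float]]


def cross_encoder_passes(num_types: int) -> int:
    """A cross-encoder scores one (mention, type) pair per forward pass."""
    return num_types


def recall(mention: str, type_set: List[str], cheap_score: Scorer, k_recall: int) -> List[str]:
    """Keep the k_recall types ranked highest by a cheap scorer (assumed)."""
    ranked = sorted(type_set, key=lambda t: cheap_score(mention, t), reverse=True)
    return ranked[:k_recall]


def expand(candidates: List[str], related: Dict[str, List[str]], k: int) -> List[str]:
    """Add related types (hypothetical lookup) and cap the list at K."""
    out = list(candidates)
    for t in candidates:
        for r in related.get(t, []):
            if r not in out:
                out.append(r)
    return out[:k]


def filter_stage(mention: str, candidates: List[str],
                 joint_score: JointScorer, threshold: float) -> List[str]:
    """Score all K candidates in one call, standing in for one forward pass."""
    scores = joint_score(mention, candidates)
    return [t for t, s in zip(candidates, scores) if s >= threshold]


# Toy data, invented for illustration only.
TYPE_SET = ["president", "politician", "person", "city", "company", "leader"]
RELATED = {"president": ["leader", "politician"], "politician": ["person"]}
TOY_JOINT = {"president": 0.9, "politician": 0.8, "leader": 0.7, "person": 0.95}
forward_passes = {"filter": 0}


def cheap_score(mention: str, type_name: str) -> float:
    # Assumed recall scorer: 1.0 if the type word occurs in the context.
    return 1.0 if type_name in mention.lower().split() else 0.0


def joint_score(mention: str, candidates: List[str]) -> List[float]:
    # Counts calls so the single-pass property is visible.
    forward_passes["filter"] += 1
    return [TOY_JOINT.get(t, 0.0) for t in candidates]


mention = "Joe Biden was elected president"
recalled = recall(mention, TYPE_SET, cheap_score, k_recall=2)
candidates = expand(recalled, RELATED, k=4)
predicted = filter_stage(mention, candidates, joint_score, threshold=0.75)

print(predicted)
print("filter passes:", forward_passes["filter"],
      "vs cross-encoder passes for N = 10k:", cross_encoder_passes(10_000))
```

In this sketch the filter stage is called once per mention regardless of K, whereas the cross-encoder count grows linearly with the type set size N. The sketch says nothing about accuracy or wall-clock speed.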

Code & Models

Repositories

modelscope/adaseq
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning