A Comprehensive Comparison of Word Embeddings in Event & Entity Coreference Resolution

Judicael Poumay; Ashwin Ittoo

arXiv:2110.05115·cs.CL·October 12, 2021

TL;DR

This study compares various word embeddings for Event and Entity Coreference Resolution, revealing trade-offs between size and performance, and identifying which embeddings perform best in different scenarios.

Contribution

It provides a comprehensive comparison of multiple embeddings within and across families for coreference tasks, highlighting performance trade-offs and identifying top performers.

Findings

01

Diminishing returns in performance with increasing embedding size.

02

Larger models learn faster but are slower at test time.

03

ELMo performs best overall, while GloVe and FastText excel in specific tasks.

Abstract

Coreference Resolution is an important NLP task and most state-of-the-art methods rely on word embeddings for word representation. However, one issue that has been largely overlooked in literature is that of comparing the performance of different embeddings across and within families in this task. Therefore, we frame our study in the context of Event and Entity Coreference Resolution (EvCR & EnCR), and address two questions: 1) Is there a trade-off between performance (predictive & run-time) and embedding size? 2) How do the embeddings' performance compare within and across families? Our experiments reveal several interesting findings. First, we observe diminishing returns in performance with respect to embedding size. E.g. a model using solely a character embedding achieves 86% of the performance of the largest model (Elmo, GloVe, Character) while being 1.2% of its size. Second, the…
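
The size/performance comparison quoted in the abstract comes down to two ratios: the small model's score divided by the largest model's score, and the small model's size divided by the largest model's size. A minimal sketch of that arithmetic follows. The function name `relative_tradeoff` and all four input values are hypothetical placeholders chosen only to reproduce the quoted ratios; they are not scores or sizes reported in the paper, and the paper's exact definition of "size" is not given in this excerpt.

```python
def relative_tradeoff(small_score, small_size, large_score, large_size):
    """Return (fraction of performance retained, fraction of size used).

    Both fractions are measured against the larger reference model.
    """
    if large_score <= 0 or large_size <= 0:
        raise ValueError("reference score and size must be positive")
    return small_score / large_score, small_size / large_size


# Placeholder inputs, NOT results from the paper: any units work as long as
# both models are measured the same way (e.g. the same metric, the same
# notion of embedding size).
perf_ratio, size_ratio = relative_tradeoff(
    small_score=43.0, small_size=12,
    large_score=50.0, large_size=1000,
)
print(f"{perf_ratio:.0%} of the performance at {size_ratio:.1%} of the size")
```

Reporting both ratios side by side is what makes diminishing returns visible: a large drop in size paired with a small drop in score.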

Code & Models

Repositories

judicaelpoumay/event_entity_coref_ecb_plus
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques

Methods: Test · Sigmoid Activation · Tanh Activation · fastText · Long Short-Term Memory · GloVe Embeddings · Softmax · Bidirectional LSTM · ELMo