End-to-end Neural Coreference Resolution Revisited: A Simple yet Effective Baseline
Tuan Manh Lai, Trung Bui, Doo Soon Kim

TL;DR
This paper presents a simplified yet highly effective neural coreference resolution baseline that outperforms recent complex models on the OntoNotes benchmark, emphasizing the value of simplicity in model design.
Contribution
The authors introduce a streamlined version of the original end-to-end neural coreference resolution model, built on pre-trained Transformer language models, that outperforms recent extended models on English OntoNotes without their added complexity.
Findings
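The model outperforms all recent extended models on the public English OntoNotes benchmark.
It does so despite being a simplified version of the original end-to-end model.
Complex extensions do not always lead to better results, so added complexity should be carefully justified.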
Abstract
Since the first end-to-end neural coreference resolution model was introduced, many extensions to the model have been proposed, ranging from using higher-order inference to directly optimizing evaluation metrics using reinforcement learning. Despite improving the coreference resolution performance by a large margin, these extensions add substantial extra complexity to the original model. Motivated by this observation and the recent advances in pre-trained Transformer language models, we propose a simple yet effective baseline for coreference resolution. Even though our model is a simplified version of the original neural coreference resolution model, it achieves impressive performance, outperforming all recent extended works on the public English OntoNotes benchmark. Our work provides evidence for the necessity of carefully justifying the complexity of existing or newly proposed models,…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Layer Normalization · Byte Pair Encoding · Dropout · Label Smoothing
