CorPipe at CRAC 2024: Predicting Zero Mentions from Raw Text
Milan Straka

TL;DR
CorPipe 24, the winning entry to the CRAC 2024 Shared Task on Multilingual Coreference Resolution, predicts the empty nodes needed for zero mentions itself rather than taking them as input, so coreference can be resolved directly from raw text.
Contribution
The paper presents CorPipe 24, a system that predicts empty nodes together with coreference mentions and links from raw text, either in two stages (empty nodes first, then coreference) or jointly with a single pretrained encoder.
Findings
CorPipe 24 surpasses the other shared-task participants by 3.9 and 2.8 percentage points in its two settings, respectively.
Two model variants are evaluated: a two-stage approach and a single-stage approach.
The approach enables coreference resolution directly from raw text.
Abstract
We present CorPipe 24, the winning entry to the CRAC 2024 Shared Task on Multilingual Coreference Resolution. In this third iteration of the shared task, a novel objective is to also predict the empty nodes needed for zero coreference mentions (in previous years, the empty nodes were given on input). This way, coreference resolution can be performed on raw text. We evaluate two model variants: a two-stage approach (where the empty nodes are predicted first using a pretrained encoder model and then processed together with sentence words by another pretrained model) and a single-stage approach (where a single pretrained encoder model generates empty nodes, coreference mentions, and coreference links jointly). In both settings, CorPipe surpasses other participants by a large margin of 3.9 and 2.8 percentage points, respectively. The source code and the trained model are available at…
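The difference between the two variants is mostly one of data flow, which a toy sketch can make concrete. The code below is not the CorPipe implementation: the `Node` type, the `insert_empty_nodes` helper, the convention of placing an empty node after a word index, and all three `toy_*` predictors are hypothetical stand-ins for the pretrained encoders described in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Node:
    form: str
    is_empty: bool = False  # True for a predicted empty node (zero mention)


# A mention is an inclusive (start, end) span over the node list;
# a cluster is a list of coreferent mentions.
Mention = Tuple[int, int]
Clusters = List[List[Mention]]


def insert_empty_nodes(words: List[str], after: List[int]) -> List[Node]:
    """Place one empty node after each word index listed in `after`."""
    slots = set(after)
    nodes: List[Node] = []
    for i, word in enumerate(words):
        nodes.append(Node(word))
        if i in slots:
            nodes.append(Node("_", is_empty=True))
    return nodes


def two_stage(
    words: List[str],
    predict_empty: Callable[[List[str]], List[int]],
    resolve: Callable[[List[Node]], Clusters],
) -> Tuple[List[Node], Clusters]:
    """Stage 1 predicts empty nodes; stage 2 resolves coreference over words + empty nodes."""
    nodes = insert_empty_nodes(words, predict_empty(words))
    return nodes, resolve(nodes)


def single_stage(
    words: List[str],
    joint_model: Callable[[List[str]], Tuple[List[int], Clusters]],
) -> Tuple[List[Node], Clusters]:
    """One model emits empty-node positions and clusters (indexed into the final node list)."""
    after, clusters = joint_model(words)
    return insert_empty_nodes(words, after), clusters


# Hypothetical rule-based stand-ins for the pretrained models (assumptions, not real predictors).
def toy_predict_empty(words: List[str]) -> List[int]:
    return [2]  # pretend a dropped subject follows the first sentence


def toy_resolve(nodes: List[Node]) -> Clusters:
    # Link the first word with every empty node.
    return [[(0, 0)] + [(i, i) for i, n in enumerate(nodes) if n.is_empty]]


def toy_joint(words: List[str]) -> Tuple[List[int], Clusters]:
    return [2], [[(0, 0), (3, 3)]]


# Spanish pro-drop example: "Maria arrived. (She) was tired."
WORDS = ["Maria", "llegó", ".", "Estaba", "cansada", "."]
nodes_two, clusters_two = two_stage(WORDS, toy_predict_empty, toy_resolve)
nodes_one, clusters_one = single_stage(WORDS, toy_joint)
```

In this toy setup both paths yield the same node list and clusters; the point is only that the two-stage path passes an intermediate node list between two models, while the single-stage path obtains everything from one call.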
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques
