Do Lexical and Contextual Coreference Resolution Systems Degrade Differently under Mention Noise? An Empirical Study on Scientific Software Mentions

Atilla Kaan Alkan; Felix Grezes; Jennifer Lynn Bartlett; Anna Kelbert; Kelly Lockhart; Alberto Accomazzi

arXiv:2604.02171·cs.CL·April 3, 2026

TL;DR

This study compares lexical and contextual coreference resolution methods on scientific software mentions, revealing their different degradation patterns under noise and their scalability implications.

Contribution

It provides an empirical comparison of two approaches, highlighting their strengths, weaknesses, and scalability for software mention coreference resolution.

Findings

01 CAR outperforms FM by 1 point on test set

02 CAR degrades less under boundary noise

03 FM scales superlinearly with corpus size

Abstract

We present our participation in the SOMD 2026 shared task on cross-document software mention coreference resolution, where our systems ranked second across all three subtasks. We compare two fine-tuning-free approaches: Fuzzy Matching (FM), a lexical string-similarity method, and Context Aware Representations (CAR), which combines mention-level and document-level embeddings. Both achieve competitive performance across all subtasks (CoNLL F1 of 0.94-0.96), with CAR consistently outperforming FM by 1 point on the official test set. The narrow margin is consistent with the high surface regularity of software names, which reduces the need for complex semantic reasoning. A controlled noise-injection study reveals complementary failure modes. As boundary noise increases, CAR loses only 0.07 F1 points from clean to fully corrupted input, compared to 0.20 for FM, whereas under mention substitution, FM degrades more…
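To make the lexical side of this comparison concrete, here is a minimal sketch of string-similarity clustering and of boundary noise. It is not the authors' FM system: the use of Python's `difflib.SequenceMatcher`, the greedy first-match clustering, the `0.8` threshold, and the helper names `similarity`, `cluster_mentions`, and `add_boundary_noise` are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] (standard-library matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_mentions(mentions: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Greedy clustering (an assumed strategy, not the paper's).

    Each mention joins the first cluster whose representative (its first
    member) is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters: list[list[str]] = []
    for mention in mentions:
        for cluster in clusters:
            if similarity(mention, cluster[0]) >= threshold:
                cluster.append(mention)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([mention])
    return clusters


def add_boundary_noise(mention: str, left: str = "", right: str = "") -> str:
    """Simulate a span-boundary error by attaching neighbouring context words."""
    return " ".join(part for part in (left, mention, right) if part)


if __name__ == "__main__":
    print(cluster_mentions(["NumPy", "numpy", "SciPy", "Matplotlib"]))
    noisy = add_boundary_noise("NumPy", left="using the", right="library")
    print(noisy, similarity(noisy, "NumPy"))
```

In this toy setup, a span that swallows surrounding words scores far lower against the clean name than a case variant does, which illustrates why a purely lexical matcher can be sensitive to boundary errors.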
