Diagnosing Representation Dynamics in NER Model Extension
Xirui Zhang, Philippe de La Chevasnerie, Benoit Fabre (papernest)

TL;DR
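This paper investigates how a NER model can be extended to new PII entity types (EMAIL, PHONE) with minimal degradation on its original classes (PER, LOC, ORG). It points to two mechanisms: largely independent semantic and pattern-based features, and the need to keep the 'O' tag classifier trainable.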
Contribution
The paper provides a mechanistic diagnosis of NER model extension, identifying representation overlap between existing and new entity types and the unfreezing of the 'O' tag classifier as key factors in successful extension.
Findings
The LOC entity is vulnerable because it shares pattern-like features (e.g., postal codes) with the new PII entities
Unfreezing the 'O' classifier lets the model learn the new PII patterns
Representation overlap causes semantic drift in certain entities
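The second finding, unfreezing the 'O' classifier, can be illustrated with a minimal sketch of selective freezing in a classification head. This is not the authors' implementation: the label order, the helper names `trainable_row_mask` and `masked_sgd_step`, and the plain SGD update are all assumptions made for illustration. The idea shown is only that a frozen 'O' row cannot move away from whatever it learned before, while an unfrozen one can.

```python
from typing import List, Set

# Hypothetical label inventory: the original classes plus the new PII classes.
LABELS = ["O", "PER", "LOC", "ORG", "EMAIL", "PHONE"]


def trainable_row_mask(labels: List[str], frozen: Set[str]) -> List[float]:
    """Return 1.0 for classifier rows that may be updated and 0.0 for frozen rows."""
    return [0.0 if label in frozen else 1.0 for label in labels]


def masked_sgd_step(
    weights: List[List[float]],
    grads: List[List[float]],
    mask: List[float],
    lr: float = 0.1,
) -> List[List[float]]:
    """Apply one SGD step, scaling each row's update by its mask entry."""
    return [
        [w - lr * m * g for w, g in zip(w_row, g_row)]
        for w_row, g_row, m in zip(weights, grads, mask)
    ]


# Toy head: one weight row per label, two features per row (illustrative values).
weights = [[0.0, 0.0] for _ in LABELS]
grads = [[1.0, 1.0] for _ in LABELS]

# Setting A: every original row, including 'O', is frozen.
mask_frozen_o = trainable_row_mask(LABELS, {"O", "PER", "LOC", "ORG"})
# Setting B: 'O' is unfrozen, the original entity rows stay frozen.
mask_unfrozen_o = trainable_row_mask(LABELS, {"PER", "LOC", "ORG"})

step_a = masked_sgd_step(weights, grads, mask_frozen_o)
step_b = masked_sgd_step(weights, grads, mask_unfrozen_o)
```

In setting A the 'O' row is unchanged after the step, while in setting B it moves along with the EMAIL and PHONE rows. In a real training framework the same effect would typically come from masking gradients on the head's weight matrix rather than from a hand-written update.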
Abstract
Extending Named Entity Recognition (NER) models to new PII entities in noisy spoken-language data is a common need. We find that jointly fine-tuning a BERT model on standard semantic entities (PER, LOC, ORG) and new pattern-based PII (EMAIL, PHONE) results in minimal degradation for original classes. We investigate this "peaceful coexistence," hypothesizing that the model uses independent semantic vs. morphological feature mechanisms. Using an incremental learning setup as a diagnostic tool, we measure semantic drift and find two key insights. First, the LOC (location) entity is uniquely vulnerable due to a representation overlap with new PII, as it shares pattern-like features (e.g., postal codes). Second, we identify a "reverse O-tag representation drift." The model, initially trained to map PII patterns to 'O', blocks new learning. This is resolved only by unfreezing the 'O' tag's…
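The abstract mentions measuring semantic drift but, as excerpted here, does not say which metric is used. One common proxy, shown below purely as an assumption, is the cosine distance between the mean token representation of an entity class before and after the model is extended. The function names and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a non-empty collection of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def representation_drift(
    before: Sequence[Sequence[float]], after: Sequence[Sequence[float]]
) -> float:
    """Cosine distance between class centroids: 0 means no drift, larger means more."""
    return 1.0 - cosine_similarity(mean_vector(before), mean_vector(after))


# Toy token representations for one entity class (illustrative values only).
loc_before = [[1.0, 0.0], [1.0, 0.0]]
loc_after_same = [[1.0, 0.0], [1.0, 0.0]]
loc_after_rotated = [[0.0, 1.0], [0.0, 1.0]]

drift_none = representation_drift(loc_before, loc_after_same)
drift_full = representation_drift(loc_before, loc_after_rotated)
```

Under this proxy, computing the drift per class (PER, LOC, ORG) on the same held-out tokens would show whether one class moves more than the others, which is the kind of comparison the LOC finding implies.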
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Speech and dialogue systems
