Towards Continual Entity Learning in Language Models for Conversational Agents
Ravi Teja Gadde, Ivan Bulyko

TL;DR
This paper introduces entity-aware language models that integrate entity-specific information into pre-trained models, enabling dynamic updates and improved handling of new entities in conversational AI without full retraining.
Contribution
The authors propose a novel method to incorporate independently updatable entity models into pre-trained language models for continual entity learning in dialogue systems.
Findings
Significant perplexity reduction on dialogue datasets.
Enhanced ability to adapt to new entities over time.
Improved performance on long-tailed utterances.
Abstract
Neural language models (LMs) trained on diverse corpora are known to work well on previously seen entities. However, updating these models with dynamically changing entities such as place names, song titles, and shopping items requires re-training from scratch and collecting full sentences containing these entities. We aim to address this issue by introducing entity-aware language models (EALM), in which we integrate entity models trained on catalogues of entities into pre-trained LMs. Our combined language model adaptively adds information from the entity models into the pre-trained LM depending on the sentence context. Our entity models can be updated independently of the pre-trained LM, enabling us to influence the distribution of entities output by the final LM without any further training of the pre-trained LM. We show significant perplexity improvements on task-oriented dialogue…
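The abstract does not spell out how the entity models are combined with the pre-trained LM, so the following is only a minimal sketch of one plausible reading: a context-dependent gate that interpolates two next-token distributions. The function names (`entity_model_from_catalogue`, `combine`), the unigram entity model, the toy vocabulary, and the fixed gate value are all assumptions made for illustration, not the authors' architecture.

```python
def entity_model_from_catalogue(catalogue):
    """Hypothetical stand-in for an entity model: a unigram
    distribution over the tokens that appear in a catalogue."""
    counts = {}
    for name in catalogue:
        for token in name.lower().split():
            counts[token] = counts.get(token, 0) + 1
    total = sum(counts.values())
    return {token: c / total for token, c in counts.items()}


def combine(p_lm, p_entity, gate):
    """Interpolate two next-token distributions.

    gate is a number in [0, 1]; in a real system it would be
    predicted from the sentence context (assumption), here it is
    passed in directly.
    """
    vocab = set(p_lm) | set(p_entity)
    return {
        w: (1.0 - gate) * p_lm.get(w, 0.0) + gate * p_entity.get(w, 0.0)
        for w in vocab
    }


# Toy "pre-trained LM" distribution; it is never modified below.
p_lm = {"the": 0.5, "music": 0.3, "yesterday": 0.2}

# Entity model built from an initial song-title catalogue.
entity_v1 = entity_model_from_catalogue(["Yesterday"])
# The catalogue gains a new title; only the entity model is rebuilt.
entity_v2 = entity_model_from_catalogue(["Yesterday", "Flowers"])

# Assumed gate value for a context such as "play ..." where an
# entity is likely to follow.
before = combine(p_lm, entity_v1, gate=0.5)
after = combine(p_lm, entity_v2, gate=0.5)
```

In this toy setup the new title receives probability mass in `after` even though `p_lm` is untouched, which mirrors the abstract's claim that entity models can be updated independently of the pre-trained LM.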
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
