Dynamic Language Models for Continuously Evolving Content
Spurthi Amba Hombaiah, Tao Chen, Mingyang Zhang, Michael Bendersky, Marc Najork

TL;DR
This paper investigates how to adapt BERT-based language models for web content that continuously evolves, proposing incremental training methods that improve performance and reduce costs in dynamic environments.
Contribution
The paper introduces novel incremental training techniques for BERT models that effectively handle semantic shifts and new tokens in evolving web content.
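To make the "new tokens" problem concrete, the following is a minimal sketch and not the authors' method. It assumes plain whitespace tokenization rather than the WordPiece vocabulary a BERT model would use. The helper names (`build_vocab`, `oov_rate`, `extend_vocab`) and the toy sentences are hypothetical and purely illustrative. The sketch measures what fraction of tokens in later text falls outside a vocabulary built from earlier text, then shows how adding a small budget of frequent new tokens lowers that fraction.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for a real subword tokenizer.
    return text.lower().split()


def build_vocab(texts, max_size):
    # Keep the max_size most frequent tokens seen in the earlier-period texts.
    counts = Counter(tok for t in texts for tok in tokenize(t))
    return {tok for tok, _ in counts.most_common(max_size)}


def oov_rate(texts, vocab):
    # Fraction of token occurrences that are not covered by the vocabulary.
    tokens = [tok for t in texts for tok in tokenize(t)]
    if not tokens:
        return 0.0
    return sum(tok not in vocab for tok in tokens) / len(tokens)


def extend_vocab(vocab, new_texts, budget):
    # Add up to `budget` of the most frequent unseen tokens from the later-period texts.
    unseen = Counter(
        tok for t in new_texts for tok in tokenize(t) if tok not in vocab
    )
    return vocab | {tok for tok, _ in unseen.most_common(budget)}


# Toy data, invented for illustration only (not taken from the paper).
old_texts = ["watching the game tonight", "the game was great"]
new_texts = ["new variant spreading fast", "the variant news tonight"]

old_vocab = build_vocab(old_texts, max_size=10)
rate_before = oov_rate(new_texts, old_vocab)

extended_vocab = extend_vocab(old_vocab, new_texts, budget=2)
rate_after = oov_rate(new_texts, extended_vocab)

print(f"OOV rate with old vocabulary:      {rate_before:.3f}")
print(f"OOV rate with extended vocabulary: {rate_after:.3f}")
```

In a real BERT setting, newly added vocabulary entries would also need embeddings, which is one reason further training on recent data is required after a vocabulary update.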
Findings
Incremental training outperforms training from scratch in cost and performance.
Methods improve downstream tasks like hashtag prediction and offensive content detection.
Incremental models adapt better to semantic changes over time.
Abstract
The content on the web is in a constant state of flux. New entities, issues, and ideas continuously emerge, while the semantics of the existing conversation topics gradually shift. In recent years, pre-trained language models like BERT greatly improved the state-of-the-art for a large spectrum of content understanding tasks. Therefore, in this paper, we aim to study how these language models can be adapted to better handle continuously evolving web content. In our study, we first analyze the evolution of 2013 - 2019 Twitter data, and unequivocally confirm that a BERT model trained on past tweets would heavily deteriorate when directly applied to data from later years. Then, we investigate two possible sources of the deterioration: the semantic shift of existing tokens and the sub-optimal or failed understanding of new tokens. To this end, we both explore two different vocabulary…
Taxonomy
Methods: Multi-Head Attention · Linear Layer · Adam · Linear Warmup With Linear Decay · Residual Connection · WordPiece · Attention Dropout · Dense Connections · Softmax
