BiTimeBERT: Extending Pre-Trained Language Representations with Bi-Temporal Information

Jiexin Wang; Adam Jatowt; Masatoshi Yoshikawa; Yi Cai

arXiv:2204.13032·cs.CL·April 28, 2023·22 cites

TL;DR

BiTimeBERT is a new language model trained on temporal news data that incorporates two types of temporal signals, significantly improving performance on time-sensitive NLP tasks compared to standard models like BERT.

Contribution

This work introduces BiTimeBERT, a pre-training approach that uses a long-span temporal news collection and two new pre-training tasks to build time-aware language representations.

Findings

01

BiTimeBERT outperforms BERT on various time-sensitive NLP tasks.

02

Achieves a 155% relative accuracy improvement over BERT on event time estimation.

03

Demonstrates the importance of temporal signals in language modeling.
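
The 155% figure in finding 02 can only be a relative gain, since accuracy itself is bounded by 100%. A minimal sketch of that arithmetic follows; the two accuracy values are hypothetical, chosen only to illustrate the calculation, and are not taken from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative gain of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical accuracies (NOT from the paper), used only for illustration.
baseline_acc = 0.20
improved_acc = 0.51

gain = relative_improvement(baseline_acc, improved_acc)
print(f"{gain:.0f}% relative improvement")
```

Under these assumed values, the improved accuracy is 2.55 times the baseline, which is what a 155% relative improvement means.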

Abstract

Time is an important aspect of documents and is used in a range of NLP and IR tasks. In this work, we investigate methods for incorporating temporal information during pre-training to further improve performance on time-related tasks. In contrast to common pre-trained language models like BERT, which use synchronic document collections (e.g., BookCorpus and Wikipedia) as training corpora, we use a long-span temporal news article collection for building word representations. We introduce BiTimeBERT, a novel language representation model trained on a temporal collection of news articles via two new pre-training tasks, which harness two distinct temporal signals to construct time-aware language representations. The experimental results show that BiTimeBERT consistently outperforms BERT and other existing pre-trained models with substantial gains on different downstream NLP tasks…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Residual Connection · Attention Dropout · WordPiece · Weight Decay · Adam · Softmax · Layer Normalization