TL;DR
This paper introduces a recurrent neural language model conditioned on author identity and publication date. It captures language diffusion in author communities over time and outperforms several temporal and non-temporal baselines on two real-world corpora.
Contribution
The paper presents a novel recurrent neural language model conditioned on author and temporal vectors, enabling dynamic author representations and improved modeling of language evolution.
Findings
Outperforms several temporal and non-temporal baselines on two real-world corpora
Learns meaningful author representations that change over time
Effectively captures language diffusion in author communities
Abstract
Language models are at the heart of numerous works, notably in the text mining and information retrieval communities. These statistical models aim at extracting word distributions, from simple unigram models to recurrent approaches with latent variables that capture subtle dependencies in texts. However, those models are learned from word sequences only, and authors' identities, as well as publication dates, are seldom considered. We propose a neural model, based on recurrent language modeling, which aims at capturing language diffusion tendencies in author communities through time. By conditioning language models with author and temporal vector states, we are able to leverage the latent dependencies between the text contexts. This allows us to beat several temporal and non-temporal language baselines on two real-world corpora, and to learn meaningful author representations that vary…
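The conditioning idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything here is an assumption made for illustration: the class name `ConditionedRNNLM`, the plain tanh recurrent cell, the concatenation of word, author, and time-step vectors at the input, and the dimensions. The weights are random and untrained, so the code only shows how an author vector and a temporal vector could enter the next-word distribution. It is not the authors' model.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ConditionedRNNLM:
    """Hypothetical recurrent LM conditioned on author and time-step vectors."""

    def __init__(self, vocab_size, n_authors, n_steps, dim=8, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.word_emb = make_matrix(vocab_size, dim, rng)
        self.author_emb = make_matrix(n_authors, dim, rng)  # one vector per author
        self.time_emb = make_matrix(n_steps, dim, rng)      # one vector per time step
        self.w_in = make_matrix(dim, 3 * dim, rng)          # input = word + author + time
        self.w_rec = make_matrix(dim, dim, rng)
        self.w_out = make_matrix(vocab_size, dim, rng)

    def log_likelihood(self, tokens, author, step):
        """Sum of log p(next token | history, author, time step)."""
        # Assumption: conditioning is done by concatenating the two vectors
        # to every word embedding fed to the recurrent cell.
        condition = self.author_emb[author] + self.time_emb[step]  # list concatenation
        hidden = [0.0] * self.dim
        total = 0.0
        for prev_tok, next_tok in zip(tokens, tokens[1:]):
            x = self.word_emb[prev_tok] + condition
            from_input = matvec(self.w_in, x)
            from_state = matvec(self.w_rec, hidden)
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
            probs = softmax(matvec(self.w_out, hidden))
            total += math.log(probs[next_tok])
        return total


model = ConditionedRNNLM(vocab_size=5, n_authors=2, n_steps=3)
sentence = [0, 1, 2, 3]
print(model.log_likelihood(sentence, author=0, step=2))
print(model.log_likelihood(sentence, author=1, step=2))
```

The two printed scores differ because the same token sequence is scored under different author vectors. In a trained model of this kind, that difference is what would let the author and temporal representations carry information about who wrote a text and when.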
