Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content
Nemanja Djuric, Hao Wu, Vladan Radosavljevic, Mihajlo Grbovic, Narayan Bhamidipati

TL;DR
This paper introduces a hierarchical neural language model that jointly learns low-dimensional representations of streaming documents and their content, improving semantic understanding and enabling personalized applications.
Contribution
The paper presents a novel hierarchical neural framework that models document sequences and internal word sequences simultaneously, outperforming existing methods in representation quality.
Findings
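The learned representations are reported to outperform state-of-the-art methods in representation quality.
The framework models both the sequence of documents in a stream and the word sequences within each document.
Adding user layers to the hierarchy extends the model to personalized recommendation and social relationship mining.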
Abstract
We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams and use one of the language models to model the document sequences, and the other to model word sequences within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model, which can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors to represent individual preferences.…
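The two-level structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a full softmax at both levels, a skip-gram-style document model, a word model that averages the document vector with neighboring word vectors, and a weight `alpha` on the document-level term. All function and parameter names are hypothetical. The only element taken from the abstract is that one model covers document sequences, the other covers word sequences within documents, and both share the document vectors.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_vector(vectors):
    """Element-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def log_softmax_at(scores, index):
    """Log-probability of `index` under a softmax over `scores`."""
    top = max(scores)  # subtract the max for numerical stability
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return scores[index] - log_norm


def joint_log_likelihood(stream, doc_vecs, doc_out, word_vecs, word_out,
                         doc_window=1, word_window=1, alpha=1.0):
    """Joint objective for one stream (illustrative assumption).

    stream: list of (doc_id, [word_ids]) pairs in stream order.
    doc_vecs / word_vecs: input embeddings, one row per document / word.
    doc_out / word_out: output (softmax) weights, one row per document / word.
    """
    total = 0.0
    for m, (doc_id, words) in enumerate(stream):
        d = doc_vecs[doc_id]

        # Upper level: predict neighboring documents in the stream from d.
        lo, hi = max(0, m - doc_window), min(len(stream), m + doc_window + 1)
        for i in range(lo, hi):
            if i != m:
                scores = [dot(out, d) for out in doc_out]
                total += alpha * log_softmax_at(scores, stream[i][0])

        # Lower level: predict each word from its context words plus d,
        # so the same document vector is trained by both levels.
        for t, target in enumerate(words):
            lo_w = max(0, t - word_window)
            hi_w = min(len(words), t + word_window + 1)
            context = [word_vecs[words[j]] for j in range(lo_w, hi_w) if j != t]
            hidden = mean_vector([d] + context)
            scores = [dot(out, hidden) for out in word_out]
            total += log_softmax_at(scores, target)
    return total
```

Maximizing this quantity with respect to the embedding tables (for example by stochastic gradient ascent) would pull documents that co-occur in a stream, and words that co-occur within documents, toward each other in the shared space. A full softmax is used here only for readability. It scales linearly with the number of documents and words, so a practical system would likely need a cheaper approximation.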
