KeyVec: Key-semantics Preserving Document Representations

Bin Bi; Hao Ma

arXiv:1709.09749·cs.CL·September 29, 2017

TL;DR

This paper introduces KeyVec, a neural network model that generates document embeddings preserving key semantics, topics, and important information, thereby improving performance in downstream NLP tasks.

Contribution

The paper presents a novel neural network model, KeyVec, specifically designed to produce document representations that retain essential semantic content and topics.

Findings

01. KeyVec representations show superior quality in two document understanding tasks.

02. Document embeddings from KeyVec better preserve key semantics and topics.

03. Empirical results demonstrate the effectiveness of KeyVec in practical NLP applications.

Abstract

Previous studies have demonstrated the empirical success of word embeddings in various applications. In this paper, we investigate the problem of learning distributed representations for text documents which many machine learning algorithms take as input for a number of NLP tasks. We propose a neural network model, KeyVec, which learns document representations with the goal of preserving key semantics of the input text. It enables the learned low-dimensional vectors to retain the topics and important information from the documents that will flow to downstream tasks. Our empirical evaluations show the superior quality of KeyVec representations in two different document understanding tasks.

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques