Extraction of Salient Sentences from Labelled Documents

Misha Denil; Alban Demiraj; Nando de Freitas

arXiv:1412.6815 · cs.CL · March 3, 2015 · 85 cites

TL;DR

This paper introduces a hierarchical convolutional document model for extracting salient sentences from labelled documents. It applies visualisation techniques from the computer vision literature to identify topic-relevant sentences, and proposes a scalable evaluation method that avoids time-consuming human annotation of validation data.

Contribution

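The paper contributes a hierarchical convolutional architecture designed to support introspection of document structure, a way of using visualisation methods from computer vision to extract topic-relevant sentences, and a scalable evaluation technique for automatic sentence extraction systems.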

Findings

01. Effective identification of topic-relevant sentences using visualization

02. Scalable evaluation method reduces reliance on manual annotation

03. Model supports interpretability of document structure

Abstract

We present a hierarchical convolutional document model with an architecture designed to support introspection of the document structure. Using this model, we show how to use visualisation techniques from the computer vision literature to identify and extract topic-relevant sentences. We also introduce a new scalable evaluation technique for automatic sentence extraction systems that avoids the need for time-consuming human annotation of validation data.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques