Extraction of Salient Sentences from Labelled Documents
Misha Denil, Alban Demiraj, Nando de Freitas

TL;DR
This paper introduces a hierarchical convolutional document model for extracting salient sentences from labelled documents. It applies visualisation techniques from computer vision to identify topic-relevant sentences, and it proposes a scalable evaluation method that avoids manual annotation of validation data.
Contribution
The paper presents a novel hierarchical convolutional architecture combined with visualization methods and a scalable evaluation technique for sentence extraction.
Findings
Effective identification of topic-relevant sentences using visualization
Scalable evaluation method reduces reliance on manual annotation
Model supports interpretability of document structure
Abstract
We present a hierarchical convolutional document model with an architecture designed to support introspection of the document structure. Using this model, we show how to use visualisation techniques from the computer vision literature to identify and extract topic-relevant sentences. We also introduce a new scalable evaluation technique for automatic sentence extraction systems that avoids the need for time-consuming human annotation of validation data.
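To make the idea of extracting sentences through model introspection concrete, here is a minimal sketch of gradient-based saliency, one family of visualisation techniques from computer vision. It is an illustration under stated assumptions, not the paper's convolutional model: the toy scoring function (a weighted mean of tanh-squashed sentence vectors), the function names `sentence_saliency` and `extract_salient`, and all input values are hypothetical.

```python
import math


def sentence_saliency(sentence_vectors, weights):
    """Saliency of each sentence under a toy document scorer (an assumption).

    Toy model: score = sum_j weights[j] * mean_i tanh(s_i[j]).
    The gradient of the score with respect to s_i[j] is
    weights[j] * (1 - tanh(s_i[j])**2) / n, and the saliency of
    sentence i is the L2 norm of that gradient vector.
    """
    n = len(sentence_vectors)
    scores = []
    for vec in sentence_vectors:
        grad = [w * (1.0 - math.tanh(x) ** 2) / n for x, w in zip(vec, weights)]
        scores.append(math.sqrt(sum(g * g for g in grad)))
    return scores


def extract_salient(sentences, sentence_vectors, weights, k=1):
    """Return the k most salient sentences, kept in document order."""
    scores = sentence_saliency(sentence_vectors, weights)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


# Hypothetical inputs: three sentences with 2-dimensional embeddings.
sentences = ["first sentence", "second sentence", "third sentence"]
vectors = [[0.0, 0.0], [3.0, 3.0], [0.5, -0.5]]
weights = [1.0, 1.0]
summary = extract_salient(sentences, vectors, weights, k=2)
```

In this toy setting the second sentence sits in the saturated region of tanh, so its gradient is close to zero and it is not selected. A real system would replace the toy scorer with the trained document classifier and obtain the gradients by backpropagation.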
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
