Attention based Sentence Extraction from Scientific Articles using Pseudo-Labeled data
Parth Mehta, Gaurav Arora, Prasenjit Majumder

TL;DR
This paper introduces a weakly supervised, attention-based deep learning method for extracting important sentences from scientific articles, improving summary quality and coherence using pseudo-labeled data and novel context embeddings.
Contribution
It presents a new attention-based neural architecture with context embeddings for sentence extraction, outperforming existing methods on scientific article summarization.
Findings
Achieves higher ROUGE scores than state-of-the-art extractive methods
Generates more coherent and structurally faithful summaries
Utilizes pseudo-labeled data for weak supervision
Abstract
In this work, we present a weakly supervised sentence extraction technique for identifying important sentences in scientific papers that are worthy of inclusion in the abstract. We propose a new attention-based deep learning architecture that jointly learns to identify important content as well as the cue phrases that indicate summary-worthy sentences. We propose a new context embedding technique that uses topic models to determine the focus of a given paper, and we use it jointly with an LSTM-based sequence encoder to learn attention weights across the words of a sentence. We use a collection of articles publicly available through the ACL Anthology for our experiments. Our system outperforms several state-of-the-art extractive techniques on several ROUGE metrics. It also generates more coherent summaries and preserves the overall structure of…
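The abstract describes attention weights learned over the words of a sentence, conditioned on a context embedding of the paper. The following is a minimal sketch of that general idea, not the authors' architecture. It assumes the per-word vectors stand in for LSTM encoder states and the context vector stands in for a topic-model-derived embedding. The function names (`attention_weights`, `sentence_score`), the parameter vectors `v` and `u`, and the simplified additive scoring are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(word_states, context, v):
    """Score each word against the context vector, then normalize.

    word_states: one vector per word (assumed stand-in for LSTM states)
    context:     one vector per paper (assumed stand-in for a topic embedding)
    v:           hypothetical learned scoring vector
    """
    scores = [
        sum(vk * math.tanh(hk + ck) for hk, ck, vk in zip(h, context, v))
        for h in word_states
    ]
    return softmax(scores)


def sentence_score(word_states, context, v, u, bias=0.0):
    """Return (probability the sentence is summary-worthy, attention weights)."""
    weights = attention_weights(word_states, context, v)
    dim = len(context)
    # Attention-weighted average of the word vectors.
    pooled = [
        sum(w * h[k] for w, h in zip(weights, word_states)) for k in range(dim)
    ]
    logit = sum(uk * pk for uk, pk in zip(u, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit)), weights


# Toy example: a three-word sentence with 2-dimensional vectors.
states = [[0.1, 0.3], [0.9, -0.2], [0.0, 0.5]]
context = [0.4, 0.1]
prob, weights = sentence_score(states, context, v=[1.0, 0.5], u=[0.7, -0.3])
```

In this sketch the attention weights show which words contributed most to the sentence's score. This is one way a model could surface cue phrases, though the paper's actual scoring function may differ.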
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Natural Language Processing Techniques
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
