Outline Generation: Understanding the Inherent Content Structure of Documents
Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, and Xueqi Cheng

TL;DR
This paper introduces the Outline Generation (OG) task: identifying the sections of a multi-paragraph document and generating a heading for each one. It proposes a hierarchical neural model that captures multi-level coherence and provides a large dataset for future research.
Contribution
The paper formulates outline generation (OG) as a hierarchical structured prediction problem and proposes HiStGen, a neural model that captures multi-level coherence for improved outline generation.
Findings
HiStGen outperforms state-of-the-art models on the WIKIOG dataset.
The model effectively captures paragraph and heading dependencies.
The WIKIOG dataset contains over 1.75 million document-outline pairs.
Abstract
In this paper, we introduce and tackle the Outline Generation (OG) task, which aims to unveil the inherent content structure of a multi-paragraph document by identifying its potential sections and generating the corresponding section headings. Without loss of generality, the OG task can be viewed as a novel structured summarization task. To generate a sound outline, an ideal OG model should be able to capture three levels of coherence, namely the coherence between context paragraphs, that between a section and its heading, and that between context headings. The first one is the foundation for section identification, while the latter two are critical for consistent heading generation. In this work, we formulate the OG task as a hierarchical structured prediction problem, i.e., to first predict a sequence of section boundaries and then a sequence of section headings accordingly. We…
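The two-stage formulation in the abstract (first predict section boundaries, then predict a heading for each resulting section) can be illustrated with a minimal Python sketch. This is not HiStGen and not the authors' code: the function names, the convention that a label of 1 marks the last paragraph of a section, and the word-frequency heading generator are all assumptions made for illustration, standing in for the learned neural components.

```python
from collections import Counter
from typing import Callable, List, Tuple


def split_into_sections(paragraphs: List[str], boundaries: List[int]) -> List[List[str]]:
    """Stage 1 output to sections.

    Assumed convention (hypothetical): boundaries[i] == 1 means paragraph i
    is the last paragraph of its section.
    """
    if len(paragraphs) != len(boundaries):
        raise ValueError("need exactly one boundary label per paragraph")
    sections: List[List[str]] = []
    current: List[str] = []
    for paragraph, is_section_end in zip(paragraphs, boundaries):
        current.append(paragraph)
        if is_section_end:
            sections.append(current)
            current = []
    if current:
        # Trailing paragraphs without a closing label form a final section.
        sections.append(current)
    return sections


def toy_heading(section: List[str]) -> str:
    """Placeholder for a learned heading generator (an assumption):
    returns the most frequent word longer than three characters."""
    words = [w.strip(".,;:").lower() for p in section for w in p.split()]
    words = [w for w in words if len(w) > 3]
    return Counter(words).most_common(1)[0][0] if words else ""


def generate_outline(
    paragraphs: List[str],
    boundaries: List[int],
    heading_fn: Callable[[List[str]], str] = toy_heading,
) -> List[Tuple[str, int]]:
    """Stage 2: one (heading, paragraph count) pair per predicted section."""
    sections = split_into_sections(paragraphs, boundaries)
    return [(heading_fn(section), len(section)) for section in sections]
```

In this sketch the boundary labels are supplied by the caller. In the setting the abstract describes, they would come from a model that reads the paragraph sequence, and the heading generator would condition on both the section content and the previously generated headings.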
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
