A Generative Approach to Titling and Clustering Wikipedia Sections
Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, and Abe Ittycheriah

TL;DR
This paper explores transformer-based models for generating Wikipedia section titles and embeddings, demonstrating how different decoder architectures impact extractive heading generation and semantic encoding, with a new loss function enhancing embedding quality.
Contribution
It introduces a novel task of section heading generation, compares various transformer decoder architectures, and proposes a new loss function to improve section embedding quality.
Findings
Attention-based decoders excel at extractive heading generation.
Decoders without attention produce better semantic embeddings.
The new loss function enhances embedding quality.
Abstract
We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic encoding and can be used to generate section embeddings. We additionally introduce a new loss function, which further encourages the decoder to generate high-quality embeddings.
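The contrast between the two decoder types can be illustrated with a toy sketch. This is not the authors' implementation: it assumes single-head dot-product attention and mean pooling as the bottleneck, and the names `cross_attention` and `pooled_embedding` are hypothetical. It only shows why a decoder with attention can look at individual input tokens (which favors copying), while a decoder without attention must rely on one fixed-size vector that can double as a section embedding.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, encoder_states):
    """One decoder query attends over every encoder token state.

    Assumption: single-head scaled dot-product attention, used here
    only to illustrate a decoder with access to the encoder output.
    """
    d = len(query)
    scores = [
        sum(q * h for q, h in zip(query, state)) / math.sqrt(d)
        for state in encoder_states
    ]
    weights = softmax(scores)
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(d)
    ]
    return context, weights


def pooled_embedding(encoder_states):
    """Collapse all token states into one fixed-size vector.

    Assumption: mean pooling stands in for whatever bottleneck the
    attention-free decoder is conditioned on.
    """
    n = len(encoder_states)
    d = len(encoder_states[0])
    return [sum(state[i] for state in encoder_states) / n for i in range(d)]


# Toy encoder output: three tokens, two dimensions each.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]

# With attention: per-token weights let the decoder focus on specific tokens.
context, weights = cross_attention(query, encoder_states)

# Without attention: the decoder sees only this single vector.
section_embedding = pooled_embedding(encoder_states)
```

In this sketch the pooled vector has the same size regardless of section length, which is what makes it usable for clustering sections, whereas the attention weights vary per token and per decoding step.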
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Wikis in Education and Collaboration
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding
