A Generative Approach to Titling and Clustering Wikipedia Sections

Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, and Abe Ittycheriah

arXiv:2005.11216 · cs.CL · May 25, 2020

Open Access

TL;DR

This paper evaluates transformer encoders paired with different decoders for generating Wikipedia section headings and section embeddings. Decoders with attention over the encoder output score well by generating extractive text, while a decoder without attention yields better semantic embeddings; a new loss function further improves embedding quality.

Contribution

The paper introduces a new task, generating section headings for Wikipedia articles; compares transformer encoders paired with several decoder architectures; and proposes a new loss function to improve section embedding quality.

Findings

01. Attention-based decoders excel at extractive heading generation.

02. Decoders without attention produce better semantic embeddings.

03. The new loss function enhances embedding quality.

Abstract

We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic encoding and can be used to generate section embeddings. We additionally introduce a new loss function, which further encourages the decoder to generate high-quality embeddings.
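
The contrast the abstract draws between decoders with and without attention can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attention_context`, `pooled_embedding`), the use of dot-product attention, the mean-pooling step, and the toy numbers are all assumptions made for illustration. The idea it shows is that an attending decoder can weight individual token states (which makes copying source tokens easy), whereas a decoder without attention must rely on a single fixed-size vector, which can then double as a section embedding.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(query, encoder_states):
    """Dot-product attention (assumed variant): weight each token state
    by its similarity to the decoder query, then sum."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights


def pooled_embedding(encoder_states):
    """Mean-pool token states into one fixed-size vector.
    Mean pooling is an assumption; the source does not name a pooling method."""
    n = len(encoder_states)
    dim = len(encoder_states[0])
    return [sum(state[i] for state in encoder_states) / n for i in range(dim)]


# Toy encoder output: three token states of dimension 2 (made-up numbers).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [4.0, 0.0]  # hypothetical decoder state at one generation step

# With attention: the decoder sees a per-step, token-weighted context.
context, weights = attention_context(query, encoder_states)

# Without attention: the decoder sees only one vector for the whole section.
embedding = pooled_embedding(encoder_states)
```

In the attending case the weights change at every decoding step, so no single vector has to summarize the section. In the pooled case every heading token must be predicted from the same vector, which is one plausible reading of why such a decoder "better facilitates semantic encoding".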

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Wikis in Education and Collaboration

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding