HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models

Tanya Goyal; Nazneen Fatema Rajani; Wenhao Liu; Wojciech Kryściński

arXiv:2110.04400 · cs.CL · October 24, 2022

TL;DR

HydraSum introduces a multi-decoder summarization model that automatically learns and controls diverse summary styles without extra supervision, enabling flexible and stylistically varied outputs.

Contribution

The paper proposes HydraSum, a multi-decoder architecture that disentangles stylistic features in summarization, allowing explicit style control and diversity without additional supervision.

Findings

01 HydraSum outperforms baseline models on three summarization datasets (CNN, Newsroom and XSum).

02 The decoders learn contrasting summary styles automatically, without extra supervision.

03 A modified gating mechanism enforces specific style partitions.

Abstract

Summarization systems make numerous "decisions" about summary properties during inference, e.g. degree of copying, specificity and length of outputs, etc. However, these are implicitly encoded within model parameters and specific styles cannot be enforced. To address this, we introduce HydraSum, a new summarization architecture that extends the single decoder framework of current models to a mixture-of-experts version with multiple decoders. We show that HydraSum's multiple decoders automatically learn contrasting summary styles when trained under the standard training objective without any extra supervision. Through experiments on three summarization datasets (CNN, Newsroom and XSum), we show that HydraSum provides a simple mechanism to obtain stylistically-diverse summaries by sampling from either individual decoders or their mixtures, outperforming baseline models. Finally, we…
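
As a rough illustration of the mixture-of-experts idea described in the abstract, the sketch below combines next-token distributions from several decoders using gate weights. It is not taken from the salesforce/hydra-sum code: the function names `softmax` and `mix_decoders`, the toy three-token vocabulary, and the assumption that mixing is a gate-weighted sum of per-decoder output probabilities are all illustrative.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_decoders(decoder_logits, gate):
    """Gate-weighted mixture of next-token distributions (hypothetical helper).

    decoder_logits: one list of vocabulary logits per decoder.
    gate: one non-negative weight per decoder, summing to 1.
    """
    if len(decoder_logits) != len(gate):
        raise ValueError("need exactly one gate weight per decoder")
    if any(g < 0 for g in gate) or abs(sum(gate) - 1.0) > 1e-9:
        raise ValueError("gate weights must be non-negative and sum to 1")
    probs = [softmax(logits) for logits in decoder_logits]
    vocab_size = len(probs[0])
    return [
        sum(g * p[v] for g, p in zip(gate, probs))
        for v in range(vocab_size)
    ]


# Toy example: two decoders that prefer different tokens.
logits_a = [2.0, 0.0, 0.0]  # decoder A favours token 0
logits_b = [0.0, 0.0, 2.0]  # decoder B favours token 2

only_a = mix_decoders([logits_a, logits_b], [1.0, 0.0])   # a single decoder
blended = mix_decoders([logits_a, logits_b], [0.5, 0.5])  # an even mixture
```

Setting the gate to a one-hot vector corresponds to sampling from an individual decoder, while intermediate weights correspond to sampling from a mixture, which mirrors the two options the abstract mentions.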

Code & Models

Repositories

salesforce/hydra-sum (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Softmax · Dropout · Dense Connections · Layer Normalization · Byte Pair Encoding