Self-Attentive Model for Headline Generation
Daniil Gavrilov, Pavel Kalaidin, Valentin Malykh

TL;DR
This paper applies the Universal Transformer, a self-attentive architecture, together with byte-pair encoding to headline generation. It reports new state-of-the-art ROUGE scores on the New York Times Annotated corpus and introduces the RIA corpus as an additional benchmark.
Contribution
The paper presents a novel application of the Universal Transformer architecture, combined with byte-pair encoding, to improve headline generation performance.
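As a rough illustration of the byte-pair encoding idea mentioned above, the sketch below learns merge rules by repeatedly fusing the most frequent adjacent symbol pair. It is a minimal toy version, not the authors' preprocessing code: the function name `learn_bpe`, the `</w>` end-of-word marker, and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to num_merges BPE merge rules from a list of words.

    Hypothetical helper for illustration; real toolkits add many details.
    """
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair; ties go to the pair seen first (an assumption).
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the chosen pair.
        merged_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_vocab[tuple(out)] += freq
        vocab = merged_vocab
    return merges


if __name__ == "__main__":
    print(learn_bpe(["low", "low", "lowest"], 2))
```

Frequent character sequences end up as single vocabulary units, so rare or unseen words can still be represented as a sequence of known subword pieces.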
Findings
Achieved new state-of-the-art results on the New York Times Annotated corpus: ROUGE-L F1 24.84 and ROUGE-2 F1 13.48.
Introduced the RIA corpus for headline generation evaluation, reaching ROUGE-L F1 36.81 and ROUGE-2 F1 22.15 on it.
Demonstrated improved reasoning in headline generation models.
Abstract
Headline generation is a special type of text summarization task. While the amount of available training data for this task is almost unlimited, it remains challenging, because learning to generate headlines for news articles requires the model to reason well about natural language. To address this, we applied the recent Universal Transformer architecture paired with the byte-pair encoding technique and achieved new state-of-the-art results on the New York Times Annotated corpus, with a ROUGE-L F1-score of 24.84 and a ROUGE-2 F1-score of 13.48. We also present the new RIA corpus and reach a ROUGE-L F1-score of 36.81 and a ROUGE-2 F1-score of 22.15 on it.
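For readers unfamiliar with the metrics quoted in the abstract, the following sketch shows how ROUGE-2 and ROUGE-L F1-scores can be computed for a single candidate headline against a single reference. It assumes whitespace tokenization with no stemming or stopword handling, and the function names are chosen for this example; the official ROUGE toolkit and the authors' evaluation setup may differ in these details, so scores from this sketch are not directly comparable to the reported numbers.

```python
from collections import Counter


def f1(matches, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = matches / cand_total
    recall = matches / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n=2):
    """F1 over clipped n-gram overlap between two token lists."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """F1 based on the longest common subsequence of two token lists."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


if __name__ == "__main__":
    cand = "police arrest suspect in bank robbery".split()
    ref = "police arrest bank robbery suspect".split()
    print(round(rouge_n_f1(cand, ref), 4), round(rouge_l_f1(cand, ref), 4))
```

ROUGE-2 rewards exact bigram matches, while ROUGE-L only requires shared words to appear in the same order, which is why it is usually the higher of the two scores. Papers typically report these values multiplied by 100.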
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Attention Dropout · Universal Transformer · Byte Pair Encoding · Dense Connections · Label Smoothing
