Resurrecting Submodularity for Neural Text Generation
Simeng Han, Xiang Lin, Shafiq Joty

TL;DR
This paper introduces a novel attention mechanism based on submodular functions to enhance neural text generation, improving coverage and quality across multiple tasks and architectures.
Contribution
The paper defines diminishing attentions with submodular functions and proposes an architecturally simple attention module that improves the coverage of neural text generation without relying on the greedy algorithm usually used for submodular maximization.
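For readers unfamiliar with the term, the standard diminishing-returns definition of submodularity is stated below. The symbols (ground set V, set function f, subsets A and B, element v) are generic textbook notation and are not taken from the paper.

```latex
% A set function f : 2^V -> R is submodular if adding an element
% to a smaller set helps at least as much as adding it to a larger one.
\[
  f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B)
  \qquad \text{for all } A \subseteq B \subseteq V,\; v \in V \setminus B .
\]
```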
Findings
Outperforms state-of-the-art baselines in multiple tasks
Generalizes across different neural architectures and training strategies
Produces higher coverage and quality in generated texts
Abstract
Submodularity is desirable for a variety of objectives in content selection where the current neural encoder-decoder framework is inadequate. However, it has so far not been explored in the neural encoder-decoder system for text generation. In this work, we define diminishing attentions with submodular functions and, in turn, prove the submodularity of the effective neural coverage. The greedy algorithm approximating the solution to the submodular maximization problem is not suited to attention score optimization in auto-regressive generation. Therefore, instead of following how submodular functions have been widely used, we propose a simplified yet principled solution. The resulting attention module offers an architecturally simple and empirically effective method to improve the coverage of neural text generation. We run experiments on three directed text generation tasks with different…
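The idea of diminishing attention can be illustrated with a small sketch. This is not the authors' implementation: it assumes that the effective attention on a source token is the marginal gain of a concave, non-decreasing function applied to that token's accumulated attention, and it assumes a per-step renormalization. The function name `diminishing_attention`, the choice of `math.log1p` as the concave function, and the `normalize` flag are all hypothetical.

```python
import math


def diminishing_attention(attn_steps, concave=math.log1p, normalize=True):
    """Return diminished attention weights for each decoding step.

    attn_steps: list of per-step attention weights over the source tokens.
    concave:    assumed concave, non-decreasing function with concave(0) == 0.
    normalize:  assumed renormalization so each step sums to 1.
    """
    num_tokens = len(attn_steps[0])
    cumulative = [0.0] * num_tokens  # attention each token has received so far
    result = []
    for step in attn_steps:
        gains = []
        for i, weight in enumerate(step):
            # Marginal gain of the concave function: the more attention a
            # token already has, the less additional attention counts.
            gain = concave(cumulative[i] + weight) - concave(cumulative[i])
            gains.append(gain)
            cumulative[i] += weight
        total = sum(gains)
        if normalize and total > 0:
            gains = [g / total for g in gains]
        result.append(gains)
    return result


# The decoder attends to the same token twice; its share shrinks at step 2.
steps = [[0.9, 0.1], [0.9, 0.1]]
for row in diminishing_attention(steps):
    print([round(w, 3) for w in row])
```

In this toy run the first token receives the same raw attention at both steps, but its diminished share is smaller at the second step, which shifts weight toward the less-attended token. That is the coverage effect the abstract describes, under the assumptions stated above.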
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Weight Decay · Attention Dropout · WordPiece · Linear Warmup With Linear Decay · BERT · Residual Connection
