PLANET: Dynamic Content Planning in Autoregressive Transformers for Long-form Text Generation
Zhe Hu, Hou Pong Chan, Jiachen Liu, Xinyan Xiao, Hua Wu, Lifu Huang

TL;DR
PLANET introduces a dynamic content planning framework for long-form text generation using autoregressive Transformers, significantly improving coherence and content richness through latent semantic planning and contrastive learning.
Contribution
The paper presents a novel framework that integrates dynamic content planning into Transformer-based models, enhancing coherence and content control in long-form text generation.
Findings
Outperforms strong baselines in coherence and content richness.
Improves long-form text coherence via contrastive learning.
Effective in counterargument and opinion article generation.
Abstract
Despite recent progress of pre-trained language models in generating fluent text, existing methods still suffer from incoherence problems in long-form text generation tasks that require proper content control and planning to form a coherent high-level logical flow. In this work, we propose PLANET, a novel generation framework leveraging an autoregressive self-attention mechanism to conduct content planning and surface realization dynamically. To guide the generation of output sentences, our framework enriches the Transformer decoder with latent representations to maintain sentence-level semantic plans grounded by bag-of-words. Moreover, we introduce a new coherence-based contrastive learning objective to further improve the coherence of the output. Extensive experiments are conducted on two challenging long-form text generation tasks including counterargument generation and opinion article…
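The two training signals named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`bow_loss`, `coherence_margin_loss`), the margin value, and the choice of a margin ranking form for the contrastive term are all assumptions made for illustration, since the abstract does not give the exact formulas. The first function scores how well a sentence-level plan vector predicts the unordered set of words in its target sentence; the second pushes the coherence score of the original text above the scores of corrupted versions (for example, texts with shuffled sentences).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bow_loss(vocab_logits, target_word_ids):
    """Bag-of-words grounding (illustrative form, an assumption).

    vocab_logits: one score per vocabulary entry, assumed to be produced
    from a sentence-level latent plan vector by a linear layer.
    target_word_ids: ids of the words in the target sentence.
    Word order and repetition are ignored, hence "bag of words".
    """
    probs = softmax(vocab_logits)
    return -sum(math.log(probs[w]) for w in set(target_word_ids))


def coherence_margin_loss(pos_score, neg_scores, margin=0.5):
    """Margin-based contrastive loss (one common choice, an assumption).

    pos_score: coherence score of the original, coherent text.
    neg_scores: scores of corrupted, less coherent variants.
    The loss is zero once every negative trails the positive by `margin`.
    """
    penalties = [max(0.0, margin + n - pos_score) for n in neg_scores]
    return sum(penalties) / len(penalties)


# Hypothetical usage with made-up numbers.
plan_logits = [0.0, 0.0, 0.0, 0.0]          # uniform over a 4-word vocabulary
print(bow_loss(plan_logits, [1, 2]))         # 2 * log(4)
print(coherence_margin_loss(2.0, [0.0, 1.0]))
print(coherence_margin_loss(0.0, [1.0]))
```

In a full model, both terms would typically be added to the usual token-level generation loss with weighting coefficients; those weights are not stated in the text above and would need to be tuned.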
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Contrastive Learning · Byte Pair Encoding · Softmax · Residual Connection · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing
