TL;DR
LaPA$^2$ is a length-aware, training-free framework that improves controllability in long-form text generation by counteracting attention dilution. It applies to a range of prefix-based methods.
Contribution
The paper proposes a length-aware scaling technique and an optional reinforcement mechanism that maintain control over long sequences, supporting both soft and hard prefixes.
Findings
Improves attribute controllability in long-form generation.
Maintains content relevance and fluency with LaPA$^2$ enhancements.
Supports both soft and hard prefix methods.
Abstract
Prefix-based methods have emerged as a promising paradigm for Controllable Text Generation (CTG) due to their parameter efficiency. However, while effective in short sequences, their controllability tends to diminish as the generated sequence grows. In this paper, we identify Attention Dilution as a key factor behind this phenomenon: as the sequence length increases, the attention allocated to the control signal naturally decays due to the softmax mechanism, leading to a "fading" control effect. To address this, we propose LaPA (Length-aware Prefix and Prompt Attention Augmentation), a training-free and model-agnostic framework designed to sustain robust control in long contexts. Specifically, LaPA employs Length-Aware Logarithmic Scaling to dynamically amplify prefix attention weights, mathematically counteracting the dilution effect, while an optional Contextual Anchor…
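The dilution argument in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's formulation: it assumes every attention logit is equal, so the softmax share of the prefix is simply its token count over the total, and it uses a hypothetical additive bias `log(n / ref_len)` on the prefix logits. The names `prefix_attention_mass`, `log_length_bias`, `PREFIX_LEN`, and `REF_LEN` are made up for this illustration.

```python
import math


def prefix_attention_mass(prefix_len, context_len, prefix_bias=0.0):
    """Softmax mass on the prefix when all raw logits are equal (toy assumption).

    prefix_bias is an additive term applied to every prefix logit.
    """
    prefix_weight = prefix_len * math.exp(prefix_bias)
    return prefix_weight / (prefix_weight + context_len)


def log_length_bias(context_len, ref_len):
    """Hypothetical bias that grows with the log of the context length.

    It is zero up to ref_len tokens, so short sequences are left untouched.
    """
    return math.log(max(context_len, ref_len) / ref_len)


PREFIX_LEN = 10  # assumed number of prefix (control) tokens
REF_LEN = 50     # assumed length at which control is still considered adequate

for n in (50, 200, 800, 3200):
    plain = prefix_attention_mass(PREFIX_LEN, n)
    boosted = prefix_attention_mass(PREFIX_LEN, n, log_length_bias(n, REF_LEN))
    print(f"context={n:5d}  plain={plain:.3f}  boosted={boosted:.3f}")
```

Under these assumptions the unscaled prefix share shrinks as the context grows (10/60 at 50 tokens, 10/3210 at 3200 tokens), while the log-biased share stays at 10/60 for every length. That is the sense in which a logarithmic term can offset softmax dilution. Real attention logits are not uniform, so the actual method may differ in form and in where the scaling is applied.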
