SA-DiffuSeq: Addressing Computational and Scalability Challenges in Long-Document Generation with Sparse Attention
Alexandros Christoforos, Chadbourne Davis

TL;DR
SA-DiffuSeq introduces a sparse-attention diffusion framework that improves the scalability and efficiency of long-document generation while maintaining quality and coherence.
Contribution
The paper presents a sparse attention mechanism within diffusion models that makes long-document generation more efficient and more stable than existing diffusion methods.
Findings
Outperforms state-of-the-art diffusion models in training efficiency
Achieves faster sampling speeds on extended sequences
Maintains semantic coherence and quality in long text generation
Abstract
Diffusion-based approaches to long-form text generation suffer from prohibitive computational cost and memory overhead as sequence length increases. We introduce SA-DiffuSeq, a diffusion framework that integrates sparse attention to fundamentally improve scalability for long-document modeling. By selectively allocating attention within the diffusion process, SA-DiffuSeq significantly reduces computational complexity while maintaining semantic coherence and generation quality. A key component of our method is a soft absorbing state tailored to sparse attention dynamics, which stabilizes diffusion trajectories and accelerates sequence reconstruction. This design improves sampling efficiency and enhances precision in long-range dependency modeling. Extensive experiments demonstrate that SA-DiffuSeq consistently surpasses state-of-the-art diffusion baselines in both training efficiency and…
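To make the scalability claim concrete, the sketch below counts how many query-key pairs a dense attention layer scores compared with a sparse one. It is an illustration only: the abstract does not say which sparsity pattern SA-DiffuSeq uses, so the sliding-window pattern, the function names, and the values of seq_len and window are all assumptions made for this example.

```python
from typing import List


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Build a seq_len x seq_len mask that is True where token i may attend to token j.

    Assumption: a symmetric sliding window of radius `window`. This is a common
    sparse pattern, not necessarily the one used by SA-DiffuSeq.
    """
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]


def count_attended_pairs(mask: List[List[bool]]) -> int:
    """Count the (query, key) pairs that would actually be scored."""
    return sum(sum(row) for row in mask)


# Hypothetical sizes chosen only for illustration.
seq_len, window = 1024, 16

dense_pairs = seq_len * seq_len  # full attention scores every pair: O(n^2)
sparse_pairs = count_attended_pairs(sliding_window_mask(seq_len, window))  # O(n * w)

print(f"dense:  {dense_pairs} pairs")
print(f"sparse: {sparse_pairs} pairs ({sparse_pairs / dense_pairs:.1%} of dense)")
```

With a fixed window, the number of scored pairs grows linearly with sequence length rather than quadratically, which is the general reason sparse attention reduces cost on long documents. A real implementation would also have to avoid materializing the full mask, which this sketch does for readability.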
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Mental Health via Writing
