TiDAR: Think in Diffusion, Talk in Autoregression
Jingyu Liu, Xin Dong, Zhifan Ye, Rishabh Mehta, Yonggan Fu, Vartika Singh, Jan Kautz, Ce Zhang, Pavlo Molchanov

TL;DR
TiDAR is a hybrid language model architecture that drafts tokens with diffusion and samples final outputs autoregressively within a single forward pass, achieving high throughput with quality comparable to autoregressive models.
Contribution
TiDAR introduces a sequence-level hybrid architecture with structured attention masks, balancing diffusion and autoregressive decoding in a single forward pass.
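To make "structured attention mask" concrete, here is a minimal sketch of one possible layout. It assumes that prefix tokens attend causally and that a trailing block of draft slots attends to the whole prefix and to each other. This layout, the function names, and the parameters `prefix_len` and `draft_len` are illustrative assumptions; this summary does not give the paper's exact mask.

```python
from typing import List


def hybrid_attention_mask(prefix_len: int, draft_len: int) -> List[List[bool]]:
    """Build a boolean mask where mask[i][j] is True if query i may attend to key j.

    Assumed layout (hypothetical, for illustration only):
      - positions [0, prefix_len) are ordinary tokens with causal attention
      - positions [prefix_len, prefix_len + draft_len) are draft slots that
        see the full prefix and every other draft slot (bidirectional block)
    """
    total = prefix_len + draft_len
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < prefix_len:
                # Causal region: a token sees itself and earlier tokens only.
                mask[i][j] = j <= i
            else:
                # Draft region: sees the prefix and all draft slots.
                mask[i][j] = True
    return mask


def render(mask: List[List[bool]]) -> str:
    """Render the mask as text, 'x' for allowed and '.' for blocked."""
    return "\n".join("".join("x" if ok else "." for ok in row) for row in mask)
```

With `prefix_len=3` and `draft_len=2`, `render` shows a lower-triangular pattern for the first three rows and two fully open rows for the draft slots. Both regions are processed in the same forward pass, and the mask alone decides which positions behave autoregressively and which behave like a diffusion block.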
Findings
Outperforms speculative decoding in throughput
Surpasses diffusion models in efficiency and quality
Closes the quality gap with autoregressive models
Abstract
Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language modeling. This raises a fundamental question: can we achieve a synergy with high throughput, higher GPU utilization, and AR-level quality? Existing methods fail to effectively balance these two aspects, either prioritizing AR using a weaker model for sequential drafting (speculative decoding), leading to lower drafting efficiency, or using some form of left-to-right (AR-like) decoding logic for diffusion, which still suffers from quality degradation and forfeits its potential parallelizability. We introduce TiDAR, a sequence-level hybrid architecture that drafts tokens (Thinking) in Diffusion and samples final outputs (Talking) AutoRegressively - all within a single forward pass using specially…
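The draft-then-verify pattern behind speculative decoding, which the abstract uses as a point of comparison, can be illustrated with a greedy acceptance rule. This is a generic sketch and not TiDAR's sampling procedure, which the truncated abstract does not specify. The function name and the assumption that both inputs are equal-length lists of token ids are illustrative.

```python
from typing import List


def accept_drafts(drafted: List[int], verified: List[int]) -> List[int]:
    """Greedy draft verification (illustrative, hypothetical helper).

    drafted[i]  : token proposed by the drafter for position i
    verified[i] : the verifier's greedy choice for position i, computed
                  with drafted[:i] as context

    Drafted tokens are kept while they match the verifier. At the first
    mismatch the verifier's token is taken instead and the rest of the
    draft is discarded, since it was conditioned on a rejected token.
    """
    accepted: List[int] = []
    for draft_tok, verify_tok in zip(drafted, verified):
        if draft_tok != verify_tok:
            accepted.append(verify_tok)
            break
        accepted.append(draft_tok)
    return accepted
```

For example, `accept_drafts([5, 7, 9], [5, 8, 9])` keeps the first token, replaces the second with the verifier's choice, and drops the third. Every step emits at least one token whenever the draft is non-empty, so a poor draft costs speed but does not change the output.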
Videos
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis) · YouTube
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Natural Language Processing Techniques · Generative Adversarial Networks and Image Synthesis
