ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction

Hyungjin Chung; Dohun Lee; Jong Chul Ye

arXiv:2410.04721·cs.LG·October 8, 2024

Open Access

TL;DR

ACDC is a novel zero-shot method that combines autoregressive and diffusion models to improve long-sequence multimodal generation by correcting artifacts and preserving global context without additional training.

Contribution

The paper introduces a memory-augmented approach that integrates ARMs and DMs at inference time, enabling high-quality, coherent multimodal generation without fine-tuning.

Findings

01. Effective error mitigation in long-sequence multimodal generation
02. Significant quality improvements over baseline models
03. Versatile across different architectures and tasks

Abstract

Autoregressive models (ARMs) and diffusion models (DMs) represent two leading paradigms in generative modeling, each excelling in distinct areas: ARMs in global context modeling and long-sequence generation, and DMs in generating high-quality local contexts, especially for continuous data such as images and short videos. However, ARMs often suffer from exponential error accumulation over long sequences, leading to physically implausible results, while DMs are limited by their local context generation capabilities. In this work, we introduce Autoregressive Coherent multimodal generation with Diffusion Correction (ACDC), a zero-shot approach that combines the strengths of both ARMs and DMs at the inference stage without the need for additional fine-tuning. ACDC leverages ARMs for global context generation and memory-conditioned DMs for local correction, ensuring high-quality outputs by…
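The division of labor described in the abstract (an ARM proposes each new piece of the sequence, and a memory-conditioned DM corrects it locally) can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `acdc_style_loop`, `toy_arm_step`, `toy_dm_correct`, and `drift` are hypothetical names, the "chunks" are short lists of floats rather than images or video frames, the ARM's error is modeled as a constant additive drift, and the "memory" is simply the first generated chunk.

```python
from typing import Callable, List

Chunk = List[float]


def acdc_style_loop(
    arm_step: Callable[[List[Chunk]], Chunk],
    dm_correct: Callable[[Chunk, List[Chunk]], Chunk],
    num_chunks: int,
    use_correction: bool = True,
) -> List[Chunk]:
    """Alternate an autoregressive proposal with a local correction step."""
    history: List[Chunk] = []
    for _ in range(num_chunks):
        # Global step: the ARM proposes the next chunk from everything so far.
        proposal = arm_step(history)
        # Local step: the corrector refines the proposal, conditioned on memory.
        if use_correction:
            proposal = dm_correct(proposal, history)
        history.append(proposal)
    return history


def toy_arm_step(history: List[Chunk]) -> Chunk:
    """Hypothetical ARM stand-in: copies the last chunk and adds a fixed drift."""
    last = history[-1] if history else [0.0, 0.0]
    return [x + 0.1 for x in last]


def toy_dm_correct(proposal: Chunk, history: List[Chunk]) -> Chunk:
    """Hypothetical corrector stand-in: pulls the proposal toward the memory.

    The memory here is just the first chunk; with no history yet, the
    proposal is returned unchanged.
    """
    if not history:
        return proposal
    memory = history[0]
    return [0.5 * p + 0.5 * m for p, m in zip(proposal, memory)]


def drift(chunks: List[Chunk]) -> float:
    """Distance between the first and last chunk along the first coordinate."""
    return abs(chunks[-1][0] - chunks[0][0])


if __name__ == "__main__":
    raw = acdc_style_loop(toy_arm_step, toy_dm_correct, 20, use_correction=False)
    fixed = acdc_style_loop(toy_arm_step, toy_dm_correct, 20, use_correction=True)
    print("drift without correction:", drift(raw))
    print("drift with correction:   ", drift(fixed))
```

In this toy setting the uncorrected sequence drifts further with every chunk, while the corrected one stays close to the memory. The only point is the control flow: correction happens at inference time, between autoregressive steps, and neither stand-in model is trained or fine-tuned.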

Taxonomy

Topics: Magnetic Bearings and Levitation Dynamics

Methods: Diffusion