Beyond In-Place Corruption: Insertion and Deletion In Denoising Probabilistic Models

Daniel D. Johnson; Jacob Austin; Rianne van den Berg; Daniel Tarlow

arXiv:2107.07675·cs.LG·July 19, 2021

TL;DR

This paper extends denoising diffusion models for sequence data so that the corruption process can insert and delete elements as well as alter them in place. The resulting models outperform in-place models on an arithmetic sequence task and, when trained on text8, can fix spelling errors without fine-tuning.

Contribution

It introduces a broader class of corruption processes and denoising models over sequences that can insert and delete elements while remaining efficient to train and sample from.

Findings

01 Outperforms standard in-place models on an arithmetic sequence task

02 When trained on text8, can fix spelling errors without any fine-tuning

03 Remains efficient to train and sample from despite the broader class of corruption processes

Abstract

Denoising diffusion probabilistic models (DDPMs) have shown impressive results on sequence generation by iteratively corrupting each example and then learning to map corrupted versions back to the original. However, previous work has largely focused on in-place corruption, adding noise to each pixel or token individually while keeping their locations the same. In this work, we consider a broader class of corruption processes and denoising models over sequence data that can insert and delete elements, while still being efficient to train and sample from. We demonstrate that these models outperform standard in-place models on an arithmetic sequence task, and that when trained on the text8 dataset they can be used to fix spelling errors without any fine-tuning.
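To make the contrast with in-place corruption concrete, here is a minimal sketch of one corruption step that can change sequence length. It is an illustration only and not the paper's actual forward process: the function names `corrupt_step` and `corrupt`, the probabilities `p_del`, `p_ins`, and `p_sub`, and the choice of drawing replacement tokens uniformly from the vocabulary are all assumptions made for this example.

```python
import random

def corrupt_step(tokens, vocab, p_del=0.1, p_ins=0.1, p_sub=0.1, rng=None):
    """One illustrative corruption step (hypothetical, not the paper's process)."""
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        # Insertion: with probability p_ins, add a random token before this one.
        if rng.random() < p_ins:
            out.append(rng.choice(vocab))
        # Deletion: with probability p_del, drop this token entirely.
        if rng.random() < p_del:
            continue
        # In-place corruption: with probability p_sub, replace this token.
        if rng.random() < p_sub:
            out.append(rng.choice(vocab))
        else:
            out.append(tok)
    return out

def corrupt(tokens, vocab, steps, **kwargs):
    """Apply corrupt_step repeatedly, as an iterative corruption would."""
    seq = list(tokens)
    for _ in range(steps):
        seq = corrupt_step(seq, vocab, **kwargs)
    return seq

if __name__ == "__main__":
    vocab = list("abcdefghijklmnopqrstuvwxyz ")
    rng = random.Random(0)
    noisy = corrupt(list("the quick brown fox"), vocab, steps=3, rng=rng)
    print("".join(noisy))
```

Setting `p_del=0` and `p_ins=0` recovers a purely in-place process in which the output always has the same length as the input. With nonzero insertion or deletion probabilities, positions shift between steps, which is the situation the denoising model has to learn to reverse.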

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Music and Audio Processing

Methods: Diffusion