ELT: Elastic Looped Transformers for Visual Generation

Sahil Goyal; Swayam Agrawal; Gautham Govind Anil; Prateek Jain; Sujoy Paul; Aditya Kusupati

arXiv:2604.09168 · cs.CV · April 14, 2026

TL;DR

ELT is a parameter-efficient recurrent transformer architecture that reuses weight-shared blocks across loops and is trained with Intra-Loop Self Distillation (ILSD). It enables high-quality image and video generation with a dynamic trade-off between computational cost and generation quality.

Contribution

The paper proposes Elastic Looped Transformers (ELT), a recurrent transformer model that combines weight sharing with Intra-Loop Self Distillation (ILSD) to achieve efficient visual synthesis and flexible inference from a single trained model.

Findings

01

Achieves a 4× parameter reduction at a comparable FID of 2.0 on ImageNet 256×256.

02

Introduces Intra-Loop Self Distillation (ILSD) to train shared-parameter models effectively.

03

Enables any-time inference with dynamic quality-computation trade-offs.
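
To make the first finding concrete, the arithmetic below sketches how weight sharing could yield a 4x parameter reduction at equal effective depth. The layer counts (24 unique layers versus 6 shared layers applied 4 times), the widths, and the helper names `block_params` and `stack_params` are hypothetical choices for illustration, not figures or code from the paper.

```python
def block_params(d_model: int, d_ff: int) -> int:
    """Rough weight count for one transformer layer (biases and norms ignored)."""
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    feed_forward = 2 * d_model * d_ff      # two linear maps of the MLP
    return attention + feed_forward


def stack_params(unique_layers: int, d_model: int, d_ff: int) -> int:
    """Parameters stored for a stack of `unique_layers` distinct layers."""
    return unique_layers * block_params(d_model, d_ff)


# Hypothetical sizes chosen for illustration only, not taken from the paper.
D_MODEL, D_FF = 1024, 4096
baseline = stack_params(24, D_MODEL, D_FF)   # 24 unique layers, one pass
looped = stack_params(6, D_MODEL, D_FF)      # 6 shared layers, applied 4 times

effective_depth = 6 * 4                      # layer applications per forward pass
reduction = baseline / looped
print(f"effective depth: {effective_depth}, parameter reduction: {reduction:.1f}x")
```

Under these assumptions the looped model stores a quarter of the weights while still performing 24 layer applications per forward pass, so the saving is in parameters rather than in compute at full depth.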

Abstract

We introduce Elastic Looped Transformers (ELT), a highly parameter-efficient class of visual generative models based on a recurrent transformer architecture. While conventional generative models rely on deep stacks of unique transformer layers, our approach employs iterative, weight-shared transformer blocks to drastically reduce parameter counts while maintaining high synthesis quality. To effectively train these models for image and video generation, we propose the idea of Intra-Loop Self Distillation (ILSD), where student configurations (intermediate loops) are distilled from the teacher configuration (maximum training loops) to ensure consistency across the model's depth in a single training step. Our framework yields a family of elastic models from a single training run, enabling Any-Time inference capability with dynamic trade-offs between computational cost and generation…
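
The looping and distillation ideas in the abstract can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the block is a residual tanh layer rather than a transformer block, the distillation term is assumed to be a mean squared error toward the final-loop output, and every name (`shared_block`, `looped_forward`, `ilsd_loss`, `distill_weight`) is hypothetical.

```python
import math
import random


def shared_block(x, w):
    """One pass through the weight-shared block (toy residual tanh layer)."""
    n = len(x)
    return [x[i] + math.tanh(sum(w[i][j] * x[j] for j in range(n))) for i in range(n)]


def looped_forward(x, w, max_loops):
    """Apply the same weights `max_loops` times, keeping every loop's output."""
    outputs = []
    for _ in range(max_loops):
        x = shared_block(x, w)
        outputs.append(x)
    return outputs


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def ilsd_loss(outputs, target, distill_weight=1.0):
    """Task loss on the teacher (final loop) plus distillation on earlier loops.

    Assumption: the teacher output acts as a fixed target for the students; in
    an autodiff framework this would correspond to a stop-gradient.
    """
    teacher = outputs[-1]
    task = mse(teacher, target)
    students = outputs[:-1]
    distill = sum(mse(s, teacher) for s in students) / max(len(students), 1)
    return task + distill_weight * distill


# Usage with made-up data: 4 training loops over a 4-dimensional input.
rng = random.Random(0)
dim = 4
w = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
target = [0.0] * dim

outputs = looped_forward(x, w, max_loops=4)
loss = ilsd_loss(outputs, target)   # one loss covers all loop counts at once
any_time = outputs[1]               # any-time inference: stop after 2 loops
```

Because every intermediate loop is pulled toward the final loop's output within the same training step, stopping early at inference time returns an approximation of the full-depth result, which is what makes a single trained model usable at several compute budgets.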
