MetaSR: Content-Adaptive Metadata Orchestration for Generative Super-Resolution

Jiaqi Guo; Mingzhen Li; Haohong Wang; Aggelos K. Katsaggelos

arXiv:2604.26244 · cs.CV · April 30, 2026

TL;DR

MetaSR is a novel framework that adaptively selects and injects relevant metadata into diffusion-based super-resolution models, significantly improving quality and efficiency across diverse content types under resource constraints.

Contribution

It introduces a content-adaptive metadata orchestration method, built on a Diffusion Transformer, that outperforms fixed-guidance approaches in diverse real-world scenarios.

Findings

01 MetaSR achieves up to 1.0 dB PSNR improvement over baselines.

02 It reduces transmission bitrate by up to 50% at the same quality levels.

03 Experiments demonstrate effectiveness across various content and degradation types.
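
To put the 1.0 dB figure in context, the sketch below computes PSNR from mean squared error and converts a PSNR gain into the implied MSE ratio. It uses the standard PSNR definition; the function names `psnr` and `mse_ratio_for_gain` and the 8-bit peak value of 255 are assumptions made for illustration and are not taken from the paper.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Standard PSNR in dB between two equal-length sequences of pixel values.

    `peak` is the maximum possible pixel value (255 assumes 8-bit images).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """MSE ratio (new / old) implied by a PSNR gain of `gain_db` at a fixed peak."""
    return 10 ** (-gain_db / 10)


# A 1.0 dB PSNR gain corresponds to roughly 20.6% lower mean squared error.
print(f"MSE ratio for +1.0 dB: {mse_ratio_for_gain(1.0):.3f}")
```

Because PSNR is logarithmic in MSE, the same dB gain implies the same relative MSE reduction regardless of the starting quality.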

Abstract

We study generative super-resolution (SR) in real-world scenarios where content and degradations vary across domains, genres, and segments. For example, images and videos may alternate between text overlays, fast motion, smooth cartoons, and low-light faces, each benefiting from different forms of side information. Existing metadata-guided SR methods typically use a fixed conditioning design, which is suboptimal when useful cues are content dependent and transmission budgets are limited. We propose MetaSR, a Diffusion Transformer (DiT)-based framework that selects and injects task-relevant metadata to guide SR under resource constraints. Specifically, we use the DiT's own VAE and transformer backbone to fuse heterogeneous metadata, and adopt an efficient distillation strategy that enables one-step diffusion inference. Experiments across diverse content buckets and degradation regimes…
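
As a rough illustration of what selecting metadata under a transmission budget can look like, here is a minimal sketch of a greedy, budget-constrained selector. It is not the authors' method: the abstract does not describe the selection rule, so the greedy gain-per-bit heuristic, the `MetadataCandidate` type, the `select_metadata` function, and all candidate names, costs, and gain estimates below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataCandidate:
    """Hypothetical description of one piece of side information."""
    name: str
    cost_bits: int       # assumed transmission cost, must be positive
    est_gain_db: float   # assumed quality gain estimate for the current content


def select_metadata(candidates, budget_bits):
    """Greedily pick candidates by estimated gain per bit until the budget is used.

    This is a generic knapsack-style heuristic, not the selection rule from the paper.
    """
    ranked = sorted(candidates, key=lambda c: c.est_gain_db / c.cost_bits, reverse=True)
    chosen, remaining = [], budget_bits
    for cand in ranked:
        if cand.est_gain_db > 0 and cand.cost_bits <= remaining:
            chosen.append(cand.name)
            remaining -= cand.cost_bits
    return chosen


# Made-up numbers for a segment dominated by text overlays and motion.
segment = [
    MetadataCandidate("text_mask", cost_bits=400, est_gain_db=0.6),
    MetadataCandidate("motion_vectors", cost_bits=1200, est_gain_db=0.9),
    MetadataCandidate("face_landmarks", cost_bits=300, est_gain_db=0.1),
]
print(select_metadata(segment, budget_bits=1600))
```

The point of the sketch is only that the chosen set changes with both the content (through the gain estimates) and the budget; a tighter budget of 500 bits would keep just the cheapest high-value cue in this toy example.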
