Coarse-Guided Visual Generation via Weighted h-Transform Sampling

Yanghao Wang; Ziqi Jiang; Zhen Wang; Long Chen

arXiv:2603.12057·cs.CV·March 31, 2026

TL;DR

This paper introduces a training-free guided sampling method based on the h-transform. It uses pretrained diffusion models to synthesize fine visual samples from degraded or low-fidelity coarse references.

Contribution

The paper proposes a training-free, h-transform-based guidance method that aims to balance adherence to the coarse reference against synthesis quality, two goals that existing training-free methods struggle to reconcile.

Findings

01 Effective guidance in diverse image and video generation tasks.

02 Outperforms existing training-free methods in quality and control.

03 Generalizes well across different types of visual data.

Abstract

Coarse-guided visual generation, which synthesizes fine visual samples from degraded or low-fidelity coarse references, is essential for various real-world applications. While training-based approaches are effective, they are inherently limited by high training costs and by restricted generalization stemming from their reliance on paired data collection. Accordingly, recent training-free works propose to leverage pretrained diffusion models and incorporate guidance during the sampling process. However, these training-free methods either require knowledge of the forward (fine-to-coarse) transformation operator, e.g., bicubic downsampling, or struggle to balance guidance against synthesis quality. To address these challenges, we propose a novel guidance method based on the h-transform, a tool that can constrain stochastic processes (e.g., the sampling process) under desired conditions. Specifically, we modify the…

Code & Models

Repositories

hkust-longgroup/Coarse-guided-Gen (GitHub)
