Zero-Reference Joint Low-Light Enhancement and Deblurring via Visual Autoregressive Modeling with VLM-Derived Modulation
Wei Dong, Han Zhou, Junwei Lin, Jun Chen

TL;DR
This paper introduces an unsupervised generative framework that combines visual autoregressive modeling with vision-language priors to jointly enhance and deblur low-light images without paired data, targeting the complex noise and blur found in real-world dark images.
Contribution
It proposes a novel VAR-based model guided by VLM-derived priors, incorporating adaptive illumination modulation, spatial-frequency-aware encodings, and recursive phase refinement for robust low-light image restoration.
Findings
Achieves state-of-the-art results on benchmark datasets.
Effectively models dynamic illumination and blur without paired supervision.
Reduces artifacts through recursive phase-domain refinement.
Abstract
Real-world dark images commonly exhibit not only low visibility and contrast but also complex noise and blur, posing significant restoration challenges. Existing methods often rely on paired data or fail to model dynamic illumination and blur characteristics, leading to poor generalization. To address these limitations, we propose a generative framework based on visual autoregressive (VAR) modeling, guided by perceptual priors from a vision-language model (VLM). Specifically, to supply informative conditioning cues for VAR models, we deploy an adaptive curve estimation scheme that modulates diverse illumination conditions based on VLM-derived visibility scores. In addition, we integrate dynamic and spatial-frequency-aware Rotary Positional Encodings (SF-RoPE) into VAR to enhance its ability to model structures degraded by blur. Furthermore, we propose a recursive phase-domain modulation strategy that…
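The abstract does not give the form of the adaptive curve, so the following is only a minimal sketch of the general idea of score-driven illumination adjustment, not the authors' method. It assumes a quadratic brightening curve of the kind used in earlier zero-reference curve-estimation work, x + alpha * x * (1 - x), and a hypothetical linear mapping from a visibility score in [0, 1] to the curve strength alpha. The function names, the mapping, and the iteration count are all illustrative assumptions.

```python
import numpy as np


def alpha_from_visibility(score, max_alpha=1.0):
    """Hypothetical mapping (an assumption, not from the paper):
    a lower visibility score yields a stronger brightening curve."""
    score = float(np.clip(score, 0.0, 1.0))
    return max_alpha * (1.0 - score)


def apply_curve(image, alpha, iterations=8):
    """Iteratively apply the quadratic curve x + alpha * x * (1 - x).

    `image` is expected to hold intensities in [0, 1]. For alpha in [0, 1]
    the curve is monotonic and keeps values inside [0, 1].
    """
    out = np.clip(np.asarray(image, dtype=np.float64), 0.0, 1.0)
    for _ in range(iterations):
        out = out + alpha * out * (1.0 - out)
    return out


# Usage: a uniformly dark image with a low (assumed) visibility score.
dark = np.full((4, 4, 3), 0.1)
bright = apply_curve(dark, alpha_from_visibility(0.1))
unchanged = apply_curve(dark, alpha_from_visibility(1.0))
```

In this sketch a score of 1.0 gives alpha = 0 and leaves the image untouched, while lower scores push dark pixels up more aggressively. A learned scheme would presumably predict per-pixel or per-region curve parameters rather than a single global value.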
Taxonomy
Topics: Image Enhancement Techniques · Advanced Image Processing Techniques · Image and Video Quality Assessment
