Single-Shot HDR Recovery via a Video Diffusion Prior
Chinmay Talegaonkar, Jinshi He, Christopher McKenna, Nicholas Antipa

TL;DR
This paper introduces a novel, interpretable method for single-shot HDR image reconstruction by finetuning a video diffusion model to generate exposure brackets conditioned on an LDR input, then fusing them into a high-fidelity HDR image.
Contribution
The paper reformulates single-shot HDR reconstruction as conditional video generation followed by fusion, which removes the need for separate highlight and shadow models and improves both fidelity to the input and interpretability.
Findings
The method outperforms state-of-the-art generative methods on benchmarks.
It is preferred more often by human raters in pairwise comparisons.
The framework extends to other image reconstruction tasks.
Abstract
Recent generative methods for single-shot high dynamic range (HDR) image reconstruction show promising results, but often struggle with preserving fidelity to the input image. They require separate models to handle highlights and shadows, or sacrifice interpretability by directly predicting the final HDR image. We address these limitations by re-casting single-shot HDR reconstruction as conditional video generation and fusing the generated frames into an HDR image. We finetune a video diffusion model to generate an exposure bracket, conditioned on a low dynamic range (LDR) input. We fuse this image bracket using per-pixel weights predicted by a light-weight UNet. This formulation is simple, interpretable, and effective. Rather than directly hallucinating an HDR image, it explicitly reconstructs the intermediate exposure stack and fuses it into the final output. Our method eliminates the…
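The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a gamma-2.2 camera response, known relative exposure times, and per-pixel scores normalised with a softmax across frames. The names `fuse_bracket` and `hat_logits` are hypothetical, and `hat_logits` is a hand-crafted stand-in for the weights that the paper predicts with a lightweight UNet.

```python
import numpy as np


def fuse_bracket(frames, exposures, logits, gamma=2.2):
    """Fuse an exposure bracket into one linear HDR image.

    frames:    (N, H, W, 3) LDR frames with values in [0, 1]
    exposures: (N,) relative exposure times (assumed known)
    logits:    (N, H, W) unnormalised per-pixel scores; in the paper these
               would come from a UNet, here they are supplied by the caller
    gamma:     assumed display gamma (an illustration, not from the paper)
    """
    frames = np.asarray(frames, dtype=np.float64)
    exposures = np.asarray(exposures, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)

    # Undo the assumed gamma, then divide by exposure time so that every
    # frame becomes an estimate of the same scene radiance.
    radiance = frames ** gamma / exposures[:, None, None, None]

    # Softmax over the frame axis gives per-pixel weights that sum to 1.
    shifted = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)

    # Per-pixel weighted average of the radiance estimates.
    return (weights[..., None] * radiance).sum(axis=0)


def hat_logits(frames, eps=1e-6):
    """Hand-crafted stand-in for learned weights.

    Scores peak at mid-grey and fall to log(eps) at pure black or white,
    so clipped pixels contribute almost nothing to the fused result.
    """
    luma = np.asarray(frames, dtype=np.float64).mean(axis=-1)
    hat = 1.0 - np.abs(2.0 * luma - 1.0)
    return np.log(np.clip(hat, eps, None))
```

Calling `fuse_bracket(frames, exposures, hat_logits(frames))` on a stack of frames returns an `(H, W, 3)` array of linear radiance values that can exceed 1.0. Because each output pixel is an explicit weighted average of the bracket, the weight maps can be inspected to see which exposure contributed where, which is the sense in which a bracket-then-fuse formulation is interpretable.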
