TL;DR
RDDM introduces a RAW domain diffusion model that directly restores photo-realistic images from sensor RAW data, overcoming limitations of sRGB-based methods by improving fidelity and reducing artifacts in real-world image restoration.
Contribution
The paper proposes a novel RAW domain diffusion model with a specialized VAE and multi-bayer adaptation, enabling direct RAW image restoration and addressing domain distribution challenges.
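For readers unfamiliar with why sensor formats need adapting, the sketch below shows what distinguishes the four common 2x2 Bayer layouts. It is a generic illustration, not RDDM's method: the summary says only that the model uses multi-Bayer adaptation, and the reviews mention a LoRA-based variant. The names `BAYER_OFFSETS`, `CHANNEL_ORDER`, and `pack_bayer` are hypothetical, and packing a mosaic into four half-resolution planes is a common RAW preprocessing step assumed here for illustration only.

```python
# Illustrative only: a generic Bayer-packing helper, not code from the paper.
# (row, col) position of each colour site inside the repeating 2x2 tile.
BAYER_OFFSETS = {
    "RGGB": {"R": (0, 0), "G1": (0, 1), "G2": (1, 0), "B": (1, 1)},
    "BGGR": {"B": (0, 0), "G1": (0, 1), "G2": (1, 0), "R": (1, 1)},
    "GRBG": {"G1": (0, 0), "R": (0, 1), "B": (1, 0), "G2": (1, 1)},
    "GBRG": {"G1": (0, 0), "B": (0, 1), "R": (1, 0), "G2": (1, 1)},
}

# Canonical output order, chosen here so every pattern yields the same layout.
CHANNEL_ORDER = ("R", "G1", "G2", "B")


def pack_bayer(mosaic, pattern):
    """Split a single-channel Bayer mosaic (list of rows) into four
    half-resolution planes ordered as CHANNEL_ORDER."""
    height = len(mosaic)
    width = len(mosaic[0]) if height else 0
    if height == 0 or width == 0 or height % 2 or width % 2:
        raise ValueError("mosaic must be non-empty with even height and width")
    if any(len(row) != width for row in mosaic):
        raise ValueError("all mosaic rows must have the same length")
    if pattern not in BAYER_OFFSETS:
        raise ValueError(f"unknown Bayer pattern: {pattern!r}")

    offsets = BAYER_OFFSETS[pattern]
    planes = []
    for name in CHANNEL_ORDER:
        r, c = offsets[name]
        # Take every second row starting at r, every second column starting at c.
        planes.append([row[c::2] for row in mosaic[r::2]])
    return planes


mosaic = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
packed = pack_bayer(mosaic, "RGGB")
print(packed[0])  # R plane: [[0, 2], [8, 10]]
```

The same mosaic read as `"BGGR"` puts different pixels in the R plane. That is the kind of format mismatch any multi-sensor RAW model has to handle in some way.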
Findings
RDDM outperforms state-of-the-art sRGB diffusion methods in fidelity.
The model reduces artifacts compared to existing approaches.
Extensive experiments validate the effectiveness of the proposed method.
Abstract
We present the RAW domain diffusion model (RDDM), an end-to-end diffusion model that restores photo-realistic images directly from the sensor RAW data. While recent sRGB-domain diffusion methods achieve impressive results, they are caught in a dilemma between high fidelity and image generation. These models process lossy sRGB inputs and neglect the accessibility of the sensor RAW images in many scenarios, e.g., in image and video capturing in edge devices, resulting in sub-optimal performance. RDDM obviates this limitation by directly restoring images in the RAW domain, replacing the conventional two-stage image signal processing (ISP)->Image Restoration (IR) pipeline. However, a simple adaptation of pre-trained diffusion models to the RAW domain confronts many challenges. To this end, we propose: (1) a RAW-domain VAE (RVAE), encoding sensor RAW and decoding it into an enhanced linear…
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
+ The combination of RAW-domain VAE, LoRA-based Bayer adaptation, and dual-domain supervision reflects careful engineering to handle distribution gaps and data diversity in sensor formats.
+ The authors provide a broad set of quantitative, qualitative, and user studies, covering both synthetic and in-the-wild datasets, and analyze component effectiveness through ablations on losses, VAE variants, and prompt extractors.
- Overly complicated and fragmented pipeline. The proposed system consists of multiple coupled modules. The overall architecture is highly procedural and difficult to apply or reason about conceptually. Much of the technical novelty lies in stitching existing components together rather than introducing a coherent or elegant new formulation.
- Limited conceptual novelty. The work mainly transfers existing ideas (latent diffusion, LoRA, ISP simulation, prompt-based conditioning) into the RAW doma
1. Sufficient presentation of objective metrics and subjective visualization results in the experiments:
2. Clear description of the methodology and contributions:
1. As shown in Table 1, many results in the comparison with SOTA methods are suboptimal. Does this imply that the current solution still has limitations, and that more relevant analysis and experiments should be provided?
2. It is important to note that a VAE's optimal reconstruction results are not indicative of superior generation capability, given the frequent trade-off between these two objectives. Therefore, in addition to the reconstruction performance presented in Table 3, an evaluation o
1. The proposed idea of using latent diffusion models for RAW domain restoration is interesting.
2. The qualitative results have more detail and higher fidelity than existing methods.
1. Hard to understand: The flow and explanation of the methodology section is very difficult to follow (see questions).
2. Unclear motivation: From my understanding, the paper proposes a method that restores a LQ RAW image ($X_L^\text{RAW}$) to its HQ counterpart in RGB space ($X_H^\text{rgb}$). Existing methods directly perform restoration on the LQ image in RGB space. The methodology in the paper depends on converting the LQ image from RGB space to the RAW space and then performing restoration
1. The paper is highly original in proposing the first RAW-domain diffusion model (RDDM) for real-world image restoration. Unlike prior works limited to the sRGB domain, it innovatively leverages unprocessed sensor RAW data, capturing richer signal information and eliminating dependence on handcrafted ISP pipelines. The integration of diffusion priors into the RAW domain represents a clear conceptual advance.
2. The technical design is solid and well-justified. The introduction of the RAW Variat
1. While RDDM shows strong results on several benchmarks, the experiments mainly focus on synthetic or semi-synthetic datasets derived from sRGB sources. The paper would be stronger with more evaluation on diverse real RAW datasets captured by different sensors and under varied lighting conditions to verify cross-camera robustness.
2. The proposed RAW data synthesis pipeline is creative but may not fully capture real-world sensor noise characteristics or color responses. The paper lacks quantita
