Seeing Through the PRISM: Compound & Controllable Restoration of Scientific Images
Rupa Kurinchi-Vendhan, Pratyusha Sharma, Antonio Torralba, Sara Beery

TL;DR
PRISM is a novel diffusion-based framework that enables precise, controllable, and high-fidelity restoration of complex, compound degradations in scientific images, improving accuracy and interpretability across various scientific domains.
Contribution
It introduces a prompted conditional diffusion model with a contrastive disentanglement objective for simultaneous, selective removal of multiple image degradations.
Findings
Outperforms state-of-the-art methods on complex degradations.
Enables zero-shot removal of unseen degradation mixtures.
Improves downstream scientific analysis accuracy.
Abstract
Scientific and environmental imagery often suffers from complex mixtures of noise related to the sensor and the environment. Existing restoration methods typically remove one degradation at a time, leading to cascading artifacts, overcorrection, or loss of meaningful signal. In scientific applications, restoration must be able to simultaneously handle compound degradations while allowing experts to selectively remove subsets of distortions without erasing important features. To address these challenges, we present PRISM (Precision Restoration with Interpretable Separation of Mixtures). PRISM is a prompted conditional diffusion framework that combines compound-aware supervision over mixed degradations with a weighted contrastive disentanglement objective that aligns primitives and their mixtures in the latent space. This compositional geometry enables high-fidelity joint removal of…
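The abstract describes the weighted contrastive objective only at a high level, and one review on this page notes that the weighting is based on Jaccard distance. The following is a minimal sketch of that general idea, not the paper's formulation: it assumes each image is tagged with a set of degradation labels, and it uses a soft-target InfoNCE-style loss in which pairs with more overlapping degradation sets are pulled together more strongly. The function names, the temperature value, and the exact loss form are all hypothetical.

```python
import math


def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b|; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def weighted_contrastive_loss(embeddings, label_sets, temperature=0.1):
    """Hypothetical Jaccard-weighted contrastive loss (an assumption,
    not the published objective).

    For each anchor, the target distribution over the other samples is
    proportional to the Jaccard similarity of their degradation sets;
    the loss is the cross-entropy between that target and a softmax
    over temperature-scaled cosine similarities.
    """
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        weights = [1.0 - jaccard_distance(label_sets[i], label_sets[j])
                   for j in others]
        weight_sum = sum(weights)
        if weight_sum == 0.0:
            continue  # anchor shares no degradation with any other sample
        logits = [cosine(embeddings[i], embeddings[j]) / temperature
                  for j in others]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += -sum((w / weight_sum) * (l - log_z)
                      for w, l in zip(weights, logits))
        counted += 1
    return total / counted if counted else 0.0


# Toy usage: a primitive ("haze"), a mixture containing it, and an
# unrelated primitive ("rain"). Embeddings are made-up 2-D vectors.
labels = [{"haze"}, {"haze", "noise"}, {"rain"}]
aligned = [[1.0, 0.0], [1.0, 0.2], [0.0, 1.0]]
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
loss_aligned = weighted_contrastive_loss(aligned, labels)
loss_misaligned = weighted_contrastive_loss(misaligned, labels)
```

In this toy setup the loss is lower when the mixture's embedding sits near the primitive it contains than when it sits near the unrelated primitive, which is the kind of compositional geometry the abstract refers to.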
Peer Reviews
Decision: ICLR 2026 Poster
1) Clear objective and method design. The paper argues for simultaneous rather than sequential restoration, emphasizes expert control, and focuses on scientific fidelity rather than aesthetics. The architecture coherently combines contrastive disentanglement, prompt-conditioned latent diffusion, and SCPM for detail recovery. 2) Good reported performance and breadth. On MDB, PRISM outperforms representative all-in-one and diffusion/composite baselines (e.g., AirNet, Restormer, NAFNet, PromptIR,
1) Control granularity and evaluation scope: The evaluation largely uses manual prompting with a pre-defined set of distortion types, not open-ended language or fine-grained controls. The paper itself notes that extending controllability beyond “which distortions to remove” to specifying intensity and spatial extent is left for future work. This leaves unanswered how robust the system is to realistic prompt variations or local/severity-aware edits. 2) Synthetic-to-real gap and capped composit
* Image restoration is a critical task, particularly for scientific applications. This paper demonstrates the method's effectiveness through general-purpose image restoration, evaluated using fidelity and perceptual metrics, and through its application to downstream scientific tasks. * The motivation is written clearly, and the figures (although Figures 1 and 2 are not referenced) support the understanding of the general approach. * Although the number of consecutive distortions in the training set is limited
* While the fine-tuning of the CLIP image encoder is explained thoroughly, the subsequent steps, namely how SD 1.5 is used as the backbone and the proposed SCPM module, are explained only briefly. This impairs understanding of the entire framework, and although the code is submitted, the text alone is not sufficient to reproduce it. * The concept of automatic restoration needs clarification. While the paragraph on prompting (line 207) describes the automatic transformation of natural prompts to fi
The work compellingly argues for the necessity of controllable, selective restoration over automated 'full' restoration for scientific applications, demonstrating significant gains in downstream task utility.
The primary methodological concern is the limited novelty. The core idea of fine-tuning a CLIP encoder to be degradation-aware heavily relies on prior work (e.g., DA-CLIP). The main novelty appears to rest on the Jaccard distance weighting in the contrastive loss, but the paper lacks a direct ablation comparing this to an unweighted compound contrastive loss, making it difficult to isolate its true impact. Second, the two-stage training pipeline is computationally complex, and the choice of a
Taxonomy
Topics: Image Enhancement Techniques · Cell Image Analysis Techniques · Advanced Image Processing Techniques
