NanoSD: Edge Efficient Foundation Model for Real Time Image Restoration

Subhajit Sanyal; Srinivas Soumitri Miriyala; Akshay Janardan Bankar; Manjunath Arveti; Sowmya Vajrala; Shreyas Pandith; Sravanth Kodavanti; Abhishek Ameta; Harshit; Amit Satish Unde

arXiv:2601.09823·cs.CV·January 19, 2026

Open Access

TL;DR

NanoSD is a family of diffusion models distilled from Stable Diffusion 1.5 for real-time image restoration on edge devices. It balances accuracy, latency, and size through network surgery, feature-wise generative distillation, and structured architectural scaling.

Contribution

NanoSD co-designs the full pipeline, applying distillation and architectural scaling jointly to the U-Net and the VAE encoder-decoder, to produce lightweight diffusion models that preserve the generative prior and run in real time on edge hardware.

Findings

01 NanoSD achieves real-time inference (20 ms) on mobile NPUs.

02 NanoSD outperforms prior lightweight diffusion models in quality and efficiency.

03 NanoSD supports multiple image restoration tasks with state-of-the-art results.

Abstract

Latent diffusion models such as Stable Diffusion 1.5 offer strong generative priors that are highly valuable for image restoration, yet their full pipelines remain too computationally heavy for deployment on edge devices. Existing lightweight variants predominantly compress the denoising U-Net or reduce the diffusion trajectory, which disrupts the underlying latent manifold and limits generalization beyond a single task. We introduce NanoSD, a family of Pareto-optimal diffusion foundation models distilled from Stable Diffusion 1.5 through network surgery, feature-wise generative distillation, and structured architectural scaling jointly applied to the U-Net and the VAE encoder-decoder. This full-pipeline co-design preserves the generative prior while producing models that occupy distinct operating points along the accuracy-latency-size frontier (e.g., 130M-315M parameters, achieving…
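
The abstract names feature-wise generative distillation but this page does not give its formulation. As a rough illustration only, the sketch below shows a generic feature-matching loss: a weighted sum of per-layer mean squared errors between teacher and student feature maps. The function names, the flattened-list representation, the per-layer weights, and the toy values are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Optional, Sequence

# A feature map flattened to a plain sequence of floats (assumption for
# illustration; a real pipeline would use tensors from U-Net or VAE layers).
Feature = Sequence[float]


def mse(a: Feature, b: Feature) -> float:
    """Mean squared error between two equally sized feature maps."""
    if len(a) != len(b):
        raise ValueError("feature maps must have the same number of elements")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def feature_distillation_loss(
    teacher_feats: List[Feature],
    student_feats: List[Feature],
    weights: Optional[List[float]] = None,
) -> float:
    """Weighted sum of per-layer MSEs between teacher and student features.

    Hypothetical helper: a generic feature-matching objective, not the
    loss used by NanoSD.
    """
    if len(teacher_feats) != len(student_feats):
        raise ValueError("teacher and student must expose the same number of layers")
    if weights is None:
        weights = [1.0] * len(teacher_feats)
    if len(weights) != len(teacher_feats):
        raise ValueError("one weight per layer is required")
    return sum(w * mse(t, s) for w, t, s in zip(weights, teacher_feats, student_feats))


# Toy values, invented for the example (two layers, two elements each).
teacher = [[1.0, 2.0], [0.0, 4.0]]
student = [[1.0, 0.0], [0.0, 2.0]]

uniform_loss = feature_distillation_loss(teacher, student)
weighted_loss = feature_distillation_loss(teacher, student, weights=[0.5, 1.0])
```

In a setup like this, the teacher features would come from the frozen larger model and the student features from the compressed one, with the loss driving the student to reproduce the teacher's intermediate representations rather than only its final output.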

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Face recognition and analysis