A Real Time Super Resolution Accelerator with Tilted Layer Fusion
An-Jung Huang, Kai-Chieh Hsu, Tian-Sheuan Chang

TL;DR
This paper presents a real-time super-resolution hardware accelerator that cuts external DRAM bandwidth by 92% and needs only 102KB of on-chip memory, making high-resolution processing practical on mobile devices.
Contribution
It introduces a tilted layer fusion method and a hardware design that reaches 1920x1080@60fps with 102KB of on-chip memory, at higher throughput and lower area cost than previous designs.
Findings
92% reduction in external DRAM bandwidth
Achieves 1920x1080@60fps processing
Uses only 102KB on-chip memory
Abstract
Deep learning based super-resolution achieves high-quality results, but its heavy computational workload, large buffers, and high external memory bandwidth hinder its use on mobile devices. To address these issues, this paper proposes a real-time hardware accelerator with a tilted layer fusion method that reduces external DRAM bandwidth by 92% and needs only 102KB of on-chip memory. The design, implemented in a 40nm CMOS process, achieves 1920x1080@60fps throughput with a 544.3K gate count when running at 600MHz; it has higher throughput and lower area cost than previous designs.
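To put the throughput figure in perspective, the following is a rough back-of-the-envelope sketch of the per-pixel cycle budget implied by the numbers in the abstract. It assumes that 1920x1080 is the frame size delivered at 60fps and that the 600MHz clock is fully available for processing; the function name `pixel_budget` is a hypothetical helper, not something taken from the paper.

```python
def pixel_budget(width, height, fps, clock_hz):
    """Return (pixels per second, clock cycles available per pixel)."""
    pixels_per_second = width * height * fps
    cycles_per_pixel = clock_hz / pixels_per_second
    return pixels_per_second, cycles_per_pixel


# Figures quoted in the abstract: 1920x1080 at 60fps, 600MHz clock.
pixels_per_second, cycles_per_pixel = pixel_budget(1920, 1080, 60, 600e6)

print(f"{pixels_per_second:,} pixels/s")           # 124,416,000 pixels/s
print(f"{cycles_per_pixel:.2f} cycles per pixel")  # 4.82 cycles per pixel
```

Under these assumptions the accelerator has fewer than five clock cycles per pixel, which suggests why keeping intermediate data on chip rather than in external DRAM matters for real-time operation.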
Taxonomy
Topics: Advanced Image Processing Techniques · Integrated Circuits and Semiconductor Failure Analysis · Photonic and Optical Devices
