MoCHA-former: Moiré-Conditioned Hybrid Adaptive Transformer for Video Demoiréing
Jeahun Sung, Changhyun Roh, Chanho Eom, Jihyong Oh

TL;DR
This paper introduces MoCHA-former, a transformer-based model for video demoiréing that targets spatially varying, channel-dependent, and temporally fluctuating moiré artifacts, and outperforms prior methods on PSNR, SSIM, and LPIPS.
Contribution
The paper proposes MoCHA-former, which combines Decoupled Moiré Adaptive Demoiréing (DMAD), separating moiré from content, with Spatio-Temporal Adaptive Demoiréing (STAD) to handle complex moiré patterns and preserve temporal consistency.
Findings
Outperforms prior methods in PSNR, SSIM, and LPIPS metrics.
Effectively handles spatially varying and large-scale moiré artifacts.
Maintains temporal consistency without explicit frame alignment.
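For readers unfamiliar with the first metric listed above, here is a minimal sketch of the standard PSNR definition, 10 * log10(peak^2 / MSE). It is not the paper's evaluation code: the function name `psnr`, the flat-sequence frame representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math

def psnr(reference, restored, peak=255.0):
    """PSNR in dB between two frames given as flat sequences of pixel values.

    Illustrative helper (hypothetical name); assumes 8-bit pixels by default.
    """
    if len(reference) == 0 or len(reference) != len(restored):
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)

# Example: every pixel is off by 10 grey levels, so MSE = 100.
clean = [100, 100, 100, 100]
demoired = [110, 90, 110, 90]
print(round(psnr(clean, demoired), 2))
```

Higher PSNR means the restored frame is closer to the clean reference. SSIM and LPIPS measure structural and perceptual similarity respectively and are not covered by this sketch.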
Abstract
Recent advances in portable imaging have made camera-based screen capture ubiquitous. Unfortunately, frequency aliasing between the camera's color filter array (CFA) and the display's sub-pixels induces moiré patterns that severely degrade captured photos and videos. Although various demoiréing models have been proposed to remove such moiré patterns, these approaches still suffer from several limitations: (i) spatially varying artifact strength within a frame, (ii) large-scale and globally spreading structures, (iii) channel-dependent statistics, and (iv) rapid temporal fluctuations across frames. We address these issues with the Moiré Conditioned Hybrid Adaptive Transformer (MoCHA-former), which comprises two key components: Decoupled Moiré Adaptive Demoiréing (DMAD) and Spatio-Temporal Adaptive Demoiréing (STAD). DMAD separates moiré and content via a Moiré Decoupling…
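The frequency aliasing mentioned at the start of the abstract can be illustrated with a one-dimensional simplification: when a periodic pattern is sampled at a rate close to its own frequency, it reappears as a much coarser pattern. The sketch below is only that textbook folding rule, not the paper's method; real moiré arises from two-dimensional CFA and sub-pixel layouts, and the function name `alias_frequency` and the example numbers are hypothetical.

```python
def alias_frequency(signal_freq, sampling_freq):
    """Apparent frequency of a sinusoid after uniform sampling.

    Illustrative 1-D model (hypothetical helper): the sampled signal is
    indistinguishable from one at the distance to the nearest multiple
    of the sampling frequency.
    """
    if sampling_freq <= 0:
        raise ValueError("sampling_freq must be positive")
    nearest_multiple = round(signal_freq / sampling_freq) * sampling_freq
    return abs(signal_freq - nearest_multiple)

# A fine grid of 95 cycles per unit length, sampled 100 times per unit
# length, shows up as a coarse 5-cycle pattern.
print(alias_frequency(95, 100))
# A pattern well below half the sampling rate is captured unchanged.
print(alias_frequency(30, 100))
```

Small changes in viewing distance or angle shift the effective frequencies, which is consistent with the abstract's point that artifact strength varies within a frame and fluctuates across frames.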
