Just Noticeable Visual Redundancy Forecasting: A Deep Multimodal-driven Approach
Wuyuan Xie, Shukang Wang, Sukun Tian, Lirong Huang, Ye Liu, Miaohui Wang

TL;DR
This paper introduces hmJND-Net, a deep multimodal approach for modeling just noticeable difference in visual perception by integrating saliency, depth, and segmentation information, outperforming existing single-modality methods.
Contribution
The paper presents a novel end-to-end multimodal framework that effectively fuses and aligns multiple visual modalities for improved JND prediction.
Findings
Outperforms eight benchmark methods in experiments
Effective multimodal fusion via summation and subtraction
Validated on eight datasets
Abstract
Just noticeable difference (JND) refers to the maximum visual change that human eyes cannot perceive, and it has a wide range of applications in multimedia systems. However, most existing JND approaches only focus on a single modality, and rarely consider the complementary effects of multimodal information. In this article, we investigate the JND modeling from an end-to-end homologous multimodal perspective, namely hmJND-Net. Specifically, we explore three important visually sensitive modalities, including saliency, depth, and segmentation. To better utilize homologous multimodal information, we establish an effective fusion method via summation enhancement and subtractive offset, and align homologous multimodal features based on a self-attention driven encoder-decoder paradigm. Extensive experimental results on eight different benchmark datasets validate the superiority of our…
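To make the JND definition above concrete, here is a minimal sketch of JND-guided noise injection, a common way to sanity-check a JND map: each pixel is shifted by plus or minus its threshold, and a good map keeps the result visually indistinguishable from the original even at a modest PSNR. The function names, the toy image, and the threshold values are hypothetical and chosen only for illustration; they are not outputs of hmJND-Net or taken from the paper.

```python
import math
import random


def inject_jnd_noise(image, jnd_map, seed=0):
    """Shift every pixel by +/- its JND threshold, with a random sign per pixel."""
    rng = random.Random(seed)
    noisy = []
    for img_row, jnd_row in zip(image, jnd_map):
        row = []
        for pixel, threshold in zip(img_row, jnd_row):
            sign = rng.choice((-1, 1))
            value = pixel + sign * threshold
            # Clamp to the 8-bit intensity range.
            row.append(min(255.0, max(0.0, value)))
        noisy.append(row)
    return noisy


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    errors = [
        (r - d) ** 2
        for ref_row, dist_row in zip(reference, distorted)
        for r, d in zip(ref_row, dist_row)
    ]
    mse = sum(errors) / len(errors)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x3 grayscale image and per-pixel JND thresholds (assumed values).
image = [[100.0, 120.0, 140.0], [160.0, 180.0, 200.0]]
jnd_map = [[2.0, 3.0, 4.0], [5.0, 3.0, 2.0]]

noisy = inject_jnd_noise(image, jnd_map)
quality = psnr(image, noisy)
```

In this kind of check, a larger tolerated distortion (lower PSNR) at the same perceived quality suggests the map is exposing more visual redundancy.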
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image Fusion Techniques · Image and Video Quality Assessment
Methods: ALIGN
