MaskAnyNet: Rethinking Masked Image Regions as Valuable Information in Supervised Learning
Jingshan Hong, Haigen Hu, Huihuang Zhang, Qianwei Zhou, Zhao Li

TL;DR
MaskAnyNet introduces a novel approach that treats masked image regions as valuable sources of semantic information, enhancing supervised learning by leveraging both visible and masked content for better feature representation.
Contribution
The paper proposes MaskAnyNet, a method that redefines masked regions as auxiliary knowledge, improving feature richness and fine-grained detail preservation in supervised learning.
Findings
Consistent performance improvements across CNN and Transformer models.
Enhanced semantic diversity through masked content reuse.
Better preservation of fine-grained details in images.
Abstract
In supervised learning, traditional image masking faces two key issues: (i) discarded pixels are underutilized, leading to a loss of valuable contextual information; (ii) masking may remove small or critical features, especially in fine-grained tasks. In contrast, masked image modeling (MIM) has demonstrated that masked regions can be reconstructed from partial input, revealing that even incomplete data can exhibit strong contextual consistency with the original image. This highlights the potential of masked regions as sources of semantic diversity. Motivated by this, we revisit the image masking approach and propose treating masked content as auxiliary knowledge rather than ignoring it. Based on this, we propose MaskAnyNet, which combines masking with a relearning mechanism to exploit both visible and masked information. It can be easily extended to any model with an additional branch to…
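The abstract does not specify how MaskAnyNet builds its masks or its relearning branch, so the following is only a minimal sketch of the underlying observation: a mask partitions an image into two complementary parts, and the "discarded" part still carries pixel content that could be reused. The helper name `split_by_patch_mask`, the patch size, the mask ratio, the random patch selection, and the zero-filling are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def split_by_patch_mask(image, patch=4, mask_ratio=0.5, seed=0):
    """Split an H x W x C image into visible and masked parts.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    h, w = image.shape[:2]
    if h % patch or w % patch:
        raise ValueError("image height and width must be multiples of patch")

    grid_h, grid_w = h // patch, w // patch
    n_patches = grid_h * grid_w
    n_masked = int(round(mask_ratio * n_patches))

    # Pick which patches to mask (assumption: uniform random choice).
    rng = np.random.default_rng(seed)
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_masked, replace=False)] = True

    # Expand the patch-level mask to pixel resolution.
    pixel_mask = flat.reshape(grid_h, grid_w)
    pixel_mask = pixel_mask.repeat(patch, axis=0).repeat(patch, axis=1)

    # Zero-fill the complementary region in each output (assumption).
    expanded = pixel_mask[..., None]  # broadcast over channels
    visible = np.where(expanded, 0, image)
    masked = np.where(expanded, image, 0)
    return visible, masked, pixel_mask


if __name__ == "__main__":
    img = np.random.default_rng(1).random((8, 8, 3))
    visible, masked, mask = split_by_patch_mask(img)
    # The two parts are complementary: together they recover the input.
    print(np.allclose(visible + masked, img))
```

In a two-branch setup, one could pass `visible` to the main model and `masked` to an auxiliary branch; how MaskAnyNet actually consumes the masked content is not stated in the truncated abstract.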
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Image Enhancement Techniques
