Thoughts on Objectives of Sparse and Hierarchical Masked Image Model
Asahi Miyazaki, Tsuyoshi Okita

TL;DR
This paper investigates the impact of different mask patterns on the performance of the SparK masked image model, proposing a new Mesh Mask-ed SparK variant to improve self-supervised learning outcomes.
Contribution
It introduces a novel mask pattern for SparK, called Mesh Mask, and analyzes its effects on pre-training performance in masked image modeling.
Findings
Mesh Mask improves pre-training effectiveness
Mask pattern choice significantly affects model performance
Proposed pattern outperforms previous masking strategies
Abstract
Masked image modeling is one of the most popular self-supervised training objectives. The recently proposed SparK model achieves superior performance among self-supervised learning models. This paper proposes a new mask pattern for SparK, yielding the Mesh Mask-ed SparK model, and reports how the mask pattern used for image masking during pre-training affects performance.
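To make the notion of a "mask pattern" concrete, here is a minimal sketch of patch-level masking as commonly used in masked image modeling. It assumes the image is divided into a grid of square patches and that each patch is either hidden or visible. The abstract does not define the Mesh Mask, so `grid_patch_mask` below is only a hypothetical structured pattern shown for contrast with random masking, not the authors' method. All function names and parameters are illustrative.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, seed=0):
    """Random pattern: True marks a masked (hidden) patch."""
    rng = random.Random(seed)
    n_total = grid_h * grid_w
    n_masked = round(n_total * mask_ratio)
    masked = set(rng.sample(range(n_total), n_masked))
    return [[(r * grid_w + c) in masked for c in range(grid_w)]
            for r in range(grid_h)]


def grid_patch_mask(grid_h, grid_w, stride=2):
    """Hypothetical structured pattern (an assumption, not the paper's
    Mesh Mask): keep one visible patch per stride x stride cell and
    mask all the others."""
    return [[not (r % stride == 0 and c % stride == 0)
             for c in range(grid_w)]
            for r in range(grid_h)]


def mask_ratio_of(mask):
    """Fraction of patches that are masked."""
    flat = [m for row in mask for m in row]
    return sum(flat) / len(flat)


def apply_mask(image, mask, patch):
    """Return a copy of a 2D image (list of lists) with masked patches
    set to zero. `patch` is the patch side length in pixels."""
    out = [row[:] for row in image]
    for r, mask_row in enumerate(mask):
        for c, is_masked in enumerate(mask_row):
            if is_masked:
                for y in range(r * patch, (r + 1) * patch):
                    for x in range(c * patch, (c + 1) * patch):
                        out[y][x] = 0
    return out


if __name__ == "__main__":
    image = [[1] * 8 for _ in range(8)]      # toy 8x8 image
    rand_mask = random_patch_mask(4, 4, 0.5)  # 4x4 grid of 2x2 patches
    grid_mask = grid_patch_mask(4, 4, stride=2)
    print("random ratio:", mask_ratio_of(rand_mask))
    print("grid ratio:", mask_ratio_of(grid_mask))
    for row in apply_mask(image, grid_mask, patch=2):
        print(row)
```

Both functions return the same kind of boolean grid, so a pre-training loop could swap one pattern for another without further changes. That interchangeability is what makes a comparison of mask patterns possible.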
Taxonomy
Topics: Digital Media and Visual Art
