BIM: Block-Wise Self-Supervised Learning with Masked Image Modeling
Yixuan Luo, Mengye Ren, Sai Qian Zhang

TL;DR
BIM introduces a block-wise masked image modeling framework that reduces memory usage and enables efficient training of multiple neural network backbones simultaneously, making MIM more accessible for resource-constrained environments.
Contribution
The paper proposes a novel block-wise training method for MIM that decreases memory requirements and allows concurrent training of diverse DNN architectures.
Findings
Maintains high performance with reduced memory consumption.
Enables simultaneous training of multiple DNN backbones.
Reduces computational costs compared to traditional MIM training.
Abstract
Like masked language modeling (MLM) in natural language processing, masked image modeling (MIM) aims to extract valuable insights from image patches to enhance the feature extraction capabilities of the underlying deep neural network (DNN). Compared with other training paradigms such as supervised learning and unsupervised contrastive learning, MIM pretraining typically demands significant computational resources in order to manage large training data batches (e.g., 4096). These memory and computation requirements pose a considerable challenge to its broad adoption. To mitigate this, we introduce a novel learning framework, termed Block-Wise Masked Image Modeling (BIM). This framework decomposes the MIM task into several sub-tasks with independent computation patterns, resulting in block-wise back-propagation operations instead of the…
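The idea of block-wise back-propagation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scalar-parameter `Block` class, the per-block reconstruction head, the mean-mixing encoder, and the hyperparameters are all assumptions made for illustration, and the code uses only the Python standard library. It shows each block computing its own masked-reconstruction loss and updating only its own parameters, with the block input treated as a constant so that no gradient crosses a block boundary.

```python
import random


class Block:
    """Hypothetical toy block: a tiny encoder plus its own local reconstruction head."""

    def __init__(self):
        # w scales each patch, u mixes in the mean of all patches, d is the local decoder.
        self.w, self.u, self.d = 1.0, 0.5, 0.5

    def local_loss_and_grads(self, x, target, masked_idx):
        # Forward pass; x is treated as a constant (no gradient flows to earlier blocks).
        m = sum(x) / len(x)
        h = [self.w * xj + self.u * m for xj in x]
        k = len(masked_idx)
        loss = gw = gu = gd = 0.0
        for j in masked_idx:
            err = self.d * h[j] - target[j]      # reconstruction error on a masked patch
            loss += err * err / k
            gd += 2.0 * err * h[j] / k           # dL/dd
            gh = 2.0 * err * self.d / k          # dL/dh_j
            gw += gh * x[j]                      # dL/dw
            gu += gh * m                         # dL/du
        return loss, (gw, gu, gd), h

    def sgd_step(self, grads, lr):
        gw, gu, gd = grads
        self.w -= lr * gw
        self.u -= lr * gu
        self.d -= lr * gd


def blockwise_step(blocks, patches, masked_idx, lr):
    """One step: every block is trained on its own local loss, one after another."""
    x = [0.0 if j in masked_idx else p for j, p in enumerate(patches)]
    losses = []
    for block in blocks:
        loss, grads, h = block.local_loss_and_grads(x, patches, masked_idx)
        block.sgd_step(grads, lr)                # updates this block only
        losses.append(loss)
        x = h                                    # passed on as a constant input
    return losses


rng = random.Random(0)
patches = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7]   # made-up 1-D "patches"
blocks = [Block(), Block()]
eval_mask = [1, 5]

before = blockwise_step(blocks, patches, eval_mask, lr=0.0)   # lr=0: evaluation only
for _ in range(1000):
    mask = rng.sample(range(len(patches)), 2)
    blockwise_step(blocks, patches, mask, lr=0.02)
after = blockwise_step(blocks, patches, eval_mask, lr=0.0)

print("per-block loss before:", before)
print("per-block loss after: ", after)
```

Because each block's gradients depend only on its own input and parameters, the intermediate values of one block can be discarded before the next block is processed. That is the intuition behind the memory savings claimed above, though the toy code does not measure memory.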
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Infrastructure Maintenance and Monitoring
Methods: Mutual Information Machine/Mask Image Modeling
