Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
Zesen Cheng, Hang Zhang, Kehan Li, Sicong Leng, Zhiqiang Hu, Fei Wu, Deli Zhao, Xin Li, Lidong Bing

TL;DR
This paper introduces a novel tiling-based computation method that significantly reduces memory requirements for contrastive loss, enabling near-infinite batch size scaling without accuracy loss.
Contribution
It presents a multi-level tiling strategy and optimized communication techniques to scale contrastive learning to unprecedented batch sizes efficiently.
Findings
Enables training with batch sizes of 4M or 12M on large models.
Achieves two orders of magnitude memory reduction compared to state-of-the-art methods.
Maintains comparable speed and accuracy at massive batch scales.
Abstract
Contrastive loss is a powerful approach for representation learning, where larger batch sizes enhance performance by providing more negative samples to better distinguish between similar and dissimilar data. However, scaling batch sizes is constrained by the quadratic growth in GPU memory consumption, primarily due to the full instantiation of the similarity matrix. To address this, we propose a tile-based computation strategy that partitions the contrastive loss calculation into arbitrarily small blocks, avoiding full materialization of the similarity matrix. Furthermore, we introduce a multi-level tiling strategy to leverage the hierarchical structure of distributed systems, employing ring-based communication at the GPU level to optimize synchronization and fused kernels at the CUDA core level to reduce I/O overhead. Experimental results show that the proposed method scales batch sizes…
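To make the tile-based idea concrete, here is a minimal single-process sketch, not the authors' implementation. It assumes a one-directional InfoNCE loss over paired embeddings `x[i]` and `y[i]` and uses a running (online) log-sum-exp so that only one tile of similarities exists at a time. The function names, the `tile` parameter, and the pure-Python list representation are all hypothetical, chosen for illustration. The ring-based communication and fused CUDA kernels described in the abstract are not modeled, and only the forward pass is shown.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_dense(x, y, temperature=1.0):
    """Reference version: materializes the full n x n similarity matrix."""
    n = len(x)
    sim = [[dot(x[i], y[j]) / temperature for j in range(n)] for i in range(n)]
    loss = 0.0
    for i in range(n):
        m = max(sim[i])
        lse = m + math.log(sum(math.exp(s - m) for s in sim[i]))
        loss += lse - sim[i][i]  # -log softmax of the positive pair
    return loss / n


def info_nce_tiled(x, y, temperature=1.0, tile=2):
    """Same loss, visiting the similarity matrix one tile at a time.

    Assumption for illustration: each row keeps a running maximum and a
    running rescaled sum, so no more than `tile` similarities per row are
    held at once.
    """
    n = len(x)
    loss = 0.0
    for r0 in range(0, n, tile):
        rows = range(r0, min(r0 + tile, n))
        run_max = {i: -math.inf for i in rows}
        run_sum = {i: 0.0 for i in rows}
        for c0 in range(0, n, tile):
            cols = range(c0, min(c0 + tile, n))
            for i in rows:
                block = [dot(x[i], y[j]) / temperature for j in cols]
                new_max = max(run_max[i], max(block))
                # Rescale the old sum to the new maximum, then add this tile.
                run_sum[i] = run_sum[i] * math.exp(run_max[i] - new_max) + sum(
                    math.exp(s - new_max) for s in block
                )
                run_max[i] = new_max
        for i in rows:
            positive = dot(x[i], y[i]) / temperature
            loss += run_max[i] + math.log(run_sum[i]) - positive
    return loss / n
```

In this sketch the tiled loop should agree with the dense reference up to floating-point rounding for any tile size, while its working memory per row depends on `tile` rather than on the batch size.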
