BRIDLE: Generalized Self-supervised Learning with Quantization
Hoang M. Nguyen, Satya N. Shukla, Qiang Zhang, Hanchao Yu, Sreya D. Roy, Taipeng Tian, Lingjiong Zhu, Yuchen Liu

TL;DR
BRIDLE introduces a hierarchical residual quantization approach within a self-supervised framework, improving representation quality across audio, image, and video modalities and achieving state-of-the-art results on audio understanding tasks.
Contribution
The paper proposes a residual quantization method with multiple codebooks for self-supervised learning that generalizes across data modalities and improves downstream task performance.
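As a rough illustration of what residual quantization with multiple codebooks means, the sketch below quantizes a vector in stages, with each stage encoding whatever error the previous stages left behind. This is a generic, minimal sketch and not the BRIDLE implementation: the function names (`nearest_code`, `residual_quantize`), the nearest-neighbour lookup, and the toy codebooks are all assumptions made for illustration.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def residual_quantize(vector, codebooks):
    """Quantize `vector` stage by stage; each stage encodes the previous residual.

    Returns one code index per codebook, the summed reconstruction, and the
    residual that remains after the last stage.
    """
    residual = list(vector)
    reconstruction = [0.0] * len(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        code = codebook[idx]
        indices.append(idx)
        # Add the chosen code to the running reconstruction ...
        reconstruction = [r + c for r, c in zip(reconstruction, code)]
        # ... and pass what is still unexplained to the next codebook.
        residual = [r - c for r, c in zip(residual, code)]
    return indices, reconstruction, residual


# Hypothetical toy codebooks: a coarse first stage and a finer second stage.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.5, 0.5]],
]
indices, x_hat, leftover = residual_quantize([1.4, 1.6], toy_codebooks)
print(indices, x_hat, leftover)
```

In this toy setup the first codebook captures the coarse structure and the second refines it, so a single input is described by a tuple of indices rather than one index from a single codebook.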
Findings
Achieves state-of-the-art results on audio classification benchmarks.
Demonstrates improved downstream performance over traditional VQ methods.
Shows competitive results on image and video classification tasks.
Abstract
Self-supervised learning has been a powerful approach for learning meaningful representations from unlabeled data across various domains, reducing the reliance on large labeled datasets. Inspired by BERT's success in capturing deep bidirectional contexts in natural language processing, similar frameworks have been adapted to other modalities such as audio, with models like BEATs extending the bidirectional training paradigm to audio signals using vector quantization (VQ). However, these frameworks face challenges, notably their dependence on a single codebook for quantization, which may not capture the complex, multifaceted nature of signals. In addition, inefficiencies in codebook utilization lead to underutilized code vectors. To address these limitations, we introduce BRIDLE (Bidirectional Residual Quantization Interleaved Discrete Learning Encoder), a self-supervised encoder…
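The codebook underutilization mentioned in the abstract can be made concrete with two common diagnostics: the fraction of code vectors that are ever selected, and the perplexity of the code-usage distribution. The sketch below is a generic measurement under assumed inputs, not a metric taken from the paper; the function name `codebook_usage` and the example assignments are hypothetical.

```python
import math
from collections import Counter


def codebook_usage(assignments, codebook_size):
    """Summarize how evenly a codebook is used.

    `assignments` is a list of selected code indices. Returns the fraction of
    codes used at least once and the perplexity (exp of the entropy) of the
    empirical usage distribution; perplexity equals `codebook_size` when all
    codes are used equally often.
    """
    counts = Counter(assignments)
    total = len(assignments)
    used_fraction = len(counts) / codebook_size
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return used_fraction, math.exp(entropy)


# Hypothetical assignments for a 4-entry codebook.
balanced = [0, 1, 2, 3] * 25       # every code chosen equally often
collapsed = [0] * 98 + [1] * 2     # almost everything maps to one code

print(codebook_usage(balanced, 4))
print(codebook_usage(collapsed, 4))
```

A low used fraction or a perplexity far below the codebook size indicates that most code vectors contribute little, which is the kind of inefficiency the abstract describes.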
Taxonomy
Topics: Neural Networks and Applications
