Using Random Codebooks for Audio Neural AutoEncoders

Benoît Giniès (S2A); Xiaoyu Bie (S2A); Olivier Fercoq (S2A); Gaël Richard (S2A)

arXiv:2409.16677 · eess.SP · September 26, 2024 · EUSIPCO

TL;DR

This paper builds the discrete latent representation of a neural audio autoencoder from random codebooks, each obtained by randomly sampling a large, predefined fixed codebook, and evaluates the approach on audio compression and reconstruction.

Contribution

The paper proposes a new method for building neural discrete representations with random codebooks and applies it to audio compression.

Findings

01 Effective audio reconstruction with random codebooks

02 Potential for improved data representation in neural autoencoders

03 Demonstrated advantages over traditional quantization methods

Abstract

Latent representation learning has been an active field of study for decades in numerous applications. Inspired in part by tokenization in Natural Language Processing and motivated by the search for a simple data representation, recent works have introduced a quantization step into feature extraction. In this work, we propose a novel strategy for building the neural discrete representation by means of random codebooks. These codebooks are obtained by randomly sampling a large, predefined fixed codebook. We experimentally show the merits and potential of our approach on an audio compression and reconstruction task.
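To make the idea of sampling a codebook from a larger fixed one concrete, here is a minimal sketch in plain Python. It is an illustration only, not the authors' implementation. The abstract does not specify how the fixed codebook is generated, how large the codebooks are, or how latents are assigned to codewords, so the Gaussian entries, the sizes, the nearest-neighbor rule, and all function names below are assumptions.

```python
import random


def make_fixed_codebook(num_vectors, dim, seed=0):
    """Build a large fixed codebook once; it is never trained.
    Gaussian entries are an assumption made for illustration."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_vectors)]


def sample_codebook(fixed_codebook, size, rng):
    """Draw `size` distinct codewords from the fixed codebook.
    Returns the chosen indices and the corresponding vectors."""
    indices = rng.sample(range(len(fixed_codebook)), size)
    return indices, [fixed_codebook[i] for i in indices]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents, codebook):
    """Assign each latent vector to its nearest codeword
    (squared Euclidean distance; an assumed assignment rule)."""
    codes = [
        min(range(len(codebook)), key=lambda j: squared_distance(z, codebook[j]))
        for z in latents
    ]
    return codes, [codebook[c] for c in codes]


# Hypothetical sizes: 1024 fixed codewords of dimension 4, 64 sampled per codebook.
fixed_codebook = make_fixed_codebook(num_vectors=1024, dim=4, seed=0)

rng = random.Random(1)
# Stand-in for encoder outputs; a real system would use the autoencoder's latents.
latents = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]

indices, codebook = sample_codebook(fixed_codebook, size=64, rng=rng)
codes, quantized = quantize(latents, codebook)
```

In this sketch a quantized vector can be recovered from the fixed codebook, the sampled indices, and the per-vector codes alone, so a decoder sharing the fixed codebook and the sampling seed would need only the codes. Whether the paper exploits this is not stated in the abstract.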
