Hypernetworks build Implicit Neural Representations of Sounds
Filip Szatkowski, Karol J. Piczak, Przemysław Spurek, Jacek Tabor, Tomasz Trzciński

TL;DR
This paper introduces HyperSound, a meta-learning approach in which a hypernetwork generates implicit neural representations (INRs) for audio. It addresses the inductive biases of image-oriented INR architectures and reconstructs sound with quality comparable to other state-of-the-art models.
Contribution
The paper presents HyperSound, the first meta-learning approach that uses hypernetworks to produce INRs for audio samples, so that representations can be generated for samples not observed in training.
Findings
Reconstructs audio samples with quality comparable to other state-of-the-art models
Provides a new hypernetwork-based approach for audio INRs
Demonstrates effective generalization beyond training samples
Abstract
Implicit Neural Representations (INRs) are nowadays used to represent multimedia signals across various real-life applications, including image super-resolution, image compression, or 3D rendering. Existing methods that leverage INRs are predominantly focused on visual data, as their application to other modalities, such as audio, is nontrivial due to the inductive biases present in architectural attributes of image-based INR models. To address this limitation, we introduce HyperSound, the first meta-learning approach to produce INRs for audio samples that leverages hypernetworks to generalize beyond samples observed in training. Our approach reconstructs audio samples with quality comparable to other state-of-the-art models and provides a viable alternative to contemporary sound representations used in deep neural networks for audio processing, such as spectrograms.
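The idea in the abstract can be illustrated with a minimal sketch: a hypernetwork takes an audio clip and outputs the weights of a small target network (the INR), which then maps a time coordinate to an amplitude. Everything below is an assumption made for illustration and is not the authors' architecture: the names `hypernetwork`, `inr`, and `HIDDEN`, the single linear layer used as the hypernetwork, the one-hidden-layer sine-activated target network, and the random untrained weights.

```python
import numpy as np

HIDDEN = 16  # hypothetical width of the target network (the INR)

# Parameter shapes of the target network: 1 -> HIDDEN -> 1
SHAPES = [(1, HIDDEN), (HIDDEN,), (HIDDEN, 1), (1,)]
N_PARAMS = sum(int(np.prod(s)) for s in SHAPES)


def unflatten(theta):
    """Split a flat parameter vector into the target network's tensors."""
    params, i = [], 0
    for shape in SHAPES:
        n = int(np.prod(shape))
        params.append(theta[i:i + n].reshape(shape))
        i += n
    return params


def inr(t, theta):
    """Target network: maps time coordinates t (shape [T]) to amplitudes."""
    w1, b1, w2, b2 = unflatten(theta)
    h = np.sin(t[:, None] @ w1 + b1)  # sine activation is an assumption
    return (h @ w2 + b2)[:, 0]


def hypernetwork(audio, h_w, h_b):
    """Toy hypernetwork: one linear map from raw samples to INR weights."""
    return audio @ h_w + h_b


rng = np.random.default_rng(0)
n_samples = 64

# Untrained hypernetwork parameters (training would fit these over a dataset).
h_w = rng.normal(scale=0.1, size=(n_samples, N_PARAMS))
h_b = rng.normal(scale=0.1, size=N_PARAMS)

# A toy "recording": a sine wave sampled on a time grid in [-1, 1].
t = np.linspace(-1.0, 1.0, n_samples)
audio = np.sin(2 * np.pi * 3 * t)

# One forward pass of the hypernetwork yields an INR for this clip.
theta = hypernetwork(audio, h_w, h_b)
recon = inr(t, theta)

# Reconstruction error: the kind of quantity a training loop would minimise.
mse = float(np.mean((recon - audio) ** 2))

# The INR is a continuous function, so it can be queried on a denser grid.
t_dense = np.linspace(-1.0, 1.0, 4 * n_samples)
upsampled = inr(t_dense, theta)
```

In this sketch only the hypernetwork parameters `h_w` and `h_b` would be trained. The INR weights `theta` are produced per clip in a single forward pass, which is what would let such a model represent clips that were not seen during training.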
Taxonomy
Topics: Image and Signal Denoising Methods · Speech and Audio Processing · Music and Audio Processing
