HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding
Darius Petermann, Seungkwon Beack, Minje Kim

TL;DR
HARP-Net adds hyper-autoencoded skip connections to a neural audio codec. Each skip connection is a small autoencoder that compresses the features passed between a paired encoder layer and decoder layer, which restores information flow otherwise blocked by bottleneck quantization and improves perceptual audio quality.
Contribution
The paper proposes a hyper-autoencoded architecture in which the skip connections between paired encoder and decoder layers are themselves small codecs, and reports improved perceptual audio quality over baseline autoencoders.
Findings
Improved perceptual audio quality compared to baseline autoencoders
Effective data transfer via small autoencoders between layers
Enhanced information flow in neural audio codecs
Abstract
An autoencoder-based codec employs quantization to turn its bottleneck layer activation into bitstrings, a process that hinders information flow between the encoder and decoder parts. To circumvent this issue, we employ additional skip connections between the corresponding pair of encoder-decoder layers. The assumption is that, in a mirrored autoencoder topology, a decoder layer reconstructs the intermediate feature representation of its corresponding encoder layer. Hence, any additional information directly propagated from the corresponding encoder layer helps the reconstruction. We implement this kind of skip connections in the form of additional autoencoders, each of which is a small codec that compresses the massive data transfer between the paired encoder-decoder layers. We empirically verify that the proposed hyper-autoencoded architecture improves perceptual audio quality…
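The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The class name `ToyHyperAutoencodedCodec`, the dense layers with random untrained weights, the layer sizes, the uniform quantizer, and the choice to add the decoded skip features to the decoder activations are all assumptions made for illustration. It only shows the data flow: every hidden encoder layer feeds a small autoencoder whose quantized code is sent alongside the main bottleneck code and decoded at the mirrored decoder layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, step=0.25):
    """Uniform scalar quantizer (assumed): snap values to multiples of step."""
    return np.round(x / step) * step

def linear(in_dim, out_dim):
    """Random, untrained weight matrix used only to demonstrate data flow."""
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)

class ToyHyperAutoencodedCodec:
    """Hypothetical mirrored autoencoder with autoencoded skip connections."""

    def __init__(self, dims=(64, 32, 8), skip_code_dim=4):
        pairs = list(zip(dims[:-1], dims[1:]))
        # Main encoder, e.g. 64 -> 32 -> 8, and mirrored decoder 8 -> 32 -> 64.
        self.enc = [linear(a, b) for a, b in pairs]
        self.dec = [linear(b, a) for a, b in pairs][::-1]
        # One small autoencoder per hidden encoder layer (the skip codec).
        hidden = dims[1:-1]
        self.skip_enc = [linear(d, skip_code_dim) for d in hidden]
        self.skip_dec = [linear(skip_code_dim, d) for d in hidden]

    def encode(self, x):
        h, skip_codes = x, []
        for i, w in enumerate(self.enc):
            h = np.tanh(h @ w)
            if i < len(self.skip_enc):
                # Compress and quantize the intermediate feature map.
                skip_codes.append(quantize(h @ self.skip_enc[i]))
        # The main bottleneck is quantized as well.
        return quantize(h), skip_codes

    def decode(self, code, skip_codes):
        h, n = code, len(self.enc)
        for j, w in enumerate(self.dec):
            h = h @ w
            k = n - 2 - j  # index of the mirrored encoder layer
            if k >= 0:
                # Add the decoded skip features (combination rule is assumed).
                h = np.tanh(h + skip_codes[k] @ self.skip_dec[k])
        return h

codec = ToyHyperAutoencodedCodec()
frames = rng.standard_normal((2, 64))      # two toy 64-sample frames
code, skip_codes = codec.encode(frames)
reconstruction = codec.decode(code, skip_codes)
```

In this sketch the transmitted payload is `code` plus every entry of `skip_codes`, so each skip connection adds bits to the stream. That is why the abstract describes each one as a small codec rather than a free identity shortcut.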
Taxonomy
Topics: Speech and Audio Processing · Image and Signal Denoising Methods · Acoustic Wave Phenomena Research
