HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding

Darius Petermann; Seungkwon Beack; Minje Kim

arXiv:2107.10843 · eess.AS · July 26, 2021

TL;DR

HARP-Net adds hyper-autoencoded skip connections to a neural audio codec: small autoencoders compress the features passed between paired encoder and decoder layers, restoring information flow that bottleneck quantization impedes and improving perceptual quality.

Contribution

The paper proposes a hyper-autoencoded architecture in which each skip connection between mirrored encoder and decoder layers is itself a small codec, and reports improved perceptual audio quality over a baseline autoencoder codec.

Findings

01 Improved perceptual audio quality compared to baseline autoencoders

02 Effective data transfer via small autoencoders between layers

03 Enhanced information flow in neural audio codecs

Abstract

An autoencoder-based codec employs quantization to turn its bottleneck layer activation into bitstrings, a process that hinders information flow between the encoder and decoder parts. To circumvent this issue, we employ additional skip connections between the corresponding pairs of encoder-decoder layers. The assumption is that, in a mirrored autoencoder topology, a decoder layer reconstructs the intermediate feature representation of its corresponding encoder layer. Hence, any additional information directly propagated from the corresponding encoder layer helps the reconstruction. We implement these skip connections as additional autoencoders, each of which is a small codec that compresses the massive data transfer between the paired encoder-decoder layers. We empirically verify that the proposed hyper-autoencoded architecture improves perceptual audio quality…
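The data flow described in the abstract can be illustrated with a minimal, untrained sketch. It is not the authors' implementation. The layer sizes, the random linear weights, the uniform quantizer, the additive merge of the skip signal into the decoder feature, and every name (`ToyCodec`, `quantize`, `use_skip`) are assumptions chosen for illustration. The sketch shows only that a second, small autoencoder carries a quantized copy of an encoder feature to the mirrored decoder layer, alongside the quantized bottleneck.

```python
import random


def linear(weights, x):
    """Apply a dense layer, given as a list of weight rows, to vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def quantize(x, step=0.25):
    """Uniform scalar quantization: map each value to an integer symbol."""
    return [round(v / step) for v in x]


def dequantize(symbols, step=0.25):
    """Map integer symbols back to reconstruction levels."""
    return [s * step for s in symbols]


def random_matrix(rng, rows, cols):
    """Arbitrary untrained weights; a real codec would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyCodec:
    """Hypothetical mirrored autoencoder (8 -> 6 -> 2 -> 6 -> 8) with one
    skip autoencoder (6 -> 3 -> 6) between the paired 6-dimensional layers."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.enc1 = random_matrix(rng, 6, 8)
        self.enc2 = random_matrix(rng, 2, 6)
        self.dec2 = random_matrix(rng, 6, 2)
        self.dec1 = random_matrix(rng, 8, 6)
        # The skip connection is itself a small codec with its own bottleneck.
        self.skip_enc = random_matrix(rng, 3, 6)
        self.skip_dec = random_matrix(rng, 6, 3)

    def encode(self, x):
        h1 = linear(self.enc1, x)   # intermediate encoder feature
        z = linear(self.enc2, h1)   # main bottleneck activation
        return {
            "bottleneck": quantize(z),
            "skip": quantize(linear(self.skip_enc, h1)),
        }

    def decode(self, bits, use_skip=True):
        # Decoder feature that mirrors h1, rebuilt from the main bottleneck.
        d1 = linear(self.dec2, dequantize(bits["bottleneck"]))
        if use_skip:
            # Assumed merge rule: add the skip codec's reconstruction of h1.
            s1 = linear(self.skip_dec, dequantize(bits["skip"]))
            d1 = [a + b for a, b in zip(d1, s1)]
        return linear(self.dec1, d1)


codec = ToyCodec(seed=0)
frame = [0.1 * i for i in range(8)]   # stand-in for one audio frame
bits = codec.encode(frame)
reconstruction = codec.decode(bits)
```

Because the weights are random, the reconstruction is not meaningful. The point is that both the bottleneck symbols and the skip symbols are quantized before transmission, so the skip path adds to the bitrate instead of passing features through for free.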

Taxonomy

Topics: Speech and Audio Processing · Image and Signal Denoising Methods · Acoustic Wave Phenomena Research