Point Cloud Audio Processing

Krishna Subramani; Paris Smaragdis

arXiv:2105.02469·eess.AS·March 28, 2022

TL;DR

This paper introduces a point cloud-based approach to audio processing that achieves invariance to input representation parameters, enabling flexible, smaller models with minimal performance loss across different sampling rates and representations.

Contribution

The authors propose a novel point cloud method for audio processing that is invariant to representation choices and allows for effective subsampling, unlike traditional fixed-dimensional models.

Findings

01 Models are smaller and more efficient.

02 Performance remains stable despite subsampling.

03 Models are invariant to DFT size and sampling rate.
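The subsampling and invariance findings can be illustrated with a minimal sketch. It assumes a PointNet-style symmetric (max) pooling over per-point features, which is a common point cloud design and not necessarily the authors' architecture. The names `pooled_features` and `subsample`, and the hand-written per-point features, are hypothetical stand-ins for a learned model.

```python
import random

def pooled_features(points):
    """Order-independent summary of (frequency_hz, magnitude) points.

    Hypothetical stand-in for a learned point cloud model: each point is
    mapped to a small feature vector, then max-pooled across all points.
    """
    per_point = [(mag, mag * freq / 1000.0) for freq, mag in points]
    # Max over points is symmetric, so point order and point count do not
    # change the shape of the output.
    return tuple(max(column) for column in zip(*per_point))

def subsample(points, keep, seed=0):
    """Keep a random subset of the points (assumed uniform sampling)."""
    return random.Random(seed).sample(points, keep)

# Toy cloud: 33 spectral points, one strong peak at 1000 Hz.
cloud = [(125.0 * k, 1.0 if k == 8 else 0.01) for k in range(33)]

shuffled = cloud[:]
random.Random(1).shuffle(shuffled)

full_summary = pooled_features(cloud)
shuffled_summary = pooled_features(shuffled)
sparse_summary = pooled_features(subsample(cloud, 10))
```

Because the pooling is symmetric, the summary has the same size for 33 points or 10, which is what lets one model accept clouds of different sizes. How much the summary values change under subsampling depends on which points survive.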

Abstract

Most audio processing pipelines involve transformations that act on fixed-dimensional input representations of audio. For example, when using the Short-Time Fourier Transform (STFT), the DFT size specifies a fixed dimension for the input representation. As a consequence, most audio machine learning models are designed to process fixed-size vector inputs, which often prohibits the repurposing of learned models on audio with different sampling rates or alternative representations. We note, however, that the intrinsic spectral information in the audio signal is invariant to the choice of the input representation or the sampling rate. Motivated by this, we introduce a novel way of processing audio signals by treating them as a collection of points in feature space, and we use point cloud machine learning models that give us invariance to the choice of representation parameters, such as DFT…
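As a rough illustration of treating a spectrum as a collection of points, the sketch below converts one audio frame into (frequency in Hz, magnitude) pairs. The feature choice and the helper names `frame_to_points`, `sine`, and `peak_frequency` are assumptions made for this example, not the paper's actual pipeline, and a plain DFT is used in place of a full STFT.

```python
import cmath
import math

def frame_to_points(frame, sample_rate):
    """Turn one audio frame into a list of (frequency_hz, magnitude) points."""
    n = len(frame)
    points = []
    for k in range(n // 2 + 1):
        # Direct DFT of bin k; slow, but dependency-free for a small example.
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        # Express the bin in Hz so points from different DFT sizes and
        # sampling rates live in the same feature space.
        points.append((k * sample_rate / n, abs(coeff) / n))
    return points

def sine(freq_hz, sample_rate, n):
    """Generate n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

def peak_frequency(points):
    """Return the frequency of the point with the largest magnitude."""
    return max(points, key=lambda p: p[1])[0]

# Same 1 kHz tone, two different sampling rates and DFT sizes.
small = frame_to_points(sine(1000.0, 8000, 64), 8000)
large = frame_to_points(sine(1000.0, 16000, 256), 16000)
```

The two clouds contain different numbers of points (33 versus 129), yet both place their strongest point at 1000 Hz. A fixed-size vector model would see two incompatible input dimensions, whereas a model that consumes points does not depend on how many there are.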

Code & Models

Repositories

SubramaniKrishna/point-cloud-audio
PyTorch · Official
