On a time-frequency blurring operator with applications in data augmentation
Simon Halvdansson

TL;DR
This paper introduces a time-frequency blurring operator for data augmentation in audio signal classification. Mathematical analysis and neural network experiments indicate that it can improve model performance, especially when training data is limited.
Contribution
The paper proposes a new time-frequency blurring operator and analyzes its mathematical properties, applying it to enhance data augmentation for audio classification tasks.
Findings
The operator improves classification accuracy in data-scarce scenarios.
Among the spectrogram augmentation setups compared, those including the operator can significantly improve test performance.
Boundedness, compactness and positivity of the operator are investigated from the perspective of time-frequency analysis.
Abstract
Inspired by the success of recent data augmentation methods for signals which act on time-frequency representations, we introduce an operator which convolves the short-time Fourier transform of a signal with a specified kernel. Analytical properties including boundedness, compactness and positivity are investigated from the perspective of time-frequency analysis. A convolutional neural network and a vision transformer are trained to classify audio signals using spectrograms with different augmentation setups, including the above-mentioned time-frequency blurring operator, with results indicating that the operator can significantly improve test performance, especially in the data-starved regime.
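The core idea in the abstract, convolving the short-time Fourier transform of a signal with a kernel, can be illustrated with a minimal discrete sketch. This is not the paper's implementation. The use of SciPy, the Gaussian kernel, the window length, the boundary handling, and the names gaussian_kernel and tf_blur are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import stft, istft, convolve2d


def gaussian_kernel(size=5, sigma=1.0):
    """Hypothetical blurring kernel: a normalized 2D Gaussian (an assumption)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-ax**2 / (2.0 * sigma**2))
    k = np.outer(g, g)
    return k / k.sum()


def tf_blur(signal, kernel, fs=1.0, nperseg=256):
    """Blur a signal in the time-frequency plane.

    Steps: compute the STFT, convolve it with a real-valued kernel,
    then map back to a time-domain signal with the inverse STFT.
    """
    _, _, Z = stft(signal, fs=fs, nperseg=nperseg)
    # Convolve real and imaginary parts separately; for a real kernel this
    # equals convolving the complex STFT, since convolution is linear.
    Zb = (convolve2d(Z.real, kernel, mode="same", boundary="symm")
          + 1j * convolve2d(Z.imag, kernel, mode="same", boundary="symm"))
    _, out = istft(Zb, fs=fs, nperseg=nperseg)
    # The inverse STFT may return extra padded samples; trim to input length.
    return out[:len(signal)]


# Usage: blur a noisy test tone (synthetic data, for illustration only).
rng = np.random.default_rng(0)
t = np.arange(4096)
x = np.sin(2 * np.pi * 0.05 * t) + 0.1 * rng.standard_normal(t.size)
x_blurred = tf_blur(x, gaussian_kernel(size=5, sigma=1.0))
```

For spectrogram-based classifiers, an alternative under the same assumptions is to skip the inverse STFT and feed the squared magnitude of the blurred STFT to the network directly. A 1x1 kernel equal to 1 leaves the STFT unchanged, so in that case the sketch reduces to an STFT round trip.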
Taxonomy
Topics: Image and Signal Denoising Methods · Mathematical Analysis and Transform Methods · Seismic Imaging and Inversion Techniques
