Audio Inpainting in Time-Frequency Domain with Phase-Aware Prior
Peter Balušík, Pavel Rajmic

TL;DR
This paper introduces a phase-aware inpainting method for audio spectrograms that improves reconstruction quality and computational efficiency by combining a signal prior built on instantaneous frequency estimates with the generalized Chambolle-Pock optimization algorithm.
Contribution
It presents a novel phase-aware prior for time-frequency audio inpainting that outperforms a deep-prior neural network and the autoregression-based Janssen-TF method in reconstruction quality.
Findings
Outperforms a deep-prior neural network and Janssen-TF in objective metrics
Achieves superior subjective listening test results
Reduces computational cost significantly
Abstract
We address the problem of time-frequency audio inpainting, where the goal is to fill missing spectrogram portions with reliable information. Despite recent advances, existing approaches still face limitations in both reconstruction quality and computational efficiency. To bridge this gap, we propose a method that utilizes a phase-aware signal prior which exploits estimates of the instantaneous frequency. An optimization problem is formulated and solved using the generalized Chambolle-Pock algorithm. The proposed method is evaluated against other time-frequency inpainting methods, specifically a deep-prior audio inpainting neural network and the autoregression-based approach known as Janssen-TF. Our proposed approach surpassed these methods by a large margin in the objective evaluation as well as in the conducted subjective listening test, improving the state of the art. In addition, the…
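The abstract does not say how the instantaneous frequency is estimated or how the prior is built, so the following is only a minimal sketch of the general idea, not the authors' method. It uses the classical phase-vocoder estimate (phase difference between consecutive STFT frames) and shows how such an estimate lets one predict the phase of a missing frame from its neighbours. The function names `stft` and `instantaneous_frequency`, the window length, the hop size, and the 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def stft(x, win_len=256, hop=64):
    """Plain Hann-windowed STFT; returns an array of shape (bins, frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)

def instantaneous_frequency(S, fs, win_len=256, hop=64):
    """Phase-vocoder estimate of instantaneous frequency in Hz.

    Column j of the result is estimated from frames j and j + 1.
    """
    k = np.arange(S.shape[0])[:, None]
    # Phase advance of each bin's centre frequency over one hop.
    expected = 2 * np.pi * k * hop / win_len
    dphi = np.angle(S[:, 1:]) - np.angle(S[:, :-1])
    # Wrap the deviation from the expected advance to (-pi, pi].
    deviation = np.angle(np.exp(1j * (dphi - expected)))
    return (expected + deviation) * fs / (2 * np.pi * hop)

# Illustrative test signal (assumption): one second of a 440 Hz tone.
fs, hop, win_len = 8000, 64, 256
t = np.arange(fs) / fs
x = np.cos(2 * np.pi * 440.0 * t)

S = stft(x, win_len, hop)
IF = instantaneous_frequency(S, fs, win_len, hop)

k_peak = int(np.argmax(np.abs(S[:, 0])))  # strongest bin
f_est = float(IF[k_peak, 8])              # estimate from frames 8 and 9

# Treat frame 10 as "missing" and predict its phase from frame 9.
predicted = np.angle(S[k_peak, 9]) + 2 * np.pi * f_est * hop / fs
phase_error = float(np.angle(np.exp(1j * (predicted - np.angle(S[k_peak, 10])))))

print(f"estimated frequency: {f_est:.2f} Hz")
print(f"phase prediction error: {phase_error:.4f} rad")
```

For a stationary tone the predicted phase agrees closely with the true one, which is the kind of consistency a phase-aware prior can exploit. Real audio is not stationary, which is presumably why the paper formulates an optimization problem rather than propagating phase directly.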
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Hearing Loss and Rehabilitation
