Mel-Spectrogram Inversion via Alternating Direction Method of Multipliers

Yoshiki Masuyama; Natsuki Ueno; Nobutaka Ono

arXiv:2501.05557 · eess.AS · January 14, 2025

TL;DR

This paper introduces an ADMM-based optimization method for mel-spectrogram inversion that jointly estimates the full-band STFT magnitude and phase, aiming to improve reconstruction quality for speech and foley sound synthesis.

Contribution

The paper proposes a novel ADMM-based joint estimation method for mel-spectrogram inversion that outperforms existing iterative approaches in efficiency and accuracy.

Findings

01

Effective reconstruction of speech and sounds demonstrated.

02

Outperforms previous methods in accuracy and convergence speed.

03

Joint estimation reduces error accumulation.

Abstract

Signal reconstruction from its mel-spectrogram is known as mel-spectrogram inversion and has many applications, including speech and foley sound synthesis. In this paper, we propose a mel-spectrogram inversion method based on a rigorous optimization algorithm. To reconstruct a time-domain signal with inverse short-time Fourier transform (STFT), both full-band STFT magnitude and phase should be predicted from a given mel-spectrogram. Their joint estimation has outperformed the cascaded full-band magnitude prediction and phase reconstruction by preventing error accumulation. However, the existing joint estimation method requires many iterations, and there remains room for performance improvement. We present an alternating direction method of multipliers (ADMM)-based joint estimation method motivated by its success in various nonconvex optimization problems including phase reconstruction.…
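For orientation, the cascaded baseline mentioned in the abstract (full-band magnitude prediction followed by phase reconstruction) can be sketched in a few lines of NumPy. This is not the ADMM-based method the paper proposes; it is a minimal illustration of the pipeline that joint estimation is said to improve on. Several choices here are assumptions made for the sketch rather than details from the paper: the mel-spectrogram is taken as a triangular filterbank applied to the STFT magnitude (no log compression), the magnitude is recovered with a clipped pseudo-inverse, and the phase is recovered with Griffin-Lim iterations. All function names and parameter values are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filterbank of shape (n_mels, n_fft // 2 + 1)."""
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def stft(x, n_fft, hop):
    """Hann-windowed STFT without padding, shape (n_fft // 2 + 1, n_frames)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[t * hop:t * hop + n_fft] * win for t in range(n_frames)], axis=1
    )
    return np.fft.rfft(frames, axis=0)

def istft(X, n_fft, hop):
    """Windowed overlap-add inverse, normalized by the summed squared window."""
    win = np.hanning(n_fft)
    frames = np.fft.irfft(X, n=n_fft, axis=0)
    n_frames = X.shape[1]
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    for t in range(n_frames):
        x[t * hop:t * hop + n_fft] += frames[:, t] * win
        norm[t * hop:t * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-8)

def cascaded_inversion(mel, fb, n_fft, hop, n_iter=20, seed=0):
    """Stage 1: magnitude from the mel-spectrogram. Stage 2: Griffin-Lim phase."""
    # Stage 1 (assumed): pseudo-inverse of the filterbank, clipped to be non-negative.
    mag = np.maximum(np.linalg.pinv(fb) @ mel, 0.0)
    # Stage 2 (assumed): Griffin-Lim, starting from random phase.
    rng = np.random.default_rng(seed)
    phase = rng.uniform(-np.pi, np.pi, size=mag.shape)
    for _ in range(n_iter):
        x = istft(mag * np.exp(1j * phase), n_fft, hop)
        phase = np.angle(stft(x, n_fft, hop))
    return istft(mag * np.exp(1j * phase), n_fft, hop), mag

# Toy usage: a noisy 440 Hz tone (illustrative values, not from the paper).
sr, n_fft, hop, n_mels = 16000, 512, 128, 40
rng = np.random.default_rng(1)
t = np.arange(n_fft + hop * 49) / sr
x = np.sin(2 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(t.size)

fb = mel_filterbank(n_mels, n_fft, sr)
mel = fb @ np.abs(stft(x, n_fft, hop))   # forward model: mel-spectrogram
x_hat, mag_hat = cascaded_inversion(mel, fb, n_fft, hop)
```

In this sketch, any error in the stage 1 magnitude is passed unchanged to stage 2, since Griffin-Lim treats `mag` as fixed. That is the error accumulation the abstract refers to, and it is the motivation for estimating magnitude and phase jointly.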

Taxonomy

Topics: Photonic and Optical Devices · Neural Networks and Applications · Semiconductor Lasers and Optical Devices