Adversarial Audio Synthesis with Complex-valued Polynomial Networks
Yongtao Wu, Grigorios G Chrysos, Volkan Cevher

TL;DR
This paper introduces APOLLO, a family of complex-valued polynomial networks for audio synthesis that exploits the complex-valued nature of time-frequency representations to improve performance and model richer correlations.
Contribution
The paper presents a novel complex-valued polynomial network architecture that naturally models complex TF representations and demonstrates superior audio generation performance.
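To make "complex TF representations" concrete, here is a minimal NumPy sketch of a short-time Fourier transform. It is an illustration only, not the paper's preprocessing: the function name `stft`, the frame length, the hop size, the 16 kHz sample rate, and the 440 Hz test tone are all assumptions chosen for the example.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Hann-windowed short-time Fourier transform (illustrative helper)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps the non-negative frequencies; the result is complex-valued.
    return np.fft.rfft(frames, axis=1)

sr = 16000                               # assumed sample rate
t = np.arange(sr) / sr                   # one second of audio
signal = np.sin(2 * np.pi * 440.0 * t)   # assumed 440 Hz test tone

spec = stft(signal)                      # shape: (n_frames, frame_len // 2 + 1)
magnitude = np.abs(spec)                 # what magnitude-only models keep
phase = np.angle(spec)                   # what they must recover separately

# Magnitude and phase together carry exactly the information in the complex values.
recombined = magnitude * np.exp(1j * phase)
```

A real-valued network that models only `magnitude` needs a separate step to recover `phase`, whereas a complex-valued network can operate on `spec` directly.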
Findings
17.5% improvement over adversarial methods
8.2% improvement over diffusion models on SC09
Effective modeling of high-order correlations in audio synthesis
Abstract
Time-frequency (TF) representations in audio synthesis have been increasingly modeled with real-valued networks. However, overlooking the complex-valued nature of TF representations can result in suboptimal performance and require additional modules (e.g., for modeling the phase). To this end, we introduce complex-valued polynomial networks, called APOLLO, that integrate such complex-valued representations in a natural way. Concretely, APOLLO captures high-order correlations of the input elements using high-order tensors as scaling parameters. By leveraging standard tensor decompositions, we derive different architectures and enable modeling richer correlations. We outline such architectures and showcase their performance in audio generation across four benchmarks. As a highlight, APOLLO results in 17.5% improvement over adversarial methods and 8.2% over the state-of-the-art…
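The idea of capturing high-order correlations with factorized tensors can be sketched in a few lines. The example below is a hedged illustration, not the authors' architecture: it assumes a CP-style factorization evaluated by a simple recursion, and the names `init_complex`, `complex_poly_forward`, `factors`, `C`, and `beta`, along with all dimensions, are hypothetical.

```python
import numpy as np

def init_complex(rng, shape, scale=0.1):
    """Complex array with independent Gaussian real and imaginary parts."""
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def complex_poly_forward(z, factors, C, beta):
    """Evaluate a polynomial of degree len(factors) in the complex input z.

    Assumed recursion (CP-style factorization, for illustration only):
        x_1 = U_1^T z
        x_n = (U_n^T z) * x_{n-1} + x_{n-1}
        y   = C x_N + beta
    Each elementwise product raises the degree in z by one, so the
    high-order parameter tensors are never formed explicitly.
    """
    x = factors[0].T @ z
    for U in factors[1:]:
        x = (U.T @ z) * x + x
    return C @ x + beta

rng = np.random.default_rng(0)
d, k, o, degree = 8, 16, 4, 3            # input dim, rank, output dim, degree
factors = [init_complex(rng, (d, k)) for _ in range(degree)]
C = init_complex(rng, (o, k))
beta = init_complex(rng, (o,))

z = init_complex(rng, (d,), scale=1.0)   # stand-in for a complex TF feature vector
y = complex_poly_forward(z, factors, C, beta)
```

With a single factor the map reduces to an affine function of `z`; each additional factor adds one degree of interaction between input elements. Different tensor decompositions would lead to different recursions and hence different architectures.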
Taxonomy
Topics: Music and Audio Processing · Acoustic Wave Phenomena Research · Music Technology and Sound Studies
Methods: Diffusion · Adaptive Parameter-wise Diagonal Quasi-Newton Method
