Audio-guided Album Cover Art Generation with Genetic Algorithms
James Marien, Sam Leroux, Bart Dhoedt, Cedric De Boom

TL;DR
This paper presents a flexible deep-learning framework that uses genetic algorithms to generate album cover art guided by audio features, addressing challenges in creative design automation.
Contribution
It introduces a novel audio-guided cover art generation method that builds on VQGAN-CLIP and genetic algorithms, and whose individual components can be replaced without retraining.
Findings
Framework generates suitable cover art for various genres
Visual features adapt to changes in audio features
Genetic algorithms help overcome local minima and adversarial issues
Abstract
Over 60,000 songs are released on Spotify every day, and the competition for the listener's attention is immense. In that regard, the importance of captivating and inviting cover art should not be underestimated, because it is deeply entangled with a song's character and the artist's identity, and remains one of the most important gateways to lead people to discover music. However, designing cover art is a highly creative, lengthy and sometimes expensive process that can be daunting, especially for non-professional artists. For this reason, we propose a novel deep-learning framework to generate cover art guided by audio features. Inspired by VQGAN-CLIP, our approach is highly flexible because individual components can easily be replaced without the need for any retraining. This paper outlines the architectural details of our models and discusses the optimization challenges that emerge from…
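The abstract is cut off before it describes the genetic algorithm, so the following is only a generic sketch of how a genetic algorithm can search a latent space without gradients. It is not the authors' implementation. The names `toy_fitness` and `evolve`, the 8-dimensional latent vector, and all hyperparameters are assumptions made for illustration; in a real pipeline the fitness would presumably be an audio-image similarity score computed on a generated image rather than the toy distance used here.

```python
import random


def toy_fitness(latent, target):
    # Hypothetical stand-in for an audio-image similarity score:
    # higher is better, with the maximum (0) reached at `target`.
    return -sum((x - t) ** 2 for x, t in zip(latent, target))


def evolve(target, pop_size=20, dim=8, generations=50,
           mutation_scale=0.1, seed=0):
    """Generic genetic algorithm with elitism, uniform crossover and
    Gaussian mutation. All parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation

    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda ind: toy_fitness(ind, target),
                        reverse=True)
        history.append(toy_fitness(ranked[0], target))

        parents = ranked[: pop_size // 2]  # keep the fitter half as parents
        children = [ranked[0]]             # elitism: best survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: pick each gene from one parent,
            # then perturb it with Gaussian noise (mutation).
            child = [rng.choice(pair) + rng.gauss(0, mutation_scale)
                     for pair in zip(a, b)]
            children.append(child)
        population = children

    best = max(population, key=lambda ind: toy_fitness(ind, target))
    return best, history


if __name__ == "__main__":
    target = [0.5] * 8
    best, history = evolve(target)
    print("initial best fitness:", history[0])
    print("final best fitness:", toy_fitness(best, target))
```

Because the search keeps a whole population and relies on random mutation and crossover rather than following a single gradient, it can move away from a poor local optimum. That is the property the Findings list attributes to genetic algorithms.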
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing
