Network Bending of Diffusion Models for Audio-Visual Generation

Luke Dzwonczyk; Carmine Emanuele Cella; David Ban

arXiv:2406.19589 · cs.SD · July 1, 2024

TL;DR

This paper applies network bending to image-generation diffusion models so that artists can create music visualizations with fine-grained control and novel visual effects, including music-reactive videos generated with Stable Diffusion.

Contribution

The paper applies network bending, the process of applying transforms within the layers of a generative network, to diffusion models. The result is a set of creative visual effects and a method for music-reactive video generation, expanding the toolset for artistic image manipulation.

Findings

01. Network bending produces unique visual effects that are not easily recreated with standard image editing tools.

02. Network bending enables continuous, fine-grained control over image generation.

03. Music-reactive videos can be generated by passing audio features as parameters to network bending operators.
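To make the third finding concrete, here is a minimal sketch of how an audio feature could be turned into one operator parameter per video frame. It is an illustration only, not the authors' pipeline: the choice of RMS loudness as the feature, the helper names `rms_per_video_frame` and `to_param_range`, and the parameter range are all assumptions.

```python
import math

def rms_per_video_frame(samples, sample_rate, fps):
    """Return one RMS loudness value per video frame for a mono signal."""
    hop = sample_rate // fps           # audio samples covered by one video frame
    n_frames = len(samples) // hop
    values = []
    for i in range(n_frames):
        chunk = samples[i * hop:(i + 1) * hop]
        values.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    return values

def to_param_range(features, lo, hi):
    """Linearly rescale features to [lo, hi], the range an operator accepts."""
    fmin, fmax = min(features), max(features)
    span = fmax - fmin
    if span == 0:
        return [lo for _ in features]  # constant signal: use the lower bound
    return [lo + (hi - lo) * (f - fmin) / span for f in features]

# Hypothetical input: a 1-second, 440 Hz tone with a linear fade-in.
sr, fps = 8000, 10
signal = [(n / sr) * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

# One parameter per video frame, e.g. the factor of a point-wise scaling operator.
params = to_param_range(rms_per_video_frame(signal, sr, fps), 0.0, 2.0)
```

Each entry of `params` would then be passed to a bending operator while the corresponding frame is generated, so that louder passages produce a stronger effect.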

Abstract

In this paper we present the first steps towards the creation of a tool which enables artists to create music visualizations using pre-trained, generative, machine learning models. First, we investigate the application of network bending, the process of applying transforms within the layers of a generative network, to image generation diffusion models by utilizing a range of point-wise, tensor-wise, and morphological operators. We identify a number of visual effects that result from various operators, including some that are not easily recreated with standard image editing tools. We find that this process allows for continuous, fine-grain control of image generation which can be helpful for creative applications. Next, we generate music-reactive videos using Stable Diffusion by passing audio features as parameters to network bending operators. Finally, we comment on certain transforms…
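The three operator families named in the abstract can be sketched on a toy network as follows. This is a hedged illustration using NumPy arrays in place of real diffusion-model activations: the specific operators (`scale`, `rotate90`, `dilate`), the `run` helper, and the two-layer "network" are assumptions chosen for brevity, not the operators or code from the paper. In a PyTorch model, one plausible way to apply such a transform is a forward hook on the chosen layer.

```python
import numpy as np

def scale(x, k):
    """Point-wise operator: multiply every activation by k."""
    return x * k

def rotate90(x):
    """Tensor-wise operator: rotate each feature map by 90 degrees."""
    return np.rot90(x, k=1, axes=(-2, -1))

def dilate(x, radius=1):
    """Morphological operator: grayscale dilation (max filter) per feature map."""
    size = 2 * radius + 1
    pad = [(0, 0)] * (x.ndim - 2) + [(radius, radius)] * 2
    padded = np.pad(x, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, (size, size), axis=(-2, -1)
    )
    return windows.max(axis=(-2, -1))

def run(x, layers, bend_at=None, op=None):
    """Run x through the layers, applying op to the output of layer bend_at."""
    for i, layer in enumerate(layers):
        x = layer(x)
        if i == bend_at and op is not None:
            x = op(x)  # the "bend": transform the intermediate activation
    return x

# Toy stand-in for a generative network: two element-wise layers.
layers = [lambda a: a + 1.0, lambda a: a * 2.0]

# A single 3x3 feature map with one active unit in the centre.
x = np.zeros((1, 3, 3))
x[0, 1, 1] = 1.0

unbent = run(x, layers)
bent = run(x, layers, bend_at=0, op=lambda a: scale(a, 0.5))
```

Because the operator parameter (here `0.5`) is a continuous value, it can be varied smoothly between frames, which is what makes this kind of control suitable for animation.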

Code & Models

Repositories

dzluke/DAFX2024 (PyTorch, official)

Taxonomy

Topics: Music Technology and Sound Studies

Methods: Diffusion