Antagonising explanation and revealing bias directly through sequencing and multimodal inference

Luís Arandas; Mick Grierson; Miguel Carvalhais

arXiv:2309.12345 · cs.HC · September 25, 2023 · 1 cites

Open Access · 1 dataset

TL;DR

This paper explores how deep generative models, especially diffusion models, reflect cultural biases through their reconstruction process, and proposes viewing their outputs as a temporal dialogue with the past, impacting future audiovisual synthesis.

Contribution

It introduces a novel perspective on generative models as a process of cultural and temporal reflection, emphasizing the importance of understanding bias and history in image and video synthesis.

Findings

01 Generative models encode cultural biases in their reconstruction process.

02 Viewing diffusion as a backward-in-time process reveals limitations in current synthesis methods.

03 Historical methodologies can inform and improve modern generative approaches.

Abstract

Deep generative models produce data according to a learned representation, e.g. diffusion models, through a process of approximation computing possible samples. Approximation can be understood as reconstruction and the large datasets used to train models as sets of records in which we represent the physical world with some data structure (photographs, audio recordings, manuscripts). During the process of reconstruction, e.g., image frames develop each timestep towards a textual input description. While moving forward in time, frame sets are shaped according to learned bias and their production, we argue here, can be considered as going back in time; not by inspiration on the backward diffusion process but acknowledging culture is specifically marked in the records. Futures of generative modelling, namely in film and audiovisual arts, can benefit by dealing with diffusion systems as a…

Code & Models

Datasets

bpathir1/RefEdit
dataset · 65 dl

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing

Methods: Diffusion