Multi-Lingual DALL-E Storytime

Noga Mudrik; Adam S. Charles

arXiv:2212.11985 · cs.CL · December 26, 2022

TL;DR

This paper introduces an automatic storytelling framework that enhances DALL-E's ability to generate coherent, multi-frame visual stories from non-English texts, addressing language bias and storytelling limitations.

Contribution

The authors develop a framework that enables DALL-E to create coherent visual stories from non-English sources, overcoming language and sequential storytelling constraints.

Findings

01. Effective visualization of non-English stories and songs.

02. Ability to generate coherent, multi-frame narratives.

03. User constraints can be incorporated for customized storytelling.

Abstract

While recent advances in artificial intelligence (AI) language models demonstrate cutting-edge performance on English texts, equivalent models either do not exist for other languages or do not reach the same level of performance. This undesired side effect of AI progress widens the gap in access to new technology between populations across the world. The resulting bias mainly discriminates against individuals whose English skills are less developed, e.g., non-English-speaking children. Following significant advances in AI research in recent years, OpenAI has recently presented DALL-E: a powerful tool for creating images from English text prompts. While DALL-E is a promising tool for many applications, its decreased performance on input in other languages limits its audience and deepens the gap between populations. An additional limitation of…

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques