From Words to Worlds: Transforming One-line Prompt into Immersive Multi-modal Digital Stories with Communicative LLM Agent
Samuel S. Sohn, Danrui Li, Sen Zhang, Che-Jui Chang, and Mubbasir Kapadia

TL;DR
This paper introduces the StoryAgent framework that leverages Large Language Models and generative tools to automate and enhance the creation of immersive, multi-modal digital stories from simple prompts, improving scalability and engagement.
Contribution
The paper presents a novel framework combining top-down story drafting with bottom-up asset generation, enabling automated, consistent, and interactive digital storytelling across multiple modalities.
Findings
Produces coherent digital stories without reference videos
Automates story creation, reducing manual intervention
Enhances engagement through multi-modal content
Abstract
Digital storytelling, essential in entertainment, education, and marketing, faces challenges in production scalability and flexibility. The StoryAgent framework, introduced in this paper, utilizes Large Language Models and generative tools to automate and refine digital storytelling. Employing a top-down story drafting and bottom-up asset generation approach, StoryAgent tackles key issues such as manual intervention, interactive scene orchestration, and narrative consistency. This framework enables efficient production of interactive and consistent narratives across multiple modalities, democratizing content creation and enhancing engagement. Our results demonstrate the framework's capability to produce coherent digital stories without reference videos, marking a significant advancement in automated digital storytelling.
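The top-down drafting and bottom-up asset generation split described in the abstract can be pictured with a minimal Python sketch. Everything here is an illustrative assumption rather than the paper's implementation: the `Scene` class, the functions `draft_story`, `generate_assets`, and `build_story`, and the modality names are hypothetical, and the stubs return placeholder strings where a real system would call an LLM or a generative tool.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """One scene in the outline (hypothetical structure, not from the paper)."""
    description: str
    assets: dict = field(default_factory=dict)


def draft_story(prompt: str, num_scenes: int = 3) -> list[Scene]:
    """Top-down step: expand a one-line prompt into an ordered scene outline.

    A real system would call an LLM here; this stub only labels the scenes.
    """
    return [Scene(f"Scene {i + 1} of '{prompt}'") for i in range(num_scenes)]


def generate_assets(scene: Scene, modalities=("image", "audio")) -> Scene:
    """Bottom-up step: attach one placeholder asset per modality to a scene."""
    for modality in modalities:
        scene.assets[modality] = f"<{modality} for: {scene.description}>"
    return scene


def build_story(prompt: str) -> list[Scene]:
    """Draft the outline first, then fill in assets scene by scene."""
    return [generate_assets(scene) for scene in draft_story(prompt)]


story = build_story("A fox learns to fly")
for scene in story:
    print(scene.description, sorted(scene.assets))
```

Keeping the outline step separate from the per-scene asset step means every asset is generated against the same drafted outline, which is one plausible way to support the narrative consistency the abstract mentions.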
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Artificial Intelligence in Games
