Long-Form Text-to-Music Generation with Adaptive Prompts: A Case Study in Tabletop Role-Playing Games Soundtracks

Felipe Marra; Lucas N. Ferreira

arXiv:2411.03948·cs.SD·May 23, 2025

TL;DR

This study explores long-form text-to-music generation for TRPG soundtracks. It introduces Babel Bardo, a system that uses LLMs to turn speech transcriptions into music descriptions, and finds that detailed, consistent descriptions improve audio quality, story alignment, and transition smoothness.

Contribution

We present Babel Bardo, a novel system integrating LLMs with text-to-music models for adaptive, long-form soundtrack generation in TRPGs, demonstrating improved audio quality and story coherence.

Findings

01. Detailed music descriptions enhance audio quality.
02. Consistent descriptions improve story alignment.
03. Adaptive prompts enable smoother transitions.

Abstract

This paper investigates the capabilities of text-to-audio music generation models in producing long-form music with prompts that change over time, focusing on soundtrack generation for Tabletop Role-Playing Games (TRPGs). We introduce Babel Bardo, a system that uses Large Language Models (LLMs) to transform speech transcriptions into music descriptions for controlling a text-to-music model. Four versions of Babel Bardo were compared in two TRPG campaigns: a baseline using direct speech transcriptions, and three LLM-based versions with varying approaches to music description generation. Evaluations considered audio quality, story alignment, and transition smoothness. Results indicate that detailed music descriptions improve audio quality, while maintaining consistency across consecutive descriptions enhances story alignment and transition smoothness.
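
The data flow described in the abstract (speech transcription, then music description, then text-to-music model) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `run_pipeline`, `baseline_describer`, `toy_consistent_describer`, and `fake_generate` are hypothetical, the "LLM" is a keyword rule, and the "music model" just returns bytes. Passing the previous description to the describer is one assumed way to model consistency across consecutive descriptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical signatures standing in for a real LLM and a real
# text-to-music model; neither is specified by the abstract.
Describer = Callable[[str, Optional[str]], str]  # (transcript, previous description) -> description
Generator = Callable[[str], bytes]               # description -> audio segment


@dataclass
class Segment:
    transcript: str
    description: str
    audio: bytes


def run_pipeline(transcripts: List[str], describe: Describer, generate: Generator) -> List[Segment]:
    """Turn consecutive transcript windows into music segments."""
    segments: List[Segment] = []
    previous: Optional[str] = None
    for text in transcripts:
        # The previous description is handed to the describer so it can
        # keep consecutive prompts consistent (an assumption of this sketch).
        description = describe(text, previous)
        segments.append(Segment(text, description, generate(description)))
        previous = description
    return segments


def baseline_describer(transcript: str, previous: Optional[str]) -> str:
    # Baseline from the abstract: the raw transcription is the prompt.
    return transcript


def toy_consistent_describer(transcript: str, previous: Optional[str]) -> str:
    # Toy stand-in for an LLM: pick a mood from a keyword, and reuse the
    # previous description verbatim when the mood has not changed.
    mood = "tense" if "dragon" in transcript.lower() else "calm"
    if previous is not None and previous.startswith(mood):
        return previous
    return f"{mood} orchestral fantasy music"


def fake_generate(description: str) -> bytes:
    # Placeholder for a text-to-music model call.
    return description.encode("utf-8")


transcripts = ["We enter the tavern.", "The bard orders a drink.", "A dragon attacks!"]
baseline = run_pipeline(transcripts, baseline_describer, fake_generate)
adaptive = run_pipeline(transcripts, toy_consistent_describer, fake_generate)
```

In this toy run, the baseline prompts change with every utterance, while the adaptive describer keeps the same prompt for the two tavern lines and switches only when the scene changes.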

Code & Models

Repositories

felipemarra/babel-bardo (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Artificial Intelligence in Games