An adaptive music generation architecture for games based on the deep learning Transformer model
Gustavo Amaral Costa dos Santos, Augusto Baffa, Jean-Pierre Briot, Bruno Feijó, Antonio Luz Furtado

TL;DR
This paper introduces an adaptive music generation system for video games utilizing Transformer deep learning models, allowing customization based on player preferences and emotional context, with layered music synthesis.
Contribution
It presents a Transformer-based architecture that customizes game music to the player's taste through a player-selected training corpus, and adapts it to the gameplay situation by using an arousal-valence emotion model to select among generated musical layers.
Findings
Generates customizable game music using Transformer models.
Uses an arousal-valence emotion model to control which musical layers are selected.
Supports layered musical structures for dynamic composition.
Abstract
This paper presents an architecture for generating music for video games based on the Transformer deep learning model. Our motivation is to be able to customize the generation according to the taste of the player, who can select a corpus of training examples, corresponding to his preferred musical style. The system generates various musical layers, following the standard layering strategy currently used by composers designing video game music. To adapt the music generated to the game play and to the player(s) situation, we are using an arousal-valence model of emotions, in order to control the selection of musical layers. We discuss current limitations and prospects for the future, such as collaborative and interactive control of the musical components.
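The abstract says an arousal-valence model controls which musical layers are selected, but it does not give the mapping. The sketch below is only one plausible illustration of that idea, not the authors' method. The layer names, the [-1, 1] value range, the thresholds, and the function select_layers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One pre-generated musical layer (hypothetical structure)."""
    name: str
    min_arousal: float  # layer becomes active once arousal reaches this level
    mood: str           # "positive", "negative", or "any" valence


# Hypothetical layer set; the paper's actual layers may differ.
LAYERS = [
    Layer("pad", -1.0, "any"),
    Layer("bass", -0.5, "any"),
    Layer("percussion", 0.0, "any"),
    Layer("bright_melody", 0.3, "positive"),
    Layer("dark_melody", 0.3, "negative"),
]


def select_layers(arousal: float, valence: float) -> list[str]:
    """Return the names of layers to play for an emotional state.

    Both inputs are assumed to lie in [-1, 1]. Higher arousal stacks
    more layers; the sign of valence picks the melodic colour.
    """
    if not (-1.0 <= arousal <= 1.0 and -1.0 <= valence <= 1.0):
        raise ValueError("arousal and valence must be in [-1, 1]")
    mood = "positive" if valence >= 0 else "negative"
    return [
        layer.name
        for layer in LAYERS
        if arousal >= layer.min_arousal and layer.mood in ("any", mood)
    ]


if __name__ == "__main__":
    print(select_layers(-1.0, 0.0))   # calm state: fewest layers
    print(select_layers(0.5, -0.5))   # tense state: more layers, dark melody
```

In this sketch arousal acts as an intensity dial that adds layers one by one, while valence only switches between two melodic variants. A real system would likely also smooth transitions between layer sets.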
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Human Motion and Animation
Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Dense Connections · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing
