PAGURI: a user experience study of creative interaction with text-to-music models
Francesca Ronchini, Luca Comanducci, Gabriele Perego, Fabio Antonacci

TL;DR
This study explores how musicians interact with text-to-music models, revealing the models' creative potential and limitations, and provides insights for their future improvement and integration into music practice.
Contribution
It introduces PAGURI, an online tool and user research methodology to evaluate musician interactions with text-to-music models and their personalization features.
Findings
Participants see potential in text-to-music models for creativity.
Personalization features are appreciated by users.
Challenges include prompt ambiguity and limited control.
Abstract
In recent years, text-to-music models have been the biggest breakthrough in automatic music generation. While they are unquestionably a showcase of technological progress, it is not clear yet how they can be realistically integrated into the artistic practice of musicians and music practitioners. This paper aims to address this question via Prompt Audio Generation User Research Investigation (PAGURI), a user experience study where we leverage recent text-to-music developments to study how musicians and practitioners interact with these systems, evaluating their satisfaction levels. We developed an online tool through which users can generate music samples and/or apply recently proposed personalization techniques based on fine-tuning to allow the text-to-music model to generate sounds closer to their needs and preferences. Using semi-structured interviews, we analyzed different aspects…
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing
