Semantic and Semiotic Interplays in Text-to-Audio AI: Exploring Cognitive Dynamics and Musical Interactions
Guilherme Coelho

TL;DR
This paper explores how text-to-audio AI systems transform musical creation and perception by analyzing semantic and semiotic interactions, cognitive dynamics, and the potential for new listening modes and cultural insights.
Contribution
It introduces a theoretical framework combining structuralist, semiotic, and cognitive perspectives to understand AI-mediated musical signification and cognition.
Findings
AI models act as quasi-objects of musical signification
They stabilize and destabilize traditional musical forms
They foster new modes of listening and aesthetic reflexivity
Abstract
This paper investigates the emerging text-to-audio paradigm in artificial intelligence (AI), examining its transformative implications for musical creation, interpretation, and cognition. I explore the complex semantic and semiotic interplays that occur when descriptive natural-language prompts are translated into nuanced sound objects. Drawing on structuralist and post-structuralist perspectives, as well as cognitive theories of schema dynamics and metacognition, the paper considers how these AI systems reconfigure musical signification processes and navigate established cognitive frameworks. It analyzes some of the cognitive dynamics at play in AI-mediated musicking, including schema assimilation and accommodation, metacognitive reflection, and constructive perception. The paper argues that text-to-audio AI models function as…
Taxonomy
Topics: Neuroscience and Music Perception · Music Technology and Sound Studies · Sound Studies and Aurality
