LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance
Ioannis Prokopiou, Ioannis Sina, Agisilaos Kounelis, Pantelis Vikatos, Themos Stafylakis

TL;DR
LabelBuddy is an open-source, AI-assisted audio annotation tool that enables collaborative, customizable, and human-aligned music and audio tagging, addressing infrastructure gaps in MIR.
Contribution
It introduces a flexible, containerized system for AI-assisted audio annotation that supports multi-user collaboration and custom model integration.
Findings
Supports human-AI collaborative annotation
Decouples interface from inference for flexibility
Enables multi-user consensus and model customization
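As an illustration of the multi-user consensus finding, here is a minimal sketch of majority-vote label aggregation. The source does not say how LabelBuddy resolves disagreement, so the function name `majority_label` and the `min_agreement` threshold are hypothetical and are not part of the tool's API.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the label chosen by more than `min_agreement` of annotators.

    Hypothetical helper: returns None when there are no votes or when no
    label clears the agreement threshold, so the clip can be sent back
    for further review.
    """
    if not votes:
        return None
    # most_common(1) yields the single (label, count) pair with the highest count.
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_agreement else None
```

In this sketch a clip tagged "speech" by two of three annotators resolves to "speech", while an even split resolves to no label.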
Abstract
The advancement of machine learning (ML), Large Audio Language Models (LALMs), and autonomous AI agents in Music Information Retrieval (MIR) necessitates a shift from static tagging to rich, human-aligned representation learning. However, the scarcity of open-source infrastructure capable of capturing the subjective nuances of audio annotation remains a critical bottleneck. This paper introduces LabelBuddy, an open-source collaborative auto-tagging audio annotation tool designed to bridge the gap between human intent and machine understanding. Unlike static tools, it decouples the interface from inference via containerized backends, allowing users to plug in custom models for AI-assisted pre-annotation. We describe the system architecture, which supports multi-user consensus, containerized model isolation, and a roadmap for extending agents and LALMs. Code available at…
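The decoupling of interface from inference described in the abstract can be sketched as a small backend contract. This is an assumption-laden illustration, not LabelBuddy's actual code: `Segment`, `InferenceBackend`, `FixedWindowBackend`, and `pre_annotate` are hypothetical names, and the toy backend runs in-process where a real deployment would call a containerized model.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Segment:
    start: float           # segment start, in seconds
    end: float             # segment end, in seconds
    label: str
    source: str = "model"  # "model" for pre-annotations, "human" once reviewed


class InferenceBackend(Protocol):
    """Hypothetical contract: anything with a matching predict() can be plugged in."""

    def predict(self, audio_path: str) -> List[Segment]: ...


class FixedWindowBackend:
    """Toy stand-in for a custom model: labels fixed-length windows."""

    def __init__(self, window: float, duration: float, label: str) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        self.duration = duration
        self.label = label

    def predict(self, audio_path: str) -> List[Segment]:
        # The path is ignored here; a real backend would load and analyse the audio.
        segments: List[Segment] = []
        t = 0.0
        while t < self.duration:
            end = min(t + self.window, self.duration)
            segments.append(Segment(t, end, self.label))
            t = end
        return segments


def pre_annotate(backend: InferenceBackend, audio_path: str) -> List[Segment]:
    """The interface side depends only on the contract, not on a specific model."""
    return backend.predict(audio_path)
```

Because `pre_annotate` relies only on the `predict` contract, swapping in a different model would not require changes on the interface side, which is the property the abstract attributes to the containerized design.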
Taxonomy
Topics: Music and Audio Processing · AI in Service Interactions · Topic Modeling
