# ATDD: Multi-lingual dataset for auto-tune detection in music recordings

**Authors:** Mahyar Gohari, Paolo Bestagini, Sergio Benini, Nicola Adami

PMC · DOI: 10.1016/j.dib.2025.112446 · Data in Brief · 2026-01-07

## TL;DR

This paper introduces a multilingual dataset for detecting auto-tuning in music recordings, covering songs in English, Mandarin, and Japanese.

## Contribution

The paper presents a multilingual dataset of auto-tuned and authentic music segments, each annotated with where pitch correction was applied.

## Key findings

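- The dataset consists of 10-second audio segments at a 16 kHz sample rate, drawn from songs in English, Mandarin, and Japanese, and contains both authentic and auto-tuned segments.
- Pitch correction (PYIN pitch detection followed by PSOLA transposition) was applied randomly to portions of each auto-tuned segment, and time-domain labels mark the exact locations of the correction for training detection models.
- The creation process is documented, and the dataset is intended for research in music analysis, machine learning, and audio signal processing.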
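To make the idea of time-domain labels concrete, here is a minimal sketch of converting corrected intervals into a per-sample binary mask and back. The label format is an assumption: this summary does not say how the dataset stores its labels, and the function names `intervals_to_mask` and `mask_to_intervals` are hypothetical. Only the 16 kHz sample rate and the 10-second segment length come from the abstract.

```python
SAMPLE_RATE = 16_000   # Hz, as stated in the abstract
SEGMENT_SECONDS = 10   # segment length, as stated in the abstract


def intervals_to_mask(intervals, sample_rate=SAMPLE_RATE, seconds=SEGMENT_SECONDS):
    """Turn (start_s, end_s) pairs into a per-sample 0/1 list (assumed label format)."""
    n = sample_rate * seconds
    mask = [0] * n
    for start_s, end_s in intervals:
        lo = max(0, round(start_s * sample_rate))
        hi = min(n, round(end_s * sample_rate))
        if hi > lo:
            mask[lo:hi] = [1] * (hi - lo)  # mark the pitch-corrected samples
    return mask


def mask_to_intervals(mask, sample_rate=SAMPLE_RATE):
    """Recover (start_s, end_s) pairs from a per-sample 0/1 list."""
    intervals = []
    start = None
    for i, value in enumerate(mask):
        if value and start is None:
            start = i                      # a corrected run begins
        elif not value and start is not None:
            intervals.append((start / sample_rate, i / sample_rate))
            start = None                   # the run has ended
    if start is not None:                  # run extends to the end of the segment
        intervals.append((start / sample_rate, len(mask) / sample_rate))
    return intervals


# Hypothetical example: one corrected region from 2.0 s to 4.5 s.
mask = intervals_to_mask([(2.0, 4.5)])
recovered = mask_to_intervals(mask)
```

A per-sample mask is convenient for sample-level or frame-level training targets, while the interval form is easier to inspect by hand. Which of the two the dataset actually ships is not stated here.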

## Abstract

This study introduces a novel multilingual dataset designed to distinguish auto-tuned musical compositions from authentic recordings, addressing a significant gap in existing resources. The dataset encompasses songs in English, Mandarin, and Japanese, ensuring a diverse representation of linguistic contexts. The data collection process began with aggregating diverse datasets from the Music Information Retrieval domain, incorporating tracks from the three specified languages to capture a wide range of musical styles and recording qualities. Each audio file was subsequently standardized into 10-second intervals at a sample rate of 16 kHz to facilitate manageable analysis. For the creation of auto-tuned samples, pitch correction was implemented using the probabilistic YIN (PYIN) algorithm for accurate pitch detection, followed by transposition via the pitch-synchronous overlap-and-add (PSOLA) technique. To emulate realistic auto-tuning scenarios, pitch correction was randomly applied to portions of each 10-second segment, ensuring variability and realism in the dataset, which makes it suitable for training robust detection models. Additionally, time-domain labels indicating the exact locations of pitch correction within each segment were generated, providing precise annotations crucial for developing accurate detection algorithms. The resulting multilingual dataset comprises a comprehensive collection of both auto-tuned and authentic musical segments across the English, Mandarin, and Japanese languages, each annotated with detailed information about pitch correction applications. This rich annotation allows for nuanced analysis and supports various research applications, while the dataset's structure and thorough documentation of its creation process make it a valuable resource for researchers in music analysis, machine learning, and audio signal processing.
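The creation steps described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: dropping an incomplete final segment, the 1 to 5 second bounds on the corrected region, and snapping to the nearest equal-tempered semitone as the correction target are all hypothetical choices, as are the function names. Pitch detection (PYIN) and resynthesis (PSOLA) are left out, since they would normally come from an audio library.

```python
import math
import random

SAMPLE_RATE = 16_000   # Hz, as stated in the abstract
SEGMENT_SECONDS = 10   # segment length, as stated in the abstract


def split_into_segments(samples, sample_rate=SAMPLE_RATE, seconds=SEGMENT_SECONDS):
    """Cut a mono signal into fixed-length segments.

    Assumption: an incomplete tail shorter than one segment is dropped.
    """
    size = sample_rate * seconds
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def pick_correction_region(rng, seconds=SEGMENT_SECONDS, min_len=1.0, max_len=5.0):
    """Choose a random (start_s, end_s) region inside one segment.

    Assumption: the length bounds are hypothetical, not taken from the paper.
    """
    length = rng.uniform(min_len, max_len)
    start = rng.uniform(0.0, seconds - length)
    return start, start + length


def nearest_semitone_hz(f0_hz, a4_hz=440.0):
    """Snap a frequency to the nearest equal-tempered semitone.

    Assumption: this is one common choice of correction target.
    """
    midi = 69 + 12 * math.log2(f0_hz / a4_hz)
    return a4_hz * 2 ** ((round(midi) - 69) / 12)


# Hypothetical usage: 25 s of silence yields two full 10 s segments.
segments = split_into_segments([0.0] * (25 * SAMPLE_RATE))
region = pick_correction_region(random.Random(0))
target = nearest_semitone_hz(445.0)   # a slightly sharp A4
```

In a full pipeline, the frames of a detected pitch contour that fall inside `region` would be mapped through a function like `nearest_semitone_hz`, and the resulting contour handed to a PSOLA implementation. The same `region` would then be written out as the time-domain label for that segment.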

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12856623/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12856623/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/PMC12856623/full.md

---
Source: https://tomesphere.com/paper/PMC12856623