# Particle Filtering for PLCA model with Application to Music Transcription

**Authors:** D. Cazau, G. Revillon, W. Yuancheng, O. Adam

arXiv: 1703.09772 · 2017-04-07

## TL;DR

This paper introduces a particle filtering approach to improve probabilistic latent component analysis for automatic music transcription, enhancing robustness and flexibility over traditional EM-based methods.

## Contribution

The paper proposes a particle filtering framework for estimating PLCA parameters, replacing EM-based estimation to obtain more robust parameter estimates and a more flexible way of integrating prior knowledge in music transcription tasks.

## Key findings

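- Achieved note-level transcription accuracies of 61.8% and 59.5% on evaluation datasets from two instrument repertoires: classical piano (from the MAPS dataset) and the marovany zither.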
- Demonstrated improved robustness over EM-based PLCA methods.
- Provided a flexible framework for future development in AMT.

## Abstract

Automatic Music Transcription (AMT) consists of automatically estimating the notes in an audio recording through three attributes: onset time, duration and pitch. Probabilistic Latent Component Analysis (PLCA) has become very popular for this task. PLCA is a spectrogram factorization method, able to model a magnitude spectrogram as a linear combination of spectral vectors from a dictionary. Such methods use the Expectation-Maximization (EM) algorithm to estimate the parameters of the acoustic model. This algorithm has well-known inherent drawbacks (local convergence, initialization dependency), which limit the application of EM-based systems to AMT, particularly with regard to the mathematical form and number of priors. To overcome these limits, we propose in this paper to employ a different estimation framework based on Particle Filtering (PF), which consists of sampling the posterior distribution over larger parameter ranges. This framework proves to be more robust in parameter estimation, and more flexible and unifying in the integration of prior knowledge in the system. Note-level transcription accuracies of 61.8% and 59.5% were achieved on evaluation sound datasets of two different instrument repertoires, including the classical piano (from the MAPS dataset) and the marovany zither, and direct comparisons to previous PLCA-based approaches are provided. Steps for further development are also outlined.
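
To make the idea in the abstract concrete, the following is a minimal sketch of a generic bootstrap particle filter that tracks the mixture weights of a fixed spectral dictionary, frame by frame. It is not the authors' algorithm: the function names (`mix`, `log_likelihood`, `particle_filter`), the multinomial-style likelihood with a pseudo-count `scale`, the log-normal random-walk transition and the toy dictionary are all assumptions chosen for illustration. Each particle is one candidate weight vector on the simplex, so the frame model is the linear combination of dictionary templates that PLCA describes.

```python
import math
import random


def normalize(values):
    """Scale non-negative values so they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def mix(templates, weights):
    """Model spectrum: convex combination of dictionary templates."""
    n_bins = len(templates[0])
    return [sum(w * t[f] for w, t in zip(weights, templates)) for f in range(n_bins)]


def log_likelihood(frame, templates, weights, scale):
    """Assumed multinomial-style log-likelihood of a normalized frame.

    `scale` acts as a pseudo-count controlling how sharply the fit is rewarded.
    """
    model = mix(templates, weights)
    return scale * sum(v * math.log(m + 1e-12) for v, m in zip(frame, model))


def particle_filter(frames, templates, n_particles=500, jitter=0.1,
                    scale=200.0, rng=None):
    """Return one weighted-mean estimate of the mixture weights per frame."""
    rng = rng or random.Random(0)
    n_templates = len(templates)
    # Initialise particles uniformly on the simplex (normalized exponentials).
    particles = [normalize([rng.expovariate(1.0) for _ in range(n_templates)])
                 for _ in range(n_particles)]
    estimates = []
    for frame in frames:
        # Propagate: multiplicative log-normal jitter, then renormalize.
        particles = [normalize([w * math.exp(rng.gauss(0.0, jitter)) for w in p])
                     for p in particles]
        # Weight each particle by how well it explains the observed frame.
        log_w = [log_likelihood(frame, templates, p, scale) for p in particles]
        max_lw = max(log_w)  # subtract the maximum for numerical stability
        weights = normalize([math.exp(lw - max_lw) for lw in log_w])
        # Estimate: importance-weighted mean of the particles.
        estimates.append([sum(w * p[k] for w, p in zip(weights, particles))
                          for k in range(n_templates)])
        # Resample in proportion to the importance weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Toy dictionary: three templates over four frequency bins (illustrative only).
TEMPLATES = [[0.7, 0.1, 0.1, 0.1],
             [0.1, 0.7, 0.1, 0.1],
             [0.1, 0.1, 0.7, 0.1]]
FRAMES = [mix(TEMPLATES, [0.8, 0.15, 0.05])] * 20  # noise-free synthetic frames
print(particle_filter(FRAMES, TEMPLATES)[-1])
```

Because the particles are drawn across the whole simplex rather than iterated from a single starting point, the estimate does not depend on one initialization in the way an EM run does. Under this sketch, a prior on the weights could be added as an extra term in `log_likelihood`.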

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.09772/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1703.09772/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1703.09772/full.md

---
Source: https://tomesphere.com/paper/1703.09772