TL;DR
This paper introduces a reinforcement learning approach that automatically tunes the post-processing parameters of audio event detection models, raising the event-based macro F1-score by 4-5% on DCASE challenge models.
Contribution
The paper presents a reinforcement learning framework that jointly optimizes the parameters of a post-processing stack for audio event classification, such as classification thresholds and median-filter kernel sizes, and that outperforms manual tuning.
Findings
Achieved a 4-5% increase in event-based macro F1-score on DCASE challenge models.
Showed that RL can automate the selection of post-processing parameters.
Improved audio event detection accuracy by learning the post-processing parameters instead of tuning them manually.
Abstract
We apply post-processing to the class probability distribution outputs of audio event classification models and employ reinforcement learning to jointly discover the optimal parameters for various stages of a post-processing stack, such as the classification thresholds and the kernel sizes of median filtering algorithms used to smooth out model predictions. To achieve this, we define a reinforcement learning environment where: 1) a state is the class probability distribution provided by the model for a given audio sample, 2) an action is the choice of a candidate optimal value for each parameter of the post-processing stack, 3) the reward is based on the classification accuracy metric we aim to optimize, which is the audio event-based macro F1-score in our case. We apply our post-processing to the class probability distribution outputs of two audio event classification models submitted…
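The environment described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`median_filter_1d`, `post_process`, `macro_f1`, `step`), the action layout, and the order of the stack (threshold first, then median filter) are assumptions made for illustration. The reward here is a frame-level macro F1-score, used as a simple stand-in for the event-based macro F1-score that the paper optimizes.

```python
import numpy as np

def median_filter_1d(x, kernel_size):
    """Median-smooth a 1-D sequence with an odd kernel, repeating the edge values."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    if kernel_size == 1:
        return np.array(x, dtype=float)
    pad = kernel_size // 2
    padded = np.pad(np.asarray(x, dtype=float), pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel_size)
    return np.median(windows, axis=1)

def post_process(probs, thresholds, kernel_sizes):
    """Assumed stack: per-class threshold, then per-class median filter.

    probs has shape (frames, classes); the result is a 0/1 array of the same shape.
    """
    decisions = np.zeros(probs.shape, dtype=int)
    for c in range(probs.shape[1]):
        binary = (probs[:, c] >= thresholds[c]).astype(float)
        decisions[:, c] = median_filter_1d(binary, kernel_sizes[c]).astype(int)
    return decisions

def macro_f1(pred, target):
    """Frame-level macro F1 (a stand-in for the event-based metric)."""
    scores = []
    for c in range(target.shape[1]):
        tp = np.sum((pred[:, c] == 1) & (target[:, c] == 1))
        fp = np.sum((pred[:, c] == 1) & (target[:, c] == 0))
        fn = np.sum((pred[:, c] == 0) & (target[:, c] == 1))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 1.0)
    return float(np.mean(scores))

def step(state, target, action):
    """One environment step: state = model probabilities, action = candidate
    parameter values (hypothetical dict layout), return value = reward."""
    pred = post_process(state, action["thresholds"], action["kernel_sizes"])
    return macro_f1(pred, target)

# Toy example: one class, nine frames, with a one-frame dip inside the true event.
probs = np.array([0.1, 0.2, 0.8, 0.9, 0.3, 0.9, 0.8, 0.1, 0.2]).reshape(-1, 1)
target = np.array([0, 0, 1, 1, 1, 1, 1, 0, 0]).reshape(-1, 1)

print(step(probs, target, {"thresholds": [0.5], "kernel_sizes": [3]}))  # 1.0
print(step(probs, target, {"thresholds": [0.5], "kernel_sizes": [1]}))  # about 0.889
```

In this toy case the kernel of size 3 fills the one-frame gap in the thresholded decisions, so that action earns a higher reward than the unsmoothed one. An RL agent would search over such actions to maximize the reward; the search procedure itself is not shown here.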
