Joint Analysis of Acoustic Scenes and Sound Events Based on Multitask Learning with Dynamic Weight Adaptation

Kayo Nada; Keisuke Imoto; Takao Tsuchiya

arXiv:2206.10349 · cs.SD · June 22, 2022 · 1 citation

TL;DR

This paper introduces dynamic weight adaptation methods for multitask learning models that jointly analyze acoustic scenes and sound events, improving performance by automatically balancing task losses during training.

Contribution

It proposes novel dynamic weight adaptation techniques based on dynamic weight average and multi-focal loss for joint ASC and SED analysis in MTL models.

Findings

01 Improved scene classification accuracy

02 Enhanced sound event detection performance

03 Dynamic weights adapt effectively during training

Abstract

Acoustic scene classification (ASC) and sound event detection (SED) are major topics in environmental sound analysis. Considering that acoustic scenes and sound events are closely related to each other, the joint analysis of acoustic scenes and sound events using multitask learning (MTL)-based neural networks was proposed in some previous works. Conventional methods train MTL-based models using a linear combination of ASC and SED loss functions with constant weights. However, the performance of conventional MTL-based methods depends strongly on the weights of the ASC and SED losses, and it is difficult to determine the appropriate balance between the constant weights of the losses of MTL of ASC and SED. In this paper, we thus propose dynamic weight adaptation methods for MTL of ASC and SED based on dynamic weight average and multi-focal loss to adjust the learning weights…
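To make the contrast between constant and dynamic loss weights concrete, here is a minimal sketch of dynamic weight average as it is commonly formulated in the multitask learning literature: each task's weight is a softmax over the ratio of its two most recent average losses, scaled so the weights sum to the number of tasks. The truncated abstract does not give the paper's exact variant, so the temperature value, the function names, and the toy loss values below are all assumptions for illustration.

```python
import math


def dwa_weights(loss_history, temperature=2.0):
    """Return one weight per task from per-epoch average losses.

    loss_history: list of epochs, each a list of per-task losses,
    e.g. [ASC loss, SED loss]. Hypothetical helper, not the paper's code.
    """
    num_tasks = len(loss_history[-1]) if loss_history else 2
    # With fewer than two recorded epochs there is no descent rate yet,
    # so fall back to equal weights.
    if len(loss_history) < 2:
        return [1.0] * num_tasks
    prev, prev_prev = loss_history[-1], loss_history[-2]
    # Ratio near 1 means the task's loss is falling slowly.
    ratios = [p / pp for p, pp in zip(prev, prev_prev)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    # Softmax scaled so the weights sum to the number of tasks.
    return [num_tasks * e / total for e in exps]


def combined_loss(task_losses, weights):
    """Weighted linear combination of the task losses."""
    return sum(w * l for w, l in zip(weights, task_losses))


# Toy values (assumed): ASC loss halves, SED loss barely moves.
history = [[1.0, 1.0], [0.5, 0.9]]
weights = dwa_weights(history)
total = combined_loss([0.45, 0.85], weights)
```

In this toy run the SED loss decreases more slowly than the ASC loss, so SED receives the larger weight in the next epoch. A constant-weight baseline corresponds to calling `combined_loss` with a fixed list such as `[1.0, 1.0]` at every epoch.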

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Underwater Acoustics Research