TMac: Temporal Multi-Modal Graph Learning for Acoustic Event Classification

Meng Liu; Ke Liang; Dayu Hu; Hao Yu; Yue Liu; Lingyuan Meng; Wenxuan Tu; Sihang Zhou; Xinwang Liu

arXiv:2309.11845·cs.SD·September 27, 2023

TL;DR

TMac introduces a graph-based approach to model temporal relationships in multi-modal audiovisual data, significantly improving acoustic event classification by capturing dynamic intra- and inter-modal information.

Contribution

The paper presents a novel graph learning method that explicitly models temporal relations in multi-modal data for acoustic event classification, outperforming existing methods.

Findings

01. TMac achieves superior performance over state-of-the-art models.

02. Explicit temporal modeling enhances multi-modal acoustic event classification.

03. The approach effectively captures intra- and inter-modal temporal dynamics.

Abstract

Audiovisual data is everywhere in this digital age, which places higher demands on the deep learning models built on it. Handling the information in multi-modal data well is the key to a better audiovisual model. We observe that audiovisual data naturally has temporal attributes, such as the time information of each frame in a video. More concretely, such data is inherently multi-modal, combining audio and visual cues that proceed in strict chronological order. This indicates that temporal information is important in multi-modal acoustic event modeling, for both intra- and inter-modal relations. However, existing methods deal with each modal feature independently and simply fuse them together, which neglects the mining of temporal relations and thus leads to sub-optimal performance. With this motivation, we propose a Temporal Multi-modal graph learning method for…

Code & Models

Repositories

mgithubl/tmac
PyTorch, official
