Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet Convolutional Network
Maysam Behmanesh, Peyman Adibi, Mohammad Saeed Ehsani, Jocelyn Chanussot

TL;DR
This paper introduces M-GWCN, a novel multimodal deep learning model that leverages multi-scaled graph wavelet transforms to capture intra- and cross-modality information on geometric structures, enhancing node classification performance.
Contribution
The paper proposes a new end-to-end multimodal graph neural network that effectively models intra- and cross-modality relationships without prior correspondence knowledge.
Findings
Outperforms existing spectral graph CNNs and multimodal methods.
Effective on both unimodal and multimodal graph datasets.
Demonstrates superior semi-supervised node classification accuracy.
Abstract
Multimodal data provide complementary information about a natural phenomenon by integrating data from various domains with very different statistical properties. Capturing the intra-modality and cross-modality information of multimodal data is the essential capability of multimodal learning methods. Geometry-aware data analysis approaches provide these capabilities by implicitly representing data in various modalities based on their underlying geometric structures. Also, in many applications, data are explicitly defined on an intrinsic geometric structure. Generalizing deep learning methods to non-Euclidean domains is an emerging research field, which has recently been investigated in many studies. Most of these popular methods are developed for unimodal data. In this paper, a multimodal multi-scaled graph wavelet convolutional network (M-GWCN) is proposed as an end-to-end network.…
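To make "multi-scaled graph wavelet" more concrete, here is a minimal sketch of one common construction: heat-kernel wavelet bases computed from the normalized graph Laplacian at several scales, followed by a diagonal filter applied in the wavelet domain. It is not the authors' M-GWCN implementation, which this summary does not describe. The function names, the heat kernel, the dense eigendecomposition, and the example scales are all assumptions chosen for illustration.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    safe_deg = np.where(deg > 0, deg, 1.0)           # avoid division by zero
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(safe_deg), 0.0)
    return np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def heat_wavelet_bases(adj, scales):
    """Return one (psi, psi_inv) pair per scale s (hypothetical helper).

    psi     = U diag(exp(-s * lambda)) U^T   (heat-kernel wavelet basis)
    psi_inv = U diag(exp(+s * lambda)) U^T   (its exact inverse)
    """
    eigvals, eigvecs = np.linalg.eigh(normalized_laplacian(adj))
    bases = []
    for s in scales:
        psi = eigvecs @ np.diag(np.exp(-s * eigvals)) @ eigvecs.T
        psi_inv = eigvecs @ np.diag(np.exp(s * eigvals)) @ eigvecs.T
        bases.append((psi, psi_inv))
    return bases

def wavelet_filter(x, psi, psi_inv, gains):
    """Filter node features x (n_nodes x n_features) in the wavelet domain.

    gains holds one multiplier per wavelet coefficient; in a trained
    network these would be learned parameters (assumption of this sketch).
    """
    return psi @ (gains[:, None] * (psi_inv @ x))

if __name__ == "__main__":
    # Toy 4-node path graph with 2 features per node.
    adj = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
    x = np.arange(8.0).reshape(4, 2)
    rng = np.random.default_rng(0)
    # One filtered output per scale; concatenating them is one simple
    # way to combine scales (an illustrative choice, not the paper's).
    outputs = [wavelet_filter(x, psi, psi_inv, rng.normal(size=4))
               for psi, psi_inv in heat_wavelet_bases(adj, [0.5, 1.0, 2.0])]
    print(np.concatenate(outputs, axis=1).shape)
```

Small scales keep the wavelets localized around each node, while larger scales spread them over wider neighborhoods, which is the intuition behind using several scales at once. The dense eigendecomposition is only practical for small graphs; larger graphs typically rely on polynomial approximations instead.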
