Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet Convolutional Network

Maysam Behmanesh; Peyman Adibi; Mohammad Saeed Ehsani; Jocelyn Chanussot

arXiv:2111.13361 · cs.LG · November 29, 2021

TL;DR

This paper introduces M-GWCN, a novel multimodal deep learning model that leverages multi-scaled graph wavelet transforms to capture intra- and cross-modality information on geometric structures, enhancing node classification performance.

Contribution

The paper proposes a new end-to-end multimodal graph neural network that effectively models intra- and cross-modality relationships without prior correspondence knowledge.

Findings

1. Outperforms existing spectral graph CNNs and multimodal methods.
2. Effective on both unimodal and multimodal graph datasets.
3. Demonstrates superior semi-supervised node classification accuracy.

Abstract

Multimodal data provide complementary information about a natural phenomenon by integrating data from various domains with very different statistical properties. Capturing the intra-modality and cross-modality information of multimodal data is an essential capability of multimodal learning methods. Geometry-aware data analysis approaches provide this capability by implicitly representing data in various modalities based on their underlying geometric structures. Moreover, in many applications, data are explicitly defined on an intrinsic geometric structure. Generalizing deep learning methods to non-Euclidean domains is an emerging research field that has recently been investigated in many studies. Most of these popular methods are developed for unimodal data. In this paper, a multimodal multi-scaled graph wavelet convolutional network (M-GWCN) is proposed as an end-to-end network.…
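
To make the phrase "multi-scaled graph wavelet" more concrete, the sketch below builds one spectral filter basis per scale on a small graph with NumPy. It is only an illustration under stated assumptions, not the M-GWCN implementation: the heat kernel exp(-s·λ), the symmetric normalized Laplacian, the function name `graph_wavelet_bases`, and the scale values are all choices made here for the example, since this excerpt does not specify the paper's kernel or scales.

```python
import numpy as np

def graph_wavelet_bases(adj, scales):
    """Return one heat-kernel filter basis per scale for an undirected graph.

    Assumption: the kernel g(s * lam) = exp(-s * lam) is picked for
    illustration only; the paper's actual kernel is not given in this excerpt.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

    # Eigendecomposition L = U diag(lam) U^T (L is symmetric, so eigh applies)
    eigvals, eigvecs = np.linalg.eigh(lap)

    # One basis per scale: psi_s = U diag(exp(-s * lam)) U^T
    return [eigvecs @ np.diag(np.exp(-s * eigvals)) @ eigvecs.T for s in scales]

# Hypothetical toy input: a path graph on 4 nodes (0 - 1 - 2 - 3)
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

bases = graph_wavelet_bases(adj, scales=[0.5, 2.0])

# A signal concentrated on node 0, filtered at each scale
x = np.array([1.0, 0.0, 0.0, 0.0])
coeffs = [psi @ x for psi in bases]
```

With this kernel, a small scale keeps the signal localized around node 0, while a larger scale spreads it further along the graph, which is the sense in which several scales capture neighborhoods of different sizes.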
