Latent Code-Based Fusion: A Volterra Neural Network Approach

Sally Ghanem, Siddharth Roheda, and Hamid Krim

arXiv:2104.04829 · cs.CV · April 13, 2021

Open Access

TL;DR

This paper introduces a Volterra Neural Network-based deep encoder for multi-modal data fusion, demonstrating improved clustering, sample efficiency, and robustness over traditional CNN auto-encoders.

Contribution

It presents a novel VNN-based auto-encoder architecture that reduces parameter complexity and enhances multi-modal data fusion capabilities.

Findings

01 Significant improvement in clustering performance over CNN auto-encoders

02 Enhanced sample efficiency compared to CNN-based auto-encoders

03 Robust classification performance across datasets

Abstract

We propose a deep structure encoder using the recently introduced Volterra Neural Networks (VNNs) to seek a latent representation of multi-modal data whose features are jointly captured by a union of subspaces. The so-called self-representation embedding of the latent codes leads to a simplified fusion which is driven by a similarly constructed decoding. The reduction in parameter complexity achieved by the Volterra Filter architecture is primarily due to the controlled non-linearities introduced by the higher-order convolutions, in contrast to generalized activation functions. Experimental results on two different datasets show a significant improvement in clustering performance for the VNN auto-encoder over a conventional Convolutional Neural Network (CNN) auto-encoder. In addition, we also show that the proposed approach demonstrates a much-improved sample complexity over CNN-based…
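
To make the idea of "controlled non-linearities introduced by higher-order convolutions" concrete, here is a minimal sketch of a second-order Volterra filter on a 1-D signal. It is an illustration only, not the authors' implementation: the function name `volterra2_1d`, the kernel names `w1` and `w2`, the 1-D setting, the sliding-window (correlation-style) indexing, and the omission of a bias term are all assumptions made for this example.

```python
import numpy as np


def volterra2_1d(x, w1, w2):
    """Second-order Volterra filter over a 1-D signal (hypothetical helper).

    For each window position t of length k = len(w1):
        y[t] = sum_i w1[i] * x[t+i]
             + sum_{i,j} w2[i, j] * x[t+i] * x[t+j]

    The quadratic term is the only source of non-linearity; no activation
    function is applied.
    """
    x = np.asarray(x, dtype=float)
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    k = w1.shape[0]
    if w2.shape != (k, k):
        raise ValueError("w2 must have shape (k, k) where k = len(w1)")
    if x.shape[0] < k:
        raise ValueError("signal is shorter than the kernel")

    n_out = x.shape[0] - k + 1
    y = np.empty(n_out)
    for t in range(n_out):
        window = x[t:t + k]
        linear = w1 @ window                # first-order (ordinary) term
        quadratic = window @ w2 @ window    # second-order interaction term
        y[t] = linear + quadratic
    return y


# Small usage example with hand-picked (not learned) kernels.
signal = np.array([1.0, 2.0, 3.0])
w1_demo = np.array([1.0, 0.0])
w2_demo = np.array([[0.0, 1.0],
                    [0.0, 0.0]])
out = volterra2_1d(signal, w1_demo, w2_demo)
```

In this sketch a kernel of length k carries k first-order and k² second-order weights, and the degree of non-linearity is fixed by the filter order rather than by a separately chosen activation function.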

Taxonomy

Topics: Speech and Audio Processing · Neural Networks and Applications · Animal Vocal Communication and Behavior