Self-Augmented Multi-Modal Feature Embedding

Shinnosuke Matsuo; Seiichi Uchida; Brian Kenji Iwana

arXiv:2103.04731 · cs.CV · March 9, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a self-augmentation technique combined with multi-modal feature embedding to improve classification by leveraging complementary information from different data modalities, demonstrated on handwriting and leaf image datasets.

Contribution

It proposes a self-augmented multi-modal embedding method that maps different data modalities into a shared feature space, so that their complementary information can be used to improve classification.

Findings

01. Effective embeddings achieved on handwriting and leaf image datasets

02. Improved classification accuracy over baseline methods

03. Demonstrates the benefit of combining self-augmentation with multi-modal data

Abstract

Oftentimes, patterns can be represented through different modalities. For example, leaf data can be in the form of images or contours. Handwritten characters can also be either online or offline. To exploit this fact, we propose the use of self-augmentation and combine it with multi-modal feature embedding. In order to take advantage of the complementary information from the different modalities, the self-augmented multi-modal feature embedding employs a shared feature space. Through experimental results on classification with online handwriting and leaf images, we demonstrate that the proposed method can create effective embeddings.
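
The two ideas in the abstract, deriving a second modality from the first and mapping both into one shared space, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy diagonal stroke, the helper names `rasterize`, `embed`, and `random_projection`, and the untrained random linear projections that stand in for learned encoders. Training and any loss function are omitted entirely.

```python
import math
import random

def rasterize(points, size=16):
    """Derive an 'offline' image from 'online' pen points (illustrative self-augmentation).

    points: iterable of (x, y) pairs with coordinates in [0, 1].
    Returns a size x size grid with 1.0 wherever a point lands.
    """
    img = [[0.0] * size for _ in range(size)]
    for x, y in points:
        col = min(size - 1, max(0, round(x * (size - 1))))
        row = min(size - 1, max(0, round(y * (size - 1))))
        img[row][col] = 1.0
    return img

def embed(values, weights):
    """Project a flat input vector into the shared space and L2-normalize it."""
    dim = len(weights[0])
    z = [sum(v * w[j] for v, w in zip(values, weights)) for j in range(dim)]
    norm = math.sqrt(sum(c * c for c in z)) + 1e-8
    return [c / norm for c in z]

def random_projection(n_inputs, dim, rng):
    """Untrained stand-in for a modality-specific encoder (an assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_inputs)]

rng = random.Random(0)
T, size, dim = 32, 16, 8

# Hypothetical online sample: a diagonal stroke from (0, 0) to (1, 1).
online = [(t / (T - 1), t / (T - 1)) for t in range(T)]
offline = rasterize(online, size)  # second modality generated from the first

online_flat = [c for point in online for c in point]
offline_flat = [p for row in offline for p in row]

# One projection per modality; both map into the same dim-dimensional space.
w_online = random_projection(len(online_flat), dim, rng)
w_offline = random_projection(len(offline_flat), dim, rng)

z_online = embed(online_flat, w_online)
z_offline = embed(offline_flat, w_offline)

# Because both embeddings share one space, they can be compared directly.
distance = math.dist(z_online, z_offline)
```

In a trained system the two projections would presumably be learned so that embeddings of the same pattern land close together across modalities. Here the weights are random, so `distance` carries no meaning beyond showing that the comparison is well defined.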

Taxonomy

Topics: Video Analysis and Summarization · Music and Audio Processing · Image Retrieval and Classification Techniques