Self-Augmented Multi-Modal Feature Embedding
Shinnosuke Matsuo, Seiichi Uchida, Brian Kenji Iwana

TL;DR
This paper introduces a self-augmentation technique combined with multi-modal feature embedding to improve classification by leveraging complementary information from different data modalities, demonstrated on handwriting and leaf image datasets.
Contribution
The paper proposes a self-augmented multi-modal embedding method that maps different data modalities into a shared feature space, using their complementary information to improve classification performance.
Findings
Effective embeddings achieved on handwriting and leaf image datasets
Improved classification accuracy over baseline methods
Demonstrates the benefit of combining self-augmentation with multi-modal data
Abstract
Oftentimes, patterns can be represented through different modalities. For example, leaf data can be in the form of images or contours. Handwritten characters can also be either online or offline. To exploit this fact, we propose the use of self-augmentation and combine it with multi-modal feature embedding. In order to take advantage of the complementary information from the different modalities, the self-augmented multi-modal feature embedding employs a shared feature space. Through experimental results on classification with online handwriting and leaf images, we demonstrate that the proposed method can create effective embeddings.
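To make the idea of a shared feature space concrete, the following is a minimal sketch and not the authors' implementation. It assumes, beyond what the abstract states, that self-augmentation means deriving a second modality from the same sample, here by rendering an online handwriting stroke as an offline bitmap. The helper names (rasterize_stroke, embed, random_matrix), the embedding size, the random projection weights that stand in for trained encoders, and the squared-distance alignment measure are all hypothetical choices made for illustration.

```python
import math
import random


def rasterize_stroke(points, size=16):
    """Render an online stroke (list of (x, y) in [0, 1]) as a size x size bitmap."""
    img = [[0.0] * size for _ in range(size)]
    steps = 2 * size  # dense interpolation so the rendered line has no gaps
    for (x0, y0), (x1, y1) in zip(points[:-1], points[1:]):
        for i in range(steps + 1):
            t = i / steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            col = min(int(x * size), size - 1)
            row = min(int(y * size), size - 1)
            img[row][col] = 1.0
    return img


def embed(features, weights):
    """Linearly project a flat feature vector and L2-normalize the result."""
    z = [sum(w * f for w, f in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in z)) or 1.0
    return [v / norm for v in z]


def random_matrix(rows, cols, rng):
    """Random weights; a stand-in (assumption) for a trained encoder."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Toy online sample: pen positions of one stroke.
stroke = [(0.1, 0.1), (0.9, 0.1), (0.9, 0.9)]
# Derived offline sample of the same pattern (assumed form of self-augmentation).
image = rasterize_stroke(stroke)

online_features = [c for point in stroke for c in point]
offline_features = [pixel for row in image for pixel in row]

dim = 8  # hypothetical size of the shared feature space
w_online = random_matrix(dim, len(online_features), rng)
w_offline = random_matrix(dim, len(offline_features), rng)

z_online = embed(online_features, w_online)
z_offline = embed(offline_features, w_offline)

# Squared distance between the paired embeddings. A training objective
# could minimize this so both modalities of one sample land close together.
alignment_loss = sum((a - b) ** 2 for a, b in zip(z_online, z_offline))
```

Because both embeddings are unit-length vectors of the same dimension, they can be compared directly, and a single classifier could in principle be trained on either modality or on both together.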
Taxonomy
Topics: Video Analysis and Summarization · Music and Audio Processing · Image Retrieval and Classification Techniques
