Class Similarity-Based Multimodal Classification under Heterogeneous Category Sets
Yangrui Zhu, Junhua Bao, Yipan Wei, Yapeng Li, Bo Du

TL;DR
This paper introduces a novel multimodal classification framework that handles heterogeneous category sets across modalities, leveraging class similarity and uncertainty estimation to improve recognition accuracy in real-world scenarios.
Contribution
The paper proposes the CSCF model for multi-modal heterogeneous category-set learning, enabling effective cross-modal knowledge transfer and decision fusion.
Findings
Significantly outperforms existing methods on benchmark datasets.
Effectively recognizes complete class sets across modalities.
Addresses real-world heterogeneity in multimodal data.
Abstract
Existing multimodal methods typically assume that different modalities share the same category set. However, in real-world applications, the category distributions in multimodal data are often inconsistent, which can hinder a model's ability to use cross-modal information to recognize all categories. In this work, we propose a practical setting termed Multi-Modal Heterogeneous Category-set Learning (MMHCL), where models are trained on multi-modal data with heterogeneous category sets and aim to recognize the complete class set of all modalities at test time. To address this task, we propose a Class Similarity-based Cross-modal Fusion model (CSCF). Specifically, CSCF aligns modality-specific features to a shared semantic space to enable knowledge transfer between seen and unseen classes. It then selects the most discriminative modality for decision fusion…
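Illustrative sketch
The abstract is truncated here, so the following is only a toy illustration of the two ideas it names (transfer between seen and unseen classes through a shared semantic space, then picking one modality for the decision), not the authors' CSCF implementation. Everything in it is an assumption made for illustration: the class embeddings, the cosine-similarity weighting, the temperature value, the use of entropy as the uncertainty measure, and all function names.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

def extend_to_full_class_set(p_seen, seen_idx, class_emb, temperature=0.1):
    """Spread a modality's probabilities over its seen classes onto ALL classes.

    Hypothetical rule: each seen class shares its probability mass with every
    class in proportion to a softmax over cosine similarities between class
    embeddings in a shared semantic space.
    """
    emb = class_emb / np.linalg.norm(class_emb, axis=1, keepdims=True)
    sim = emb @ emb[seen_idx].T                    # (num_classes, num_seen)
    weights = softmax(sim / temperature, axis=0)   # each column sums to 1
    return weights @ p_seen                        # (num_classes,), sums to 1

def entropy(p):
    """Shannon entropy, used here as a simple stand-in for uncertainty."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def fuse_by_lowest_entropy(full_probs):
    """Pick the modality with the least uncertain prediction and return its class."""
    best = min(full_probs, key=lambda name: entropy(full_probs[name]))
    return best, int(np.argmax(full_probs[best]))

# Toy setup: 4 classes in total; each modality was trained on only 3 of them.
class_emb = np.eye(4)                              # hypothetical class embeddings
seen = {"image": [0, 1, 2], "text": [1, 2, 3]}
p_seen = {
    "image": np.array([0.40, 0.30, 0.30]),         # unsure among its seen classes
    "text": np.array([0.05, 0.05, 0.90]),          # confident about class 3
}

full = {
    name: extend_to_full_class_set(p_seen[name], seen[name], class_emb)
    for name in seen
}
chosen, predicted = fuse_by_lowest_entropy(full)
print(chosen, predicted)
```

In this toy run the image modality never saw class 3, so it can only assign that class a small amount of similarity-weighted mass, while the text modality is confident about it. Selecting the lower-entropy modality is one simple way to let the better-informed modality decide; the paper's actual selection and fusion rule may differ.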
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Text and Document Classification Technologies
Methods: Sparse Evolutionary Training
