Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision
Hao Dong, Eleni Chatzi, Olga Fink

TL;DR
This paper introduces a novel self-supervised learning approach for multimodal open-set domain generalization and adaptation, leveraging new pretext tasks and an entropy weighting mechanism to improve recognition of unseen classes across domains.
Contribution
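The paper presents the first method for Multimodal Open-Set Domain Generalization (MM-OSDG). It relies on self-supervision through two new pretext tasks, Masked Cross-modal Translation and Multimodal Jigsaw Puzzles, together with an entropy weighting mechanism that balances the loss across modalities.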
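The entropy-based balancing of per-modality losses mentioned above can be illustrated with a small sketch. This summary does not give the paper's actual formula, so the weighting rule here (a softmax over negative prediction entropies, so that a more confident modality gets a larger weight) is an assumption, and the names `entropy_weights` and `weighted_loss` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_weights(per_modality_logits):
    """One weight per modality, summing to 1.

    Assumption (not taken from the paper): weights are a softmax over
    negative entropies, so lower-entropy modalities weigh more.
    """
    entropies = [entropy(softmax(logits)) for logits in per_modality_logits]
    return softmax([-h for h in entropies])


def weighted_loss(per_modality_losses, weights):
    """Combine per-modality losses using the given weights."""
    return sum(w * l for w, l in zip(weights, per_modality_losses))


# Hypothetical logits for one sample: video is confident, audio is not.
video_logits = [5.0, 0.0, 0.0]
audio_logits = [0.1, 0.0, 0.0]
weights = entropy_weights([video_logits, audio_logits])
total = weighted_loss([0.2, 1.1], weights)
```

In this toy case the video branch receives the larger weight because its prediction has lower entropy. Whether the paper weights confident modalities up or down is not stated in this summary.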
Findings
Recognizes novel classes effectively in unseen domains
Improves multimodal feature learning through self-supervised pretext tasks
Demonstrates versatility on multiple datasets and settings
Abstract
The task of open-set domain generalization (OSDG) involves recognizing novel classes within unseen domains, which becomes more challenging with multiple modalities as input. Existing works have only addressed unimodal OSDG within the meta-learning framework, without considering multimodal scenarios. In this work, we introduce a novel approach to address Multimodal Open-Set Domain Generalization (MM-OSDG) for the first time, utilizing self-supervision. To this end, we introduce two innovative multimodal self-supervised pretext tasks: Masked Cross-modal Translation and Multimodal Jigsaw Puzzles. These tasks facilitate the learning of multimodal representative features, thereby enhancing generalization and open-class detection capabilities. Additionally, we propose a novel entropy weighting mechanism to balance the loss across different modalities. Furthermore, we extend our approach to…
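To make the idea of a multimodal jigsaw pretext task concrete, here is a minimal sketch of how a training sample might be built. The abstract above does not describe how the paper forms or shuffles pieces, so the design is an assumption: each modality's feature vector is cut into `k` chunks, the chunks from all modalities are shuffled together by one permutation from a fixed set, and the index of that permutation is the classification label. The names `split_chunks` and `make_jigsaw_sample` are hypothetical.

```python
import itertools
import random


def split_chunks(vec, k):
    """Split a feature vector into k equal chunks (trailing remainder is dropped)."""
    size = len(vec) // k
    return [vec[i * size:(i + 1) * size] for i in range(k)]


def make_jigsaw_sample(features, k, permutations, rng):
    """Build one shuffled sample and its permutation-index label.

    features: list of per-modality feature vectors (lists of floats).
    permutations: fixed list of permutations over all len(features) * k pieces.
    Assumption (not from the paper): pieces from all modalities are mixed.
    """
    pieces = [chunk for vec in features for chunk in split_chunks(vec, k)]
    label = rng.randrange(len(permutations))
    order = permutations[label]
    shuffled = [pieces[i] for i in order]
    flat = [x for piece in shuffled for x in piece]
    return flat, label


# Toy features for two modalities, two chunks each, giving four pieces.
video_feat = [1.0, 2.0, 3.0, 4.0]
audio_feat = [5.0, 6.0, 7.0, 8.0]
all_perms = list(itertools.permutations(range(4)))
rng = random.Random(0)
shuffled_feat, perm_label = make_jigsaw_sample(
    [video_feat, audio_feat], 2, all_perms, rng
)
```

A classifier trained to predict `perm_label` from `shuffled_feat` has to reason about which pieces belong to which modality and in what order, which is the general intuition behind jigsaw-style self-supervision.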
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Text and Document Classification Technologies
Methods: Jigsaw
