Depthwise Convolution is All You Need for Learning Multiple Visual Domains
Yunhui Guo, Yandong Li, Rogerio Feris, Liqiang Wang, Tajana Rosing

TL;DR
This paper introduces a multi-domain learning model using depthwise separable convolution that efficiently captures shared structures across visual domains, achieving high performance with fewer parameters.
Contribution
The paper proposes a novel multi-domain learning architecture based on depthwise separable convolution with a gating mechanism for soft sharing, reducing parameters and improving performance.
Findings
Achieves highest score on Visual Decathlon Challenge
Uses only 50% of the parameters of state-of-the-art approaches
Effectively captures shared cross-domain features
Abstract
There is growing interest in designing models that can deal with images from different visual domains. If there exists a universal structure in different visual domains that can be captured via a common parameterization, then we can use a single model for all domains rather than one model per domain. A model aware of the relationships between different domains can also be trained to work on new domains with fewer resources. However, identifying the reusable structure in a model is not easy. In this paper, we propose a multi-domain learning architecture based on depthwise separable convolution. The proposed approach is based on the assumption that images from different domains share cross-channel correlations but have domain-specific spatial correlations. The proposed model is compact and has minimal overhead when applied to new domains. Additionally, we introduce a gating…
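To make the compactness claim concrete, here is a rough parameter-count sketch for a single layer. It assumes, following the abstract's stated assumption, that the 1x1 pointwise filters (cross-channel correlations) are shared across domains while the k x k depthwise filters (spatial correlations) are kept per domain. The function names and the layer sizes (64 input channels, 128 output channels, 3x3 kernels, 10 domains) are hypothetical and chosen only for illustration; biases, normalization layers, and the gating mechanism are ignored. This is not the authors' implementation.

```python
# Illustrative parameter counts for one convolutional layer (biases ignored).
# Assumption: pointwise (1x1) filters are shared across domains and
# depthwise (k x k) filters are domain-specific.

def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution."""
    return c_in * c_out * k * k

def depthwise_params(c_in, k):
    """Weights in a depthwise convolution: one k x k filter per input channel."""
    return c_in * k * k

def pointwise_params(c_in, c_out):
    """Weights in a 1x1 pointwise convolution that mixes channels."""
    return c_in * c_out

def multi_domain_params(c_in, c_out, k, num_domains):
    """One shared pointwise filter bank plus one depthwise bank per domain."""
    shared = pointwise_params(c_in, c_out)
    per_domain = depthwise_params(c_in, k)
    return shared + num_domains * per_domain

if __name__ == "__main__":
    c_in, c_out, k, domains = 64, 128, 3, 10  # hypothetical layer sizes
    independent = domains * standard_conv_params(c_in, c_out, k)
    shared_model = multi_domain_params(c_in, c_out, k, domains)
    print("independent standard conv layers:", independent)
    print("shared pointwise + per-domain depthwise:", shared_model)
    print("extra weights per new domain:", depthwise_params(c_in, k))
```

Under these assumptions, adding a domain costs only one extra depthwise filter bank for the layer, which is small next to the shared pointwise weights.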
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
