MoE-SPNet: A Mixture-of-Experts Scene Parsing Network
Huan Fu, Mingming Gong, Chaohui Wang, and Dacheng Tao

TL;DR
This paper introduces MoE-SPNet, a scene parsing network that uses a mixture-of-experts approach to weight multi-scale features adaptively at each spatial location, improving scene parsing accuracy.
Contribution
It proposes a mixture-of-experts layer and an adaptive hierarchical feature aggregation mechanism for better multi-scale feature utilization in scene parsing.
Findings
Improved accuracy on the PASCAL VOC 2012 dataset.
Enhanced performance on the SceneParse150 dataset.
Spatially adaptive weighting of multi-scale features improves scene parsing results.
Abstract
Scene parsing is an indispensable component in understanding the semantics within a scene. Traditional methods rely on handcrafted local features and probabilistic graphical models to incorporate local and global cues. Recently, methods based on fully convolutional neural networks have achieved new records on scene parsing. An important strategy common to these methods is the aggregation of hierarchical features yielded by a deep convolutional neural network. However, typical algorithms usually aggregate hierarchical convolutional features via concatenation or linear combination, which cannot sufficiently exploit the diversities of contextual information in multi-scale features and the spatial inhomogeneity of a scene. In this paper, we propose a mixture-of-experts scene parsing network (MoE-SPNet) that incorporates a convolutional mixture-of-experts layer to assess the importance of…
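The contrast drawn above, between aggregating multi-scale features by a fixed combination and weighting them per location, can be illustrated with a small sketch. This is not the authors' implementation: the abstract is truncated before it describes the layer, so everything here is an assumption made for illustration, including the function name `moe_aggregate`, the use of a 1x1 linear gate over the stacked features, the softmax over scales, and the premise that all feature maps have already been resized to a common resolution.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_aggregate(features, gate_weight, gate_bias):
    """Hypothetical per-pixel mixture over K feature maps.

    features:    (K, C, H, W) feature maps from K scales, same resolution (assumed)
    gate_weight: (K, K*C) weights of an assumed 1x1 linear gate
    gate_bias:   (K,) gate bias
    Returns the fused map (C, H, W) and the gates (K, H, W).
    """
    K, C, H, W = features.shape
    stacked = features.reshape(K * C, H, W)
    # A 1x1 convolution is a linear map applied independently at each pixel.
    logits = np.einsum("kd,dhw->khw", gate_weight, stacked)
    logits = logits + gate_bias[:, None, None]
    # One weight per scale per pixel; the weights sum to 1 across scales.
    gates = softmax(logits, axis=0)
    # Weighted sum over scales, broadcasting each gate over the channels.
    fused = (gates[:, None, :, :] * features).sum(axis=0)
    return fused, gates

rng = np.random.default_rng(0)
K, C, H, W = 3, 8, 4, 5
features = rng.standard_normal((K, C, H, W))
gate_weight = rng.standard_normal((K, K * C)) * 0.1
gate_bias = np.zeros(K)
fused, gates = moe_aggregate(features, gate_weight, gate_bias)
print(fused.shape, gates.shape)
```

With an all-zero gate, every scale receives the same weight at every pixel and the result reduces to a plain average of the feature maps, which is one instance of the fixed linear combination the abstract criticizes. A learned gate instead lets the weights vary from pixel to pixel.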
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
