A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks

Boyu Zhang, Azadeh Davoodi, and Yu-Hen Hu

arXiv:1811.00056 · cs.CV · November 2, 2018 · 1 citation

TL;DR

This paper introduces a Mixture of Experts architecture combining a global and local expert with a gating network to enable low-cost, privacy-preserving customization of deep neural networks on edge devices, demonstrated on handwritten character recognition.

Contribution

The paper proposes a novel MOE architecture that allows efficient local customization of DNNs with minimal retraining and privacy protection, specifically applied to handwritten character recognition.

Findings

01

Significant accuracy improvement with local experts on customized data

02

Minimal overhead (around 2.5%) in energy and network size

03

Effective personalization without degrading generic model performance

Abstract

The ability to customize a trained Deep Neural Network (DNN) locally using user-specific data may greatly enhance user experience, reduce development costs, and protect users' privacy. In this work, we propose a novel Mixture of Experts (MOE) approach to accomplish this goal. The architecture comprises a Global Expert (GE), a Local Expert (LE), and a Gating Network (GN). The GE is a DNN trained on a large dataset representative of many potential users. After deployment on an embedded edge device, the GE will encounter customized, user-specific data (e.g., accented speech), and its performance may suffer. This problem may be alleviated by training a local DNN (the local expert, LE) on a small customized training dataset to correct the errors made by the GE. A gating network is then trained to determine whether an incoming data sample should be handled by…
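The routing idea in the abstract can be illustrated with a minimal Python sketch. It assumes the gate makes a hard per-sample choice between the GE and the LE, which the truncated abstract does not spell out. The function name `moe_predict`, the `threshold` parameter, and the toy experts are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

def moe_predict(x, global_expert, local_expert, gate, threshold=0.5):
    """Route each sample to the local or global expert (assumed hard gating)."""
    x = np.asarray(x)
    # gate(x) is assumed to return one score per sample; a high score means
    # "this looks like user-specific data, send it to the local expert".
    use_local = np.asarray(gate(x)) >= threshold
    # Start from the global expert's predictions for every sample.
    preds = np.asarray(global_expert(x)).copy()
    # Overwrite only the samples that the gate routes to the local expert.
    if use_local.any():
        preds[use_local] = np.asarray(local_expert(x[use_local]))
    return preds, use_local

# Hypothetical toy stand-ins for the three networks.
def toy_global(x):
    return (x[:, 0] > 0).astype(int)          # generic classifier: labels 0 or 1

def toy_local(x):
    return np.full(len(x), 2)                 # customized classifier: always label 2

def toy_gate(x):
    return (np.abs(x[:, 1]) > 1.0).astype(float)  # flags "user-specific" samples

x = np.array([[1.0, 0.0], [-1.0, 0.0], [1.0, 2.0]])
preds, routed = moe_predict(x, toy_global, toy_local, toy_gate)
print(preds.tolist(), routed.tolist())
```

In this sketch only the third sample is routed to the local expert, so the global expert's outputs are left untouched for the first two. A soft mixture that blends both experts' outputs by the gate score would be an equally plausible reading of the abstract.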

Taxonomy

Topics: Advanced Neural Network Applications · Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning