Generalized Operating Procedure for Deep Learning: an Unconstrained Optimal Design Perspective

Shen Chen; Mingwei Zhang; Jiamin Cui; Wei Yao

arXiv:2012.15391 · cs.LG · January 1, 2021

TL;DR

This paper introduces a seven-step generalized operating procedure for deep learning, aimed at simplifying its application for practitioners, and demonstrates its effectiveness through a multi-stream speaker verification system that outperforms baseline methods.

Contribution

The paper proposes a structured, unconstrained optimal design-based procedure for deep learning and applies it to develop a robust multi-stream speaker verification system.

Findings

01

The proposed procedure simplifies deep learning implementation.

02

Multi-stream framework reduces decision cost by 20%.

03

Experimental results confirm the effectiveness of the procedure.

Abstract

Deep learning (DL) has brought about remarkable breakthroughs in processing images, video, and speech, owing to its efficacy in extracting highly abstract representations and learning very complex functions. However, operating procedures for applying it to real use cases are seldom reported. In this paper, we address this problem by presenting a generalized operating procedure for DL from the perspective of unconstrained optimal design, motivated by a simple intention: to remove the barrier to using DL, especially for scientists and engineers who are new to it but eager to use it. Our proposed procedure contains seven steps: project/problem statement, data collection, architecture design, initialization of parameters, defining the loss function, computing optimal parameters, and inference. Following this procedure, we build a multi-stream end-to-end…
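
The seven steps named in the abstract can be illustrated with a deliberately tiny example. The sketch below is not the authors' speaker verification system: it assumes a hypothetical toy task (fitting the line y = 2x + 1 from synthetic data), a single linear unit as the "architecture", mean squared error as the loss, and plain gradient descent as the unconstrained optimizer. All function names and hyperparameters are illustrative choices, not taken from the paper.

```python
import random

# Step 1: project/problem statement (hypothetical toy task): learn y = 2x + 1.

# Step 2: data collection (synthetic data stands in for a real dataset).
def collect_data(n=50, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 for x in xs]
    return xs, ys

# Step 3: architecture design (assumed here: one linear unit, w * x + b).
def model(params, x):
    w, b = params
    return w * x + b

# Step 4: initialization of parameters.
def init_params():
    return [0.0, 0.0]

# Step 5: defining the loss function (assumed here: mean squared error).
def loss(params, xs, ys):
    return sum((model(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Step 6: computing optimal parameters by unconstrained gradient descent.
def fit(params, xs, ys, lr=0.1, steps=2000):
    w, b = params
    n = len(xs)
    for _ in range(steps):
        # Analytic gradients of the mean squared error with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]

# Step 7: inference on an unseen input.
xs, ys = collect_data()
params = fit(init_params(), xs, ys)
prediction = model(params, 0.5)
```

In a real project each step would be far larger (a curated dataset, a deep network, a task-specific loss), but the order of the steps would stay the same.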

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing