Generalized Operating Procedure for Deep Learning: an Unconstrained Optimal Design Perspective
Shen Chen, Mingwei Zhang, Jiamin Cui, Wei Yao

TL;DR
This paper introduces a seven-step generalized operating procedure for deep learning, aimed at simplifying its application for practitioners, and demonstrates its effectiveness through a multi-stream speaker verification system that outperforms baseline methods.
Contribution
The paper proposes a structured, unconstrained optimal design-based procedure for deep learning and applies it to develop a robust multi-stream speaker verification system.
Findings
The proposed procedure simplifies deep learning implementation.
The multi-stream framework reduces decision cost by 20%.
Experimental results confirm the effectiveness of the procedure.
Abstract
Deep learning (DL) has brought about remarkable breakthroughs in processing images, video, and speech owing to its efficacy in extracting highly abstract representations and learning very complex functions. However, operating procedures for applying it to real use cases are seldom reported. In this paper, we address this problem by presenting a generalized operating procedure for DL from the perspective of unconstrained optimal design, motivated by the simple intention of removing the barrier to using DL, especially for scientists and engineers who are new to it but eager to use it. The proposed procedure contains seven steps: project/problem statement, data collection, architecture design, initialization of parameters, defining the loss function, computing optimal parameters, and inference. Following this procedure, we build a multi-stream end-to-end…
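The seven steps named in the abstract can be illustrated with a toy example. The sketch below is not the authors' speaker verification system; it assumes a hypothetical 2-D classification task, a single logistic unit, and plain gradient descent, and all function names (`collect_data`, `model`, `init_params`, `loss`, `fit`, `predict`) are illustrative choices, not taken from the paper.

```python
import math
import random

# Step 1: project/problem statement (toy assumption, not from the paper):
# classify a 2-D point x by whether x[0] + x[1] > 1.

def collect_data(n, rng):
    """Step 2: data collection (synthetic samples stand in for real data)."""
    data = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = 1.0 if x[0] + x[1] > 1.0 else 0.0
        data.append((x, y))
    return data

def model(params, x):
    """Step 3: architecture design (assumed: one logistic unit)."""
    w0, w1, b = params
    z = w0 * x[0] + w1 * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

def init_params(rng):
    """Step 4: initialization of parameters (small random values)."""
    return [rng.uniform(-0.1, 0.1) for _ in range(3)]

def loss(params, data):
    """Step 5: loss function (mean binary cross-entropy)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in data:
        p = model(params, x)
        total -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return total / len(data)

def fit(params, data, lr=0.5, epochs=500):
    """Step 6: compute optimal parameters by unconstrained gradient descent."""
    params = list(params)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for x, y in data:
            err = model(params, x) - y  # d(loss)/dz for cross-entropy
            grad[0] += err * x[0]
            grad[1] += err * x[1]
            grad[2] += err
        params = [p - lr * g / n for p, g in zip(params, grad)]
    return params

def predict(params, x):
    """Step 7: inference (threshold the model output at 0.5)."""
    return 1 if model(params, x) >= 0.5 else 0

rng = random.Random(0)
train = collect_data(200, rng)
theta0 = init_params(rng)
theta = fit(theta0, train)
```

Each step maps to one function, so a real project would swap in its own data source, architecture, and loss while leaving the overall flow unchanged. The optimization is unconstrained in the sense that the parameters are updated freely, with no projection or constraint handling.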
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
