Component-based Sketching for Deep ReLU Nets
Di Wang, Shao-Bo Lin, Deyu Meng, Feilong Cao

TL;DR
This paper introduces a component-based sketching method for deep ReLU networks that improves generalization and reduces training costs by recasting training as a linear empirical risk minimization problem over a basis built from deep net components.
Contribution
The paper develops a new sketching scheme based on deep net components, which linearizes training and comes with near-optimal approximation and generalization guarantees.
Findings
Achieves almost optimal approximation rates for shallow nets.
Provides near-optimal generalization error bounds.
Demonstrates superior performance and reduced training costs in experiments.
Abstract
Deep learning has made profound impacts in the domains of data mining and AI, distinguished by the groundbreaking achievements in numerous real-world applications and the innovative algorithm design philosophy. However, it suffers from the inconsistency issue between optimization and generalization, as achieving good generalization, guided by the bias-variance trade-off principle, favors under-parameterized networks, whereas ensuring effective convergence of gradient-based algorithms demands over-parameterized networks. To address this issue, we develop a novel sketching scheme based on deep net components for various tasks. Specifically, we use deep net components with specific efficacy to build a sketching basis that embodies the advantages of deep networks. Subsequently, we transform deep net training into a linear empirical risk minimization problem based on the constructed basis,…
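The idea of replacing deep net training with linear empirical risk minimization over a fixed basis can be illustrated with a minimal sketch. It does not reproduce the paper's component construction, which the abstract above does not specify: as a loudly labeled assumption, the basis here is a set of randomly drawn, frozen ReLU units, and the names `build_basis` and `fit_linear_erm` are hypothetical. Only the outer coefficients are fitted, so the problem is an ordinary least-squares solve rather than a gradient-based search.

```python
import numpy as np


def relu(z):
    """Elementwise ReLU activation."""
    return np.maximum(z, 0.0)


def build_basis(x, weights, biases):
    """Evaluate m frozen ReLU units on inputs x of shape (n, d).

    ASSUMPTION: random frozen units stand in for the paper's
    component-based sketching basis, which is not given in the abstract.
    Returns a feature matrix of shape (n, m).
    """
    return relu(x @ weights + biases)


def fit_linear_erm(features, y):
    """Solve min_c ||features @ c - y||^2 by linear least squares."""
    coef, *_ = np.linalg.lstsq(features, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n, d, m = 200, 1, 50

# Toy regression data: y = |x| on [-1, 1] (illustrative target only).
x = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.abs(x[:, 0])

# Inner parameters are drawn once and never trained.
weights = rng.normal(size=(d, m))
biases = rng.uniform(-1.0, 1.0, size=m)

features = build_basis(x, weights, biases)
coef = fit_linear_erm(features, y)

pred = features @ coef
mse = float(np.mean((pred - y) ** 2))
print(f"training MSE: {mse:.2e}")
```

Because the inner weights are fixed, the empirical risk is a convex quadratic in `coef`, which is the sense in which training is "linear" in this sketch. The quality of the fit then depends entirely on how the basis is chosen, which is the part the paper's component-based construction addresses.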
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques
