Component-based Sketching for Deep ReLU Nets

Di Wang; Shao-Bo Lin; Deyu Meng; Feilong Cao

arXiv:2409.14174·cs.LG·September 24, 2024

TL;DR

This paper introduces a component-based sketching method for deep ReLU networks that improves generalization and reduces training costs by transforming deep net training into a linear empirical risk minimization problem over a constructed basis.

Contribution

It develops a new sketching scheme based on deep net components, enabling linearization of training and providing near-optimal approximation and generalization guarantees.

Findings

01. Achieves almost optimal approximation rates for shallow nets.

02. Provides near-optimal generalization error bounds.

03. Demonstrates superior performance and reduced training costs in experiments.

Abstract

Deep learning has made profound impacts in the domains of data mining and AI, distinguished by the groundbreaking achievements in numerous real-world applications and the innovative algorithm design philosophy. However, it suffers from the inconsistency issue between optimization and generalization, as achieving good generalization, guided by the bias-variance trade-off principle, favors under-parameterized networks, whereas ensuring effective convergence of gradient-based algorithms demands over-parameterized networks. To address this issue, we develop a novel sketching scheme based on deep net components for various tasks. Specifically, we use deep net components with specific efficacy to build a sketching basis that embodies the advantages of deep networks. Subsequently, we transform deep net training into a linear empirical risk minimization problem based on the constructed basis,…
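To make the idea of "linear empirical risk minimization over a constructed basis" concrete, here is a minimal sketch in Python. It is not the authors' construction: the paper builds its basis from deep net components, whereas this illustration assumes a stand-in basis of randomly drawn, frozen ReLU features. The names `relu_basis` and `fit_linear_erm`, the feature count, and the toy target are all hypothetical choices made for the example.

```python
import numpy as np

def relu_basis(X, W, b):
    """Evaluate fixed ReLU features max(0, X W + b). W and b are never trained."""
    return np.maximum(X @ W + b, 0.0)

def fit_linear_erm(Phi, y):
    """Least-squares coefficients over a fixed basis (minimum-norm solution)."""
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return coef

rng = np.random.default_rng(0)

# Toy regression data (assumption: 1-D inputs, smooth target).
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(np.pi * X[:, 0])

# Stand-in basis: 50 random ReLU units, frozen after sampling.
W = rng.normal(size=(1, 50))
b = rng.uniform(-1.0, 1.0, size=50)

Phi = relu_basis(X, W, b)      # design matrix, shape (200, 50)
coef = fit_linear_erm(Phi, y)  # the only trained parameters
train_mse = float(np.mean((Phi @ coef - y) ** 2))
```

Because the basis is frozen, the fit reduces to a convex least-squares problem in `coef`, so no gradient-based training of the inner weights is needed. That is the general shape of the linearization the abstract describes, under the stated assumptions.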

Taxonomy

Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques