Learning Combinations of Sigmoids Through Gradient Estimation
Stratis Ioannidis, Andrea Montanari

TL;DR
This paper introduces a method that learns the parameters of regression models with hidden units by estimating the gradient of the regression function at random points and clustering the estimates. The method is analyzed for linear combinations of sigmoids, where it comes with non-asymptotic sample guarantees.
Contribution
It presents a new approach for learning combinations of sigmoids by estimating and clustering gradients, and proves non-asymptotic guarantees that, to the authors' knowledge, have no comparable precedent for such models.
Findings
Estimated gradients concentrate around the parameter vectors of the hidden units
Non-asymptotic bounds on the number of required samples
Cluster centers of the estimated gradients serve as estimates of the sigmoid parameters
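One way to see why gradients carry information about the hidden units is to write out the gradient of the toy model. The parameterization below (coefficients a_i, weight vectors w_i, offsets b_i, and k hidden units) is an assumed notation for illustration and may differ from the paper's.

```latex
f(x) = \sum_{i=1}^{k} a_i \, \sigma\big(\langle w_i, x \rangle + b_i\big),
\qquad
\sigma(t) = \frac{1}{1 + e^{-t}},

\nabla f(x) = \sum_{i=1}^{k} a_i \, \sigma'\big(\langle w_i, x \rangle + b_i\big) \, w_i,
\qquad
\sigma'(t) = \sigma(t)\big(1 - \sigma(t)\big).
```

The gradient at any point is a weighted sum of the vectors w_i. At a point where a single term has a much larger weight a_i σ'(·) than the others, the gradient is nearly parallel to that w_i, which is why clustering gradients can point back to individual hidden units.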
Abstract
We develop a new approach to learn the parameters of regression models with hidden variables. In a nutshell, we estimate the gradient of the regression function at a set of random points, and cluster the estimated gradients. The centers of the clusters are used as estimates for the parameters of hidden units. We justify this approach by studying a toy model, whereby the regression function is a linear combination of sigmoids. We prove that indeed the estimated gradients concentrate around the parameter vectors of the hidden units, and provide non-asymptotic bounds on the number of required samples. To the best of our knowledge, no comparable guarantees have been proven for linear combinations of sigmoids.
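The pipeline described in the abstract (estimate gradients at random points, cluster them, read off cluster centers) can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: the regression function is evaluated without noise, gradients are approximated by central finite differences rather than from samples, only gradients with large norm are kept, clustering is a plain k-means on unit vectors, and only the directions of the weight vectors are recovered. All function names and the example weights are hypothetical.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [vi / n for vi in v]


def make_regression_function(weights, coeffs):
    """Toy model: f(x) = sum_i a_i * sigmoid(<w_i, x>) (no offsets, by assumption)."""
    def f(x):
        return sum(a * sigmoid(dot(w, x)) for w, a in zip(weights, coeffs))
    return f


def estimate_gradient(f, x, h=1e-4):
    """Central finite differences; a stand-in for a sample-based estimator."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


def cluster_directions(directions, k, iters=25):
    """Lloyd-style k-means on unit vectors with farthest-point initialization."""
    centers = [directions[0]]
    while len(centers) < k:
        # Pick the direction least aligned with every center chosen so far.
        centers.append(min(directions, key=lambda d: max(dot(d, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in directions:
            best = max(range(k), key=lambda i: dot(d, centers[i]))
            groups[best].append(d)
        for i, g in enumerate(groups):
            if g:
                mean = [sum(col) / len(g) for col in zip(*g)]
                centers[i] = normalize(mean)
    return centers


def recover_directions(f, dim, k, n_points=2000, keep_ratio=0.2, seed=0):
    """Estimate gradients at random points, keep the large ones, cluster them."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_points)]
    grads = [estimate_gradient(f, x) for x in points]
    norms = [math.sqrt(dot(g, g)) for g in grads]
    cutoff = keep_ratio * max(norms)  # heuristic filter, not from the paper
    directions = [normalize(g) for g, n in zip(grads, norms) if n >= cutoff]
    return cluster_directions(directions, k)


# Hypothetical ground truth: two steep hidden units with orthogonal weights.
true_weights = [[16.0, 12.0], [-12.0, 16.0]]
f = make_regression_function(true_weights, coeffs=[1.0, 1.0])
centers = recover_directions(f, dim=2, k=2)
for w in true_weights:
    best = max(dot(normalize(w), c) for c in centers)
    print(f"weight direction {normalize(w)}: best cosine similarity {best:.3f}")
```

The norm filter reflects a design choice in this sketch: far from every unit's transition region the gradient is tiny and its direction is uninformative, so only points where some unit is active contribute to the clusters. Points where several units are active at once yield mixed directions, so the cluster centers only approximate the true directions.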
Taxonomy
Topics: Machine Learning and Algorithms · Bayesian Methods and Mixture Models · Statistical Methods and Inference
