Learning with Spectral Kernels and Heavy-Tailed Data
Michael W. Mahoney, Hariharan Narayanan

TL;DR
This paper develops distribution-dependent learning methods for binary classification on heavy-tailed data using spectral kernels, providing dimension-independent sample complexity bounds and extending to Banach space norms.
Contribution
The paper introduces new sample complexity bounds for maximum margin classifiers, both with heavy-tailed features and with diffusion kernels, and generalizes the margin-based analysis to Banach spaces.
Findings
A bound on the annealed entropy of gap-tolerant classifiers in Hilbert spaces.
Sample complexity bounds for heavy-tailed feature vectors.
Extension of margin-based analysis to Banach spaces.
Abstract
Two ubiquitous aspects of large-scale data analysis are that the data often have heavy-tailed properties and that diffusion-based or spectral-based methods are often used to identify and extract structure of interest. Perhaps surprisingly, popular distribution-independent methods such as those based on the VC dimension fail to provide nontrivial results for even simple learning problems such as binary classification in these two settings. In this paper, we develop distribution-dependent learning methods that can be used to provide dimension-independent sample complexity bounds for the binary classification problem in these two popular settings. In particular, we provide bounds on the sample complexity of maximum margin classifiers when the magnitude of the entries in the feature vector decays according to a power law and also when learning is performed with the so-called Diffusion Maps…
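As a rough illustration of why a power-law decay assumption makes dimension-independent bounds plausible, the sketch below assumes the i-th feature entry satisfies |x_i| <= C * i^(-alpha) with alpha > 1/2. The abstract does not state the exact condition, so this form, the constant C (called scale here), and the function names are assumptions made for illustration. Under that assumption the Euclidean norm of the feature vector stays below a constant no matter how many coordinates are added, and classical margin-based arguments typically depend on the ratio of such a radius to the margin rather than on the dimension itself.

```python
import math


def power_law_norm_bound(dim: int, alpha: float, scale: float = 1.0) -> float:
    """Largest Euclidean norm of a dim-dimensional vector whose i-th entry
    obeys |x_i| <= scale * i**(-alpha) (assumed decay form, 1-indexed)."""
    return scale * math.sqrt(sum(i ** (-2.0 * alpha) for i in range(1, dim + 1)))


def limiting_bound(alpha: float, scale: float = 1.0) -> float:
    """Dimension-free upper bound on the norm, from the integral comparison
    sum_{i>=1} i**(-2*alpha) <= 1 + 1/(2*alpha - 1), valid for alpha > 1/2."""
    if alpha <= 0.5:
        raise ValueError("the series diverges unless alpha > 1/2")
    return scale * math.sqrt(2.0 * alpha / (2.0 * alpha - 1.0))


# The norm bound grows with the dimension but never exceeds the limiting value.
for dim in (10, 1_000, 100_000):
    print(dim, power_law_norm_bound(dim, alpha=1.0), limiting_bound(alpha=1.0))
```

This only shows that the data radius is bounded independently of the dimension. It does not reproduce the paper's sample complexity bounds, which are not given in the truncated abstract.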
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications · Anomaly Detection Techniques and Applications
