Near-optimal learning with average Hölder smoothness
Steve Hanneke, Aryeh Kontorovich, Guy Kornowski

TL;DR
This paper introduces a generalized measure of average Hölder smoothness that adapts to the underlying data distribution, leading to improved risk bounds and nearly optimal learning algorithms in regression tasks.
Contribution
It extends average Lipschitz smoothness to Hölder smoothness, providing tight risk bounds and new algorithms that leverage the average smoothness for better learning guarantees.
Findings
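Upper risk bounds in terms of average Hölder smoothness improve on previously known rates, even in the special case of average Lipschitz smoothness.
The lower bound is tight up to log factors in the realizable setting, which establishes the minimax rate.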
Algorithms achieve near-optimal rates under average smoothness.
Abstract
We generalize the notion of average Lipschitz smoothness proposed by Ashlagi et al. (COLT 2021) by extending it to Hölder smoothness. This measure of the "effective smoothness" of a function is sensitive to the underlying distribution and can be dramatically smaller than its classic "worst-case" Hölder constant. We consider both the realizable and the agnostic (noisy) regression settings, proving upper and lower risk bounds in terms of the average Hölder smoothness; these rates improve upon both previously known rates even in the special case of average Lipschitz smoothness. Moreover, our lower bound is tight in the realizable setting up to log factors, thus we establish the minimax rate. From an algorithmic perspective, since our notion of average smoothness is defined with respect to the unknown underlying distribution, the learner does not have an explicit representation of the…
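To make the gap between "average" and "worst-case" smoothness concrete, here is a minimal Python sketch. It rests on an assumption that is not spelled out in the text above: that the pointwise Hölder slope of f at x with exponent β is the supremum over y ≠ x of |f(x) - f(y)| / |x - y|^β, and that the average smoothness is the expectation of this slope under the data distribution. The function names, the ramp function, and the uniform grid are all hypothetical and chosen only for illustration. The maximum is taken over sample pairs, so it is only an empirical proxy for the true supremum, and this is not the paper's algorithm.

```python
def pointwise_holder_slopes(xs, f, beta):
    """For each sample point x_i, return the largest ratio
    |f(x_i) - f(x_j)| / |x_i - x_j|**beta over the other sample points.
    This is an empirical stand-in for the (assumed) pointwise Holder slope."""
    slopes = []
    for i, x in enumerate(xs):
        best = 0.0
        for j, y in enumerate(xs):
            if i != j and x != y:
                best = max(best, abs(f(x) - f(y)) / abs(x - y) ** beta)
        slopes.append(best)
    return slopes


def average_and_worst(xs, f, beta):
    """Return (mean of pointwise slopes, max of pointwise slopes)."""
    slopes = pointwise_holder_slopes(xs, f, beta)
    return sum(slopes) / len(slopes), max(slopes)


def ramp(x):
    """Hypothetical target: flat everywhere except a steep ramp on [0.5, 0.51]."""
    return min(1.0, max(0.0, (x - 0.5) * 100.0))


# Uniform grid on [0, 1]; only two of the 101 points touch the steep region.
grid = [i / 100 for i in range(101)]
average, worst = average_and_worst(grid, ramp, beta=1.0)
print(f"average slope: {average:.2f}, worst-case slope: {worst:.2f}")
```

With β = 1 this reduces to the average Lipschitz case. On this grid the worst-case slope is driven entirely by the pair of points straddling the ramp, while most points see a much smaller slope, so the average comes out far below the maximum.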
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Statistical Methods and Inference
