Beyond Cross-Validation: Adaptive Parameter Selection for Kernel-Based Gradient Descents
Xiaotong Liu, Yunwen Lei, Xiangyu Chang, and Shao-Bo Lin

TL;DR
This paper introduces an adaptive parameter selection method for kernel-based gradient descent algorithms. It combines bias-variance analysis with the empirical effective dimension to achieve the optimal generalization error bound while adapting to different kernels, target functions, and error metrics.
Contribution
It presents a novel, implementable adaptive parameter selection strategy for kernel-based gradient descent (KGD), with theoretical guarantees derived via the integral operator approach.
Findings
KGD with the proposed strategy achieves the optimal generalization error bound.
The strategy adapts to different kernels, target functions, and error metrics.
It shows significant advantages over existing parameter selection methods for KGD.
Abstract
This paper proposes a novel parameter selection strategy for kernel-based gradient descent (KGD) algorithms, integrating bias-variance analysis with the splitting method. We introduce the concept of empirical effective dimension to quantify iteration increments in KGD, deriving an adaptive parameter selection strategy that is implementable. Theoretical verifications are provided within the framework of learning theory. Utilizing the recently developed integral operator approach, we rigorously demonstrate that KGD, equipped with the proposed adaptive parameter selection strategy, achieves the optimal generalization error bound and adapts effectively to different kernels, target functions, and error metrics. Consequently, this strategy showcases significant advantages over existing parameter selection methods for KGD.
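To make the two objects named in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes the common textbook forms: KGD for least squares updates the coefficient vector as alpha <- alpha - (step/n)(K alpha - y), and the empirical effective dimension is the trace of (K/n)(K/n + lam I)^{-1}. The paper's own definitions and its adaptive stopping rule may differ and are not reproduced here. The names gaussian_kernel, kgd_path, and empirical_effective_dimension are hypothetical.

```python
import numpy as np


def gaussian_kernel(X, Z, width=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Z."""
    sq = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * width**2))


def kgd_path(K, y, step, n_iters):
    """Kernel gradient descent for least squares (assumed textbook form).

    Returns the coefficient vectors alpha_0, ..., alpha_{n_iters}; the
    estimator after t steps is f_t(x) = sum_i alpha_t[i] * k(x_i, x).
    """
    n = len(y)
    alpha = np.zeros(n)
    path = [alpha.copy()]
    for _ in range(n_iters):
        alpha = alpha - (step / n) * (K @ alpha - y)
        path.append(alpha.copy())
    return path


def empirical_effective_dimension(K, lam):
    """Trace of (K/n)(K/n + lam*I)^{-1}, computed from eigenvalues of K/n.

    This is the standard definition, assumed here for illustration.
    """
    n = K.shape[0]
    eigvals = np.clip(np.linalg.eigvalsh(K / n), 0.0, None)
    return float(np.sum(eigvals / (eigvals + lam)))


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(50)

K = gaussian_kernel(X, X, width=0.5)
path = kgd_path(K, y, step=1.0, n_iters=200)

# The iteration count t plays the role of the tuning parameter; a common
# heuristic (an assumption here) relates it to lam = 1 / (step * t).
for t in (1, 10, 100):
    lam = 1.0 / (1.0 * t)
    print(t, empirical_effective_dimension(K, lam))
```

In this sketch the number of iterations is the parameter to be selected: stopping early gives a smoother, higher-bias fit, while running longer reduces training error and increases variance. The effective dimension is computed from the data alone, which is what makes a rule based on it implementable without knowing the target function.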
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Face and Expression Recognition · Machine Learning and ELM
