A Distributional View of High Dimensional Optimization
Felix Benning

TL;DR
This thesis explores a distributional approach to high-dimensional optimization, offering new insights into Bayesian optimization, gradient descent, and the role of data assumptions in machine learning landscapes.
Contribution
It introduces a distributional framework for optimization, providing mathematical tools and insights into high-dimensional problems, Bayesian methods, and data-driven objective functions.
Findings
The distributional view may explain the predictable progress of optimization in high dimension
Insights into optimal step size control for gradient descent
Analysis of random objective functions in machine learning
Abstract
This PhD thesis presents a distributional view of optimization in place of a worst-case perspective. We motivate this view with an investigation of the failure point of classical optimization. Subsequently, we consider the optimization of a randomly drawn objective function. This is the setting of Bayesian optimization. After a review of Bayesian optimization, we outline how such a distributional view may explain predictable progress of optimization in high dimension. It further turns out that this distributional view provides insights into optimal step size control of gradient descent. To enable these results, we develop mathematical tools to deal with random input to random functions and a characterization of non-stationary isotropic covariance kernels. Finally, we outline how assumptions about the data, specifically exchangeability, can lead to random objective functions in machine…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research
