Importance Sampling Optimization with Laplace Principle
Radu-Alexandru Dragomir (S2A, IDS), François Portier (ENSAI, CREST), Victor Priser (S2A, IDS)

TL;DR
This paper introduces an importance sampling-based refinement for hyperparameter tuning that improves over traditional grid and random search by averaging configurations, with theoretical and practical advantages.
Contribution
It proposes a Laplace principle-inspired importance sampling scheme for hyperparameter optimization, enhancing existing search methods without extra evaluations.
Findings
Error rate after n evaluations is smaller than n^(-2/(d+2)) in non-convex settings.
The method outperforms random and grid search rates when dimension d > 2.
Practical benefits demonstrated on several examples.
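The d > 2 threshold in the findings can be checked by comparing exponents. The short derivation below assumes that the baseline rate for grid and random search is of order n^(-1/d); that baseline is an assumption brought in for illustration and is not stated in this summary. It holds for n > 1.

```latex
\[
n^{-2/(d+2)} < n^{-1/d}
\;\iff\; \frac{2}{d+2} > \frac{1}{d}
\;\iff\; 2d > d + 2
\;\iff\; d > 2 .
\]
```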
Abstract
Grid search and random search are widely used techniques for hyperparameter tuning in machine learning, especially when gradient information is unavailable. In these methods, a finite set of candidate configurations is evaluated, and the best-performing one is selected. We propose a simple and computationally inexpensive refinement of this paradigm: instead of selecting a single best point, we form a weighted average of the evaluated configurations, where the weights are chosen using an importance sampling scheme inspired by the Laplace principle. This scheme can be implemented as a post-processing step on top of a random search, with no additional function evaluations. We also propose an iterative variant, where the sampling distributions are chosen adaptively to generate new candidate points around the previous estimate, in the spirit of Evolution Strategy (ES) methods. In a general…
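The post-processing step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exponential weights exp(-f(x)/temperature), the name `laplace_weighted_average`, the `temperature` parameter and the optional `log_proposal` correction are all assumptions chosen to reflect the general idea of Laplace-principle importance weights, and the toy quadratic objective is hypothetical.

```python
import numpy as np


def laplace_weighted_average(points, values, temperature=0.1, log_proposal=None):
    """Weighted average of already-evaluated configurations.

    points: (n, d) array of candidate configurations.
    values: (n,) array of objective values (lower is better).
    temperature: hypothetical smoothing parameter; as it shrinks, the
        average concentrates on the single best point.
    log_proposal: optional (n,) array of log sampling densities; leave as
        None when the points were drawn uniformly (assumption).
    """
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    log_weights = -values / temperature
    if log_proposal is not None:
        # Importance-sampling correction for non-uniform proposals.
        log_weights = log_weights - np.asarray(log_proposal, dtype=float)
    # Shift by the maximum before exponentiating for numerical stability.
    weights = np.exp(log_weights - log_weights.max())
    weights /= weights.sum()
    return weights @ points


# Toy usage: a random search on a hypothetical quadratic objective.
rng = np.random.default_rng(0)
d, n = 5, 2000
target = np.full(d, 0.3)


def objective(x):
    return np.sum((x - target) ** 2, axis=-1)


points = rng.uniform(-1.0, 1.0, size=(n, d))
values = objective(points)

best_point = points[np.argmin(values)]  # plain random search output
averaged = laplace_weighted_average(points, values, temperature=0.05)

print("random search error:", np.linalg.norm(best_point - target))
print("weighted average error:", np.linalg.norm(averaged - target))
```

The function reuses the `n` evaluations already collected by the random search, so it adds no calls to the objective. How the temperature should be chosen as a function of `n` and `d` is not given in this summary, so the value used above is arbitrary.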
