OptiRoulette Optimizer: A New Stochastic Meta-Optimizer for up to 5.3x Faster Convergence
Stamatis Mastromichalakis

TL;DR
OptiRoulette is a novel stochastic meta-optimizer that dynamically selects update rules during training, significantly improving convergence speed and reliability across multiple image classification benchmarks.
Contribution
It introduces a new meta-optimizer that combines optimizer pooling, learning-rate scaling, and failure-aware replacement, implemented as a drop-in component compatible with PyTorch.
Findings
Achieves up to 9.74 percentage points higher accuracy than AdamW baseline.
Ensures convergence to high accuracy targets in all runs, unlike baseline.
Reduces time-to-target significantly on several datasets.
Abstract
This paper presents OptiRoulette, a stochastic meta-optimizer that selects update rules during training instead of fixing a single optimizer. The method combines warmup optimizer locking, random sampling from an active optimizer pool, compatibility-aware learning-rate scaling during optimizer transitions, and failure-aware pool replacement. OptiRoulette is implemented as a drop-in, "torch.optim.Optimizer-compatible" component and packaged for pip installation. We report completed 10-seed results on five image-classification suites: CIFAR-100, CIFAR-100-C, SVHN, Tiny ImageNet, and Caltech-256. Against a single-optimizer AdamW baseline, OptiRoulette improves mean test accuracy from 0.6734 to 0.7656 on CIFAR-100 (+9.22 percentage points), 0.2904 to 0.3355 on CIFAR-100-C (+4.52), 0.9667 to 0.9756 on SVHN (+0.89), 0.5669 to 0.6642 on Tiny ImageNet (+9.73), and 0.5946 to 0.6920 on Caltech-256…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques
