Explicit and Non-asymptotic Query Complexities of Rank-Based Zeroth-order Algorithm on Stochastic Smooth Functions
Haishan Ye

TL;DR
This paper introduces a simple, computationally efficient rank-based zeroth-order optimization algorithm for stochastic smooth functions. It proves explicit non-asymptotic query complexity bounds that match those of value-based methods, showing that ordinal feedback alone is enough to reach the same query efficiency.
Contribution
The paper presents the first non-asymptotic analysis of rank-based ZO algorithms for stochastic smooth objectives, with query complexity bounds that match those of value-based methods.
Findings
Query complexity bounds match the best-known value-based algorithms.
Ordinal feedback suffices for optimal query efficiency in stochastic optimization.
New analytical tools are developed beyond existing drift and information-geometric techniques.
Abstract
Zeroth-order (ZO) optimization with ordinal feedback has emerged as a fundamental problem in modern machine learning systems, particularly in human-in-the-loop settings such as reinforcement learning from human feedback, preference learning, and evolutionary strategies. While rank-based ZO algorithms enjoy strong empirical success and robustness properties, their theoretical understanding, especially under stochastic objectives and standard smoothness assumptions, remains limited. In this paper, we study rank-based zeroth-order optimization for stochastic functions where only ordinal feedback of the stochastic function is available. We propose a simple and computationally efficient rank-based ZO algorithm. Under standard assumptions including smoothness, strong convexity, and bounded second moments of stochastic gradients, we establish explicit non-asymptotic query complexity bounds for…
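To make the setting concrete, here is a minimal Python sketch of a generic rank-based zeroth-order step. It is not the paper's algorithm, whose details are not given in the abstract. Everything in it is an assumption made for illustration: the names `rank_oracle`, `rank_based_zo_step`, and `run_demo`, the centered rank weights, the Gaussian search directions, the step sizes, and the noisy quadratic test objective. The key property it illustrates is that the optimizer only ever sees an ordering of the query points, never their function values.

```python
import math
import random


def rank_oracle(points, noisy_f):
    """Hypothetical ordinal oracle: indices ordered best (lowest) to worst.

    The optimizer sees only this ordering, never the function values.
    """
    values = [noisy_f(p) for p in points]
    return sorted(range(len(points)), key=lambda i: values[i])


def rank_based_zo_step(x, noisy_f, rng, num_dirs=8, alpha=0.1, eta=0.05):
    """One illustrative rank-weighted step (an assumption, not the paper's update)."""
    dim = len(x)
    # Sample Gaussian search directions and the perturbed query points.
    dirs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_dirs)]
    points = [[xj + alpha * uj for xj, uj in zip(x, u)] for u in dirs]
    order = rank_oracle(points, noisy_f)
    # Centered rank weights in [-1, 1]: best point gets -1, worst gets +1.
    weights = [0.0] * num_dirs
    for rank, idx in enumerate(order):
        weights[idx] = 2.0 * rank / (num_dirs - 1) - 1.0
    # The weighted average of directions points uphill on average.
    ascent = [sum(w * u[j] for w, u in zip(weights, dirs)) / num_dirs
              for j in range(dim)]
    # Move against the estimated ascent direction.
    return [xj - eta * gj for xj, gj in zip(x, ascent)]


def run_demo(steps=300, seed=0):
    """Run the sketch on a noisy quadratic; return (initial, final) distance to 0."""
    rng = random.Random(seed)

    def noisy_quadratic(p):
        # Smooth, strongly convex objective plus small Gaussian noise.
        return 0.5 * sum(v * v for v in p) + rng.gauss(0.0, 0.01)

    x = [2.0, -1.5, 1.0]
    start = math.sqrt(sum(v * v for v in x))
    for _ in range(steps):
        x = rank_based_zo_step(x, noisy_quadratic, rng)
    return start, math.sqrt(sum(v * v for v in x))
```

Calling `run_demo()` returns the distance to the minimizer before and after the run. Because the update uses no magnitude information, a fixed step size like this one only reaches a neighborhood of the optimum. The non-asymptotic bounds described in the abstract concern how many such ranking queries are needed, which this sketch does not attempt to reproduce.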
Taxonomy
Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Reinforcement Learning in Robotics
