Tiering as a Stochastic Submodular Optimization Problem
Hyokun Yun, Michael Froh, Roshan Makhijani, Brian Luc, Alex Smola, Trishul Chilimbi

TL;DR
This paper formulates document tiering in information retrieval as a stochastic submodular optimization problem, so that the choice of documents for high-priority tiers generalizes to future traffic rather than only to the queries seen in the history.
Contribution
The paper introduces a stochastic optimization framework for tiering, shows that the problem can be cast as stochastic submodular optimization with a submodular knapsack constraint, and develops efficient algorithms that exploit this connection.
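As a rough illustration of what such a formulation can look like, the block below writes a generic stochastic submodular maximization problem with a knapsack-style constraint, followed by a regularized empirical version in the spirit of empirical risk minimization. All notation is assumed for illustration and is not taken from the paper: V is a candidate set, S the selection placed in the high-priority tier, D the query distribution, f_q a per-query submodular utility, g a submodular cost, B a budget, and Omega a regularizer with weight lambda.

```latex
% Population objective: expected utility over future queries (assumed notation)
\max_{S \subseteq V} \; \mathbb{E}_{q \sim \mathcal{D}} \left[ f_q(S) \right]
\quad \text{subject to} \quad g(S) \le B

% Regularized empirical counterpart on n historical queries q_1, \dots, q_n
\max_{S \subseteq V} \; \frac{1}{n} \sum_{i=1}^{n} f_{q_i}(S) \; - \; \lambda \, \Omega(S)
\quad \text{subject to} \quad g(S) \le B
```

The first problem targets performance on future traffic. The second replaces the expectation with an average over historical queries and adds a penalty to discourage overfitting to that history.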
Findings
Formulates tiering as a stochastic submodular optimization problem.
Develops algorithms leveraging submodular properties for efficient optimization.
Demonstrates improved generalization performance over static methods.
Abstract
Tiering is an essential technique for building large-scale information retrieval systems. While the selection of documents for high priority tiers critically impacts the efficiency of tiering, past work focuses on optimizing it with respect to a static set of queries in the history, and generalizes poorly to the future traffic. Instead, we formulate the optimal tiering as a stochastic optimization problem, and follow the methodology of regularized empirical risk minimization to maximize the generalization performance of the system. We also show that the optimization problem can be cast as a stochastic submodular optimization problem with a submodular knapsack constraint, and we develop efficient optimization algorithms by leveraging this connection.
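To make the link between tiering and submodular optimization more concrete, here is a minimal sketch of a cost-benefit greedy heuristic on a toy model. It is not the authors' algorithm and carries no approximation guarantee here. The toy model is an assumption: each hypothetical candidate covers a set of historical queries and pulls a set of documents into the tier, the objective is the number of covered queries, and the cost is the number of distinct documents in the tier. Both are coverage functions, hence submodular. The names `greedy_tier`, `candidates`, and `budget` are illustrative.

```python
from typing import Dict, List, Set, Tuple


def greedy_tier(
    candidates: Dict[str, Tuple[Set[str], Set[str]]],
    budget: int,
) -> List[str]:
    """Greedy selection by marginal gain per marginal cost.

    candidates maps a candidate name to (queries it covers, documents it needs).
    budget is the maximum number of distinct documents allowed in the tier.
    This is an illustrative heuristic under assumed definitions.
    """
    chosen: List[str] = []
    covered_queries: Set[str] = set()
    tier_docs: Set[str] = set()
    remaining = set(candidates)

    while remaining:
        best_name = None
        best_ratio = 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for name in sorted(remaining):
            queries, docs = candidates[name]
            gain = len(queries - covered_queries)  # newly covered queries
            new_docs = len(docs - tier_docs)       # documents not yet in the tier
            if gain == 0 or len(tier_docs) + new_docs > budget:
                continue  # no benefit, or it would exceed the budget
            ratio = float("inf") if new_docs == 0 else gain / new_docs
            if ratio > best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            break  # nothing feasible improves coverage
        queries, docs = candidates[best_name]
        chosen.append(best_name)
        covered_queries |= queries
        tier_docs |= docs
        remaining.remove(best_name)

    return chosen


# Toy data (made up for illustration only).
toy_candidates = {
    "a": ({"q1", "q2"}, {"d1", "d2"}),
    "b": ({"q2", "q3"}, {"d2", "d3"}),
    "c": ({"q4"}, {"d4", "d5", "d6"}),
}
selection = greedy_tier(toy_candidates, budget=3)
```

On this toy input the heuristic picks "a" and then "b": after "a" is chosen, "b" adds one new query for one new document because the two share "d2", while "c" no longer fits in the budget. Shared documents are what make the cost submodular rather than additive.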
Taxonomy
Topics: Complexity and Algorithms in Graphs · Algorithms and Data Compression · Optimization and Search Problems
