A Multi-Plane Block-Coordinate Frank-Wolfe Algorithm for Training Structural SVMs with a Costly max-Oracle
Neel Shah, Vladimir Kolmogorov, Christoph H. Lampert

TL;DR
This paper introduces a training algorithm for structural SVMs that combines the stochastic Block-Coordinate Frank-Wolfe method with hyperplane caching, so that fewer calls to the max-oracle are needed. This leads to faster convergence, especially when the max-oracle is expensive.
Contribution
It proposes a novel algorithm combining stochastic block-coordinate Frank-Wolfe with hyperplane caching and an adaptive oracle call strategy for efficient SSVM training.
Findings
Faster convergence with fewer oracle calls.
Reduced total runtime when the max-oracle is slow.
Effective in computer vision tasks with costly max-oracles.
Abstract
Structural support vector machines (SSVMs) are amongst the best performing models for structured computer vision tasks, such as semantic image segmentation or human pose estimation. Training SSVMs, however, is computationally costly, because it requires repeated calls to a structured prediction subroutine (called the max-oracle), which itself has to solve an optimization problem, e.g. a graph cut. In this work, we introduce a new algorithm for SSVM training that is more efficient than earlier techniques when the max-oracle is computationally expensive, as is frequently the case in computer vision tasks. The main idea is to (i) combine the recent stochastic Block-Coordinate Frank-Wolfe algorithm with efficient hyperplane caching, and (ii) use an automatic selection rule for deciding whether to call the exact max-oracle or to rely on an approximate one based on the cached…
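The two ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy multiclass problem whose exhaustive label search stands in for a costly max-oracle such as a graph cut, and it replaces the paper's automatic selection rule (not recoverable from the truncated abstract) with a fixed schedule of one exact pass followed by a few cached passes. All function names, the cache size, and the schedule parameters are hypothetical.

```python
import numpy as np

def joint_feature(x, y, num_classes):
    # Toy joint feature map: x copied into the block belonging to label y.
    phi = np.zeros(num_classes * x.size)
    phi[y * x.size:(y + 1) * x.size] = x
    return phi

def exact_oracle(w, x, y_true, num_classes, lam, n):
    # Stand-in for the costly max-oracle: loss-augmented search over all labels.
    # Returns the maximizing plane (w_s, l_s) in scaled dual coordinates.
    best_score, best_plane = -np.inf, None
    for y in range(num_classes):
        psi = joint_feature(x, y_true, num_classes) - joint_feature(x, y, num_classes)
        plane = (psi / (lam * n), float(y != y_true) / n)
        score = plane[1] - lam * (w @ plane[0])
        if score > best_score:
            best_score, best_plane = score, plane
    return best_plane

def cached_oracle(w, cache, lam):
    # Approximate oracle: best plane among those already cached for this example.
    return max(cache, key=lambda p: p[1] - lam * (w @ p[0]))

def fw_step(w, w_i, l_i, plane, lam):
    # Block-coordinate Frank-Wolfe update with exact line search on the dual.
    w_s, l_s = plane
    diff = w_i - w_s
    denom = lam * (diff @ diff)
    if denom <= 1e-12:
        return w, w_i, l_i
    gamma = np.clip((lam * (diff @ w) - l_i + l_s) / denom, 0.0, 1.0)
    w_i_new = (1 - gamma) * w_i + gamma * w_s
    l_i_new = (1 - gamma) * l_i + gamma * l_s
    return w + w_i_new - w_i, w_i_new, l_i_new

def train(X, Y, num_classes, lam=0.1, exact_passes=3, approx_per_exact=2,
          cache_size=5, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(num_classes * d)
    w_blocks = np.zeros((n, num_classes * d))
    l_blocks = np.zeros(n)
    caches = [[] for _ in range(n)]
    duals = []  # dual objective after every step

    def step(i, plane):
        nonlocal w
        w, w_blocks[i], l_blocks[i] = fw_step(w, w_blocks[i], l_blocks[i], plane, lam)
        duals.append(-0.5 * lam * (w @ w) + l_blocks.sum())

    for _ in range(exact_passes):
        # Exact pass: call the expensive oracle and remember the returned plane.
        for i in rng.permutation(n):
            plane = exact_oracle(w, X[i], Y[i], num_classes, lam, n)
            caches[i] = (caches[i] + [plane])[-cache_size:]
            step(i, plane)
        # Approximate passes: reuse cached planes, no oracle calls.
        # Assumption: fixed schedule, NOT the paper's automatic selection rule.
        for _ in range(approx_per_exact):
            for i in rng.permutation(n):
                step(i, cached_oracle(w, caches[i], lam))
    return w, duals
```

Because every step uses a line search over a feasible plane, the recorded dual objective never decreases, whether the plane came from the exact oracle or from the cache. That property is what makes cheap cached passes safe to interleave with expensive exact ones.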
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
