Learning Dynamic Hierarchical Models for Anytime Scene Labeling
Buyu Liu, Xuming He

TL;DR
This paper introduces a dynamic hierarchical model for anytime scene labeling that balances efficiency and accuracy by adapting to test-time budgets, optimizing feature computation and inference costs.
Contribution
It presents a novel approach that formulates anytime scene parsing as a Markov Decision Process and learns adaptive policies for feature and model selection.
Findings
Achieves 90% of state-of-the-art accuracy with only 15% of the computational cost.
Demonstrates effectiveness on three semantic segmentation datasets.
Provides a flexible framework for cost-aware scene labeling.
Abstract
With increasing demand for efficient image and video analysis, test-time cost of scene parsing becomes critical for many large-scale or time-sensitive vision applications. We propose a dynamic hierarchical model for anytime scene labeling that allows us to achieve flexible trade-offs between efficiency and accuracy in pixel-level prediction. In particular, our approach incorporates the cost of feature computation and model inference, and optimizes the model performance for any given test-time budget by learning a sequence of image-adaptive hierarchical models. We formulate this anytime representation learning as a Markov Decision Process with a discrete-continuous state-action space. A high-quality policy of feature and model selection is learned based on an approximate policy iteration method with action proposal mechanism. We demonstrate the advantages of our dynamic non-myopic…
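The sequential view in the abstract can be illustrated with a toy sketch: the state is the set of features computed so far plus the cost spent, each action computes one more feature, and a prediction is available after every step. This is not the authors' method. The paper learns its policy with approximate policy iteration, whereas the sketch below swaps in a simple greedy gain-per-cost rule, and every name (`Action`, `rollout`) and every number in it is a hypothetical placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One selectable step, e.g. computing a feature (hypothetical)."""
    name: str
    cost: float  # assumed compute cost, arbitrary units
    gain: float  # assumed expected accuracy gain


def rollout(actions, budget):
    """Pick actions one at a time until no remaining action fits the budget.

    Stand-in policy: choose the affordable action with the highest
    gain/cost ratio. Returns a list of (name, cost_spent, accuracy)
    tuples, one per step, so the partial result after any step can
    be used as the current prediction quality.
    """
    remaining = list(actions)
    spent = 0.0
    accuracy = 0.0
    trajectory = []
    while True:
        affordable = [a for a in remaining if spent + a.cost <= budget]
        if not affordable:
            break
        best = max(affordable, key=lambda a: a.gain / a.cost)
        remaining.remove(best)
        spent += best.cost
        accuracy += best.gain
        trajectory.append((best.name, spent, accuracy))
    return trajectory


# Toy action set with made-up costs and gains.
EXAMPLE_ACTIONS = [
    Action("color", cost=1.0, gain=0.30),
    Action("texture", cost=2.0, gain=0.40),
    Action("region_pooling", cost=4.0, gain=0.20),
]

if __name__ == "__main__":
    for step in rollout(EXAMPLE_ACTIONS, budget=3.0):
        print(step)
```

A greedy rule like this is myopic: it only looks one step ahead. The abstract describes its learned policy as non-myopic, so the sketch shows only the budgeted step-by-step structure, not the paper's policy.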
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Advanced Neural Network Applications
