Convolutional Networks as Extremely Small Foundation Models: Visual Prompting and Theoretical Perspective
Jianqiao Wangni

TL;DR
This paper introduces a simple, theoretically grounded prompting module called SDForest that adapts generic deep networks for new tasks like video object segmentation, achieving competitive results with low computational cost.
Contribution
The paper proposes a novel, theoretically motivated prompting module, SDForest, combining nonparametric methods with deep networks for effective few-shot adaptation.
Findings
SDForest achieves real-time performance on CPU.
It attains competitive results on DAVIS2016 and DAVIS2017 datasets.
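Because the prompting modules are kept as simple as possible, the approach has low complexity and, by the paper's learning-theoretic argument, generalizes better at the same training error.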
Abstract
Compared to deep neural networks trained for specific tasks, foundational deep networks trained on generic datasets, such as ImageNet classification, benefit from larger-scale datasets, simpler network structures, and easier training techniques. In this paper, we design a prompting module that performs few-shot adaptation of generic deep networks to new tasks. Driven by learning theory, we derive prompting modules that are as simple as possible, since simpler modules generalize better under the same training error. We use video object segmentation as a case study. We give a concrete prompting module, the Semi-parametric Deep Forest (SDForest), which combines several nonparametric methods, such as the correlation filter, random forest, and image-guided filter, with a deep network trained for the ImageNet classification task. From a learning-theoretical point of view, all these models are of…
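One of the nonparametric components the abstract names is a correlation filter. As a rough illustration of why such a module is cheap to fit from a single annotated frame, the sketch below trains a single-channel filter in closed form in the Fourier domain (a MOSSE-style ridge regression). It is not the paper's SDForest implementation: the random array stands in for a feature map from a frozen ImageNet network, and the function names, the Gaussian target, and the regularizer `lam` are assumptions made for illustration.

```python
import numpy as np


def gaussian_target(shape, center, sigma=2.0):
    """Desired response: a Gaussian bump centered on the object (assumed target shape)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((ys - center[0]) ** 2 + (xs - center[1]) ** 2) / (2.0 * sigma ** 2))


def train_correlation_filter(feature, target, lam=1e-2):
    """Closed-form ridge regression in the Fourier domain (MOSSE-style).

    feature: 2-D array, a stand-in for one channel of frozen backbone features.
    target:  2-D array of the same shape, the desired response map.
    lam:     hypothetical regularization strength.
    """
    F = np.fft.fft2(feature)
    G = np.fft.fft2(target)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)


def apply_correlation_filter(filt, feature):
    """Response map of a trained filter on a new feature map."""
    return np.real(np.fft.ifft2(filt * np.fft.fft2(feature)))


def locate_peak(response):
    """Row and column of the strongest response."""
    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(row), int(col)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    first_frame = rng.standard_normal((32, 32))   # placeholder for frozen features
    filt = train_correlation_filter(first_frame, gaussian_target((32, 32), (10, 12)))

    # Simulate object motion with a circular shift of the feature map.
    next_frame = np.roll(first_frame, shift=(3, 5), axis=(0, 1))
    print(locate_peak(apply_correlation_filter(filt, next_frame)))
```

Fitting takes two FFTs and an elementwise division, with no gradient steps, which is the kind of low-cost adaptation on top of a fixed backbone that the abstract argues for. A practical module would sum over many feature channels and update the filter across frames; both are omitted here.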
Taxonomy
Topics: Topological and Geometric Data Analysis · Data Visualization and Analytics
