TAPS : Frustratingly Simple Test Time Active Learning for VLMs
Dhruv Sarkar, Aprameyo Chakrabartty, Bibhudatta Bhanja

TL;DR
This paper introduces TAPS, a simple test-time active learning framework for vision-language models. For each sample in a data stream, it decides immediately whether to query an oracle, then updates its prompts on the fly while keeping latency and memory overhead low.
Contribution
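It presents a novel Test-Time Active Learning (TTAL) framework that operates in real time on a single test sample per step. The framework combines a dynamically adjusted entropy threshold, a class-balanced memory, and distribution alignment to respect the latency and memory constraints of streaming environments.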
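To make the idea of a class-balanced memory concrete, here is a minimal sketch of one possible design. It is not taken from the paper: the class name `ClassBalancedMemory`, the `per_class` capacity, and the oldest-first replacement rule are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class ClassBalancedMemory:
    """Hypothetical buffer that stores at most `per_class` items per label.

    Assumption: when a class is full, the oldest item is dropped. The paper
    may use a different replacement rule.
    """

    def __init__(self, per_class: int):
        if per_class < 1:
            raise ValueError("per_class must be at least 1")
        self.per_class = per_class
        # One bounded queue per label; deque(maxlen=...) evicts the oldest entry.
        self._slots = defaultdict(lambda: deque(maxlen=per_class))

    def add(self, feature, label):
        """Store a labeled sample, evicting the oldest one of that class if full."""
        self._slots[label].append(feature)

    def items(self):
        """Return all stored (feature, label) pairs."""
        return [(f, y) for y, q in self._slots.items() for f in q]

    def counts(self):
        """Return the number of stored samples per label."""
        return {y: len(q) for y, q in self._slots.items()}
```

Because every class has the same cap, a frequent class cannot crowd out a rare one, and total memory is bounded by `per_class` times the number of classes seen.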
Findings
Consistent performance improvements over state-of-the-art methods.
Effective in real-world safety-critical applications.
Maintains low latency and memory overhead.
Abstract
Test-Time Optimization enables models to adapt to new data during inference by updating parameters on-the-fly. Recent advances in Vision-Language Models (VLMs) have explored learning prompts at test time to improve performance in downstream tasks. In this work, we extend this idea by addressing a more general and practical challenge: Can we effectively utilize an oracle in a continuous data stream where only one sample is available at a time, requiring an immediate query decision while respecting latency and memory constraints? To tackle this, we propose a novel Test-Time Active Learning (TTAL) framework that adaptively queries uncertain samples and updates prompts dynamically. Unlike prior methods that assume batched data or multiple gradient updates, our approach operates in a real-time streaming scenario with a single test sample per step. We introduce a dynamically adjusted entropy…
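The streaming query decision described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' algorithm: the abstract is truncated before it explains how the entropy threshold is adjusted, so the budget-tracking rule, the `StreamingQuerier` class, and the `budget` and `step` parameters below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class StreamingQuerier:
    """Hypothetical one-sample-at-a-time query rule with a moving threshold.

    Assumption: the threshold rises when the observed query rate exceeds
    `budget` and falls otherwise. The paper's actual rule is not given here.
    """

    def __init__(self, num_classes: int, budget: float = 0.1, step: float = 0.05):
        if num_classes < 2:
            raise ValueError("num_classes must be at least 2")
        self.max_entropy = math.log(num_classes)
        self.threshold = 0.5 * self.max_entropy  # arbitrary starting point
        self.budget = budget
        self.step = step
        self.seen = 0
        self.queried = 0

    def should_query(self, logits) -> bool:
        """Decide immediately whether to ask the oracle about this sample."""
        query = entropy(softmax(logits)) > self.threshold
        self.seen += 1
        self.queried += int(query)
        # Nudge the threshold toward the target query rate, then clamp it.
        delta = self.step * self.max_entropy
        if self.queried / self.seen > self.budget:
            self.threshold += delta
        else:
            self.threshold -= delta
        self.threshold = min(max(self.threshold, 0.0), self.max_entropy)
        return query
```

In a full pipeline, a sample for which `should_query` returns `True` would be sent to the oracle and the returned label used for a prompt update; that step is omitted here because the source does not describe it.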
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
