Bootstrapping MLLM for Weakly-Supervised Class-Agnostic Object Counting
Xiaowen Zhang, Zijie Yue, Yong Luo, Cairong Zhao, Qijun Chen, and Miaojing Shi

TL;DR
This paper introduces WS-COC, an MLLM-driven weakly-supervised framework for class-agnostic object counting that combines divide-and-discern dialogue tuning, compare-and-rank optimization, and multi-scale counting to match or surpass fully-supervised methods.
Contribution
The paper presents the first MLLM-driven weakly-supervised framework for class-agnostic object counting, introducing three strategies that improve counting without costly point-level annotations.
Findings
WS-COC matches or surpasses state-of-the-art fully-supervised methods.
It reduces annotation cost by avoiding the per-object point-level annotations that fully-supervised counting requires.
It remains effective in dense scenes through improved aggregation of local and global counts.
Abstract
Object counting is a fundamental task in computer vision, with broad applicability in many real-world scenarios. Fully-supervised counting methods require costly point-level annotations per object. A few weakly-supervised methods leverage only image-level object counts as supervision and achieve fairly promising results. They are, however, often limited to counting a single category, e.g., person. In this paper, we propose WS-COC, the first MLLM-driven weakly-supervised framework for class-agnostic object counting. Instead of directly fine-tuning MLLMs to predict object counts, which can be challenging due to the modality gap, we incorporate three simple yet effective strategies to bootstrap the counting paradigm in both training and testing. First, a divide-and-discern dialogue tuning strategy is proposed to guide the MLLM to determine whether the object count falls within a specific…
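The abstract is cut off before it details the divide-and-discern strategy, so the following is only an illustrative sketch of the general idea that yes/no range questions can narrow down a count. The names `narrow_count` and `make_oracle`, the upper bound, and the halving rule are all assumptions made for illustration; they are not taken from the paper, and a real MLLM would not answer as reliably as the stand-in oracle used here.

```python
from typing import Callable

# A range question takes (low, high) and answers whether the count lies in [low, high].
RangeQuestion = Callable[[int, int], bool]


def narrow_count(in_range: RangeQuestion, upper: int) -> int:
    """Locate a count in [0, upper] by repeatedly asking yes/no range questions.

    Hypothetical helper: halving the interval is an assumption of this sketch,
    not the procedure described in the paper.
    """
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        # Hypothetical dialogue turn: "Is the object count between lo and mid?"
        if in_range(lo, mid):
            hi = mid          # count is in the lower half
        else:
            lo = mid + 1      # count is in the upper half
    return lo


def make_oracle(true_count: int) -> RangeQuestion:
    """Stand-in for a model's answers; it is always correct, unlike a real MLLM."""
    return lambda low, high: low <= true_count <= high


estimate = narrow_count(make_oracle(37), upper=1000)
print(estimate)
```

With a perfectly reliable oracle the loop needs roughly log2(upper) questions. Any real system would also have to cope with wrong answers, which this sketch does not attempt.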
Taxonomy
Topics: Video Surveillance and Tracking Methods · Water Quality Monitoring Technologies · Fire Detection and Safety Systems
