T-Rex: Counting by Visual Prompting

Qing Jiang; Feng Li; Tianhe Ren; Shilong Liu; Zhaoyang Zeng; Kent Yu; Lei Zhang

arXiv:2311.13596 · cs.CV · November 23, 2023 · 5 cites

TL;DR

T-Rex is an interactive, open-set object counting model that detects and counts the objects a user marks with visual prompts (points or boxes), achieving state-of-the-art results on several class-agnostic counting benchmarks and strong zero-shot performance across diverse scenarios.

Contribution

The paper introduces T-Rex, an interactive object counting framework that formulates counting as open-set object detection driven by visual prompts, together with a new counting benchmark and a set of practical applications.

Findings

01. Achieves state-of-the-art performance on class-agnostic counting benchmarks.
02. Demonstrates exceptional zero-shot counting capabilities.
03. Effective in diverse real-world scenarios.

Abstract

We introduce T-Rex, an interactive object counting model designed to first detect and then count any objects. We formulate object counting as an open-set object detection task with the integration of visual prompts. Users can specify the objects of interest by marking points or boxes on a reference image, and T-Rex then detects all objects with a similar pattern. Guided by the visual feedback from T-Rex, users can also interactively refine the counting results by prompting on missing or falsely-detected objects. T-Rex has achieved state-of-the-art performance on several class-agnostic counting benchmarks. To further exploit its potential, we established a new counting benchmark encompassing diverse scenarios and challenges. Both quantitative and qualitative results show that T-Rex possesses exceptional zero-shot counting capabilities. We also present various practical application…
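The detect-then-count formulation in the abstract can be illustrated with a small sketch: once a detector has returned boxes for the prompted object, the count is simply the number of boxes that survive filtering. The code below is not T-Rex's implementation. The `Detection` type, the `count_by_detection` function, the score threshold, and the greedy overlap suppression are all assumptions introduced here for illustration, and the input boxes are made-up values standing in for a detector's output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: a box (x1, y1, x2, y2) and a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def iou(a: Detection, b: Detection) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_by_detection(detections, score_threshold=0.3, iou_threshold=0.5):
    """Return (count, kept_boxes) for confident, non-duplicate detections.

    Both thresholds are illustrative defaults, not values from the paper.
    """
    confident = [d for d in detections if d.score >= score_threshold]
    confident.sort(key=lambda d: d.score, reverse=True)
    kept = []
    for det in confident:
        # Greedy suppression: skip a box that heavily overlaps one already kept.
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return len(kept), kept


# Made-up detections for one image.
example = [
    Detection(0, 0, 10, 10, 0.9),
    Detection(1, 1, 11, 11, 0.8),        # near-duplicate of the first box
    Detection(50, 50, 60, 60, 0.7),
    Detection(100, 100, 110, 110, 0.1),  # below the score threshold
]
count, boxes = count_by_detection(example)
print(count)
```

Because the count is derived from explicit boxes rather than a density map, each counted object can be shown to the user, which is what makes the interactive correction of missed or falsely detected objects described in the abstract possible.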

Code & Models

Models

🤗 IDEA-Research/Rex-Omni · model · 27k downloads · 55 likes

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Visual Attention and Saliency Detection