SORCE: Small Object Retrieval in Complex Environments

Chunxu Liu; Chi Xie; Xiaxu Chen; Wei Li; Feng Zhu; Rui Zhao; Limin Wang

arXiv:2505.24441 · cs.CV · June 2, 2025

TL;DR

This paper introduces SORCE, a new benchmark and approach for small object retrieval in complex images, demonstrating that multi-embedding representations significantly improve retrieval performance over existing methods.

Contribution

The paper proposes a multi-embedding method for small object retrieval that uses multimodal large language models (MLLMs) and Regional Prompts, along with a new benchmark dataset, SORCE-1K.

Findings

01 Existing T2IR methods struggle with small objects in complex environments.

02 Multi-embedding representations outperform single-embedding approaches.

03 The proposed method achieves significant improvements on SORCE-1K.
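
To make the multi-embedding finding concrete, here is a minimal sketch of retrieval when each gallery image is stored as several embeddings instead of one. This page does not say how SORCE scores a query against an image's embeddings, so the max-similarity rule below is an assumption, and the names `cosine`, `score_image`, `rank_gallery`, and the toy vectors are hypothetical illustrations, not the authors' code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def score_image(query_vec, region_vecs):
    """Score an image by its best-matching embedding.

    Assumption: max similarity over the image's embeddings. The
    aggregation rule used in the paper is not stated on this page.
    """
    return max(cosine(query_vec, r) for r in region_vecs)

def rank_gallery(query_vec, gallery):
    """Return image ids sorted from best to worst match."""
    scores = {img: score_image(query_vec, vecs) for img, vecs in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy gallery: "street" has two region embeddings, "beach" has one.
gallery = {
    "street": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "beach": [[0.7, 0.0, 0.7]],
}
query = [0.0, 1.0, 0.0]  # stands in for a text query about a small object
ranking = rank_gallery(query, gallery)
print(ranking)
```

In this toy setup the query matches only the second region of "street", so a per-region score can surface the image even though the first region is unrelated to the query.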

Abstract

Text-to-Image Retrieval (T2IR) is a highly valuable task that aims to match a given textual query to images in a gallery. Existing benchmarks primarily focus on textual queries describing overall image semantics or foreground salient objects, possibly overlooking inconspicuous small objects, especially in complex environments. Such small object retrieval is crucial, as in real-world applications, the targets of interest are not always prominent in the image. Thus, we introduce SORCE (Small Object Retrieval in Complex Environments), a new subfield of T2IR, focusing on retrieving small objects in complex images with textual queries. We propose a new benchmark, SORCE-1K, consisting of images with complex environments and textual queries describing less conspicuous small objects with minimal contextual cues from other salient objects. Preliminary analysis on SORCE-1K finds that existing…

Code & Models

Repositories

mcg-nju/sorce (PyTorch, official)

Taxonomy

Topics: Web Data Mining and Analysis · Image Processing and 3D Reconstruction

Methods: Focus · Sparse Evolutionary Training