Structured Visual Search via Composition-aware Learning
Mert Kilickaya, Arnold W.M. Smeulders

TL;DR
This paper introduces a novel structured visual search method that leverages continuous relationships in object composition through symmetry and equivariance, resulting in more efficient and accurate search performance on large-scale benchmarks.
Contribution
The work proposes a symmetry-based learning approach that explicitly models continuous relationships in structured visual queries, improving search efficiency and accuracy.
Findings
Significant performance gains on MS-COCO and HICO-DET benchmarks.
Efficient learning from less data, using a smaller feature space.
Enhanced sensitivity to input transformations through equivariance.
Abstract
This paper studies visual search using structured queries. The structure is in the form of a 2D composition that encodes the position and the category of the objects. Transforming the position and the category of the objects yields a continuous-valued relationship between visual compositions, which carries highly beneficial information that previous techniques do not leverage. Our goal in this work is to leverage these continuous relationships by using the notion of symmetry in equivariance. Our model output is trained to change symmetrically with respect to the input transformations, leading to a sensitive feature space. Doing so leads to a highly efficient search technique, as our approach learns from less data using a smaller feature space. Experiments on two large-scale benchmarks, MS-COCO and HICO-DET, demonstrate that our approach leads to a…
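As a rough illustration of the idea in the abstract, the sketch below shows one way a continuous relationship between two compositions could be measured and used as a training target. It is not the authors' formulation: the cell-based rasterization, the intersection-over-union similarity, the cosine comparison, and all function names (render, composition_similarity, equivariance_loss) are assumptions made for this example only.

```python
import math

def render(boxes):
    """Rasterize (category, x0, y0, x1, y1) boxes into a set of occupied
    (category, x, y) grid cells. Hypothetical stand-in for a 2D composition."""
    cells = set()
    for cat, x0, y0, x1, y1 in boxes:
        for x in range(x0, x1):
            for y in range(y0, y1):
                cells.add((cat, x, y))
    return cells

def composition_similarity(a, b):
    """Continuous similarity in [0, 1]: intersection over union of occupied cells.
    Assumed measure; it changes smoothly with position and drops on category change."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
    return dot / norm if norm > 0 else 0.0

def equivariance_loss(f_a, f_b, target_sim):
    """Squared gap between feature similarity and composition similarity, so the
    feature space is pushed to change in step with the input transformation."""
    return (cosine(f_a, f_b) - target_sim) ** 2

# Toy compositions: one object of category 0, then shifted, then re-labelled.
query = render([(0, 0, 0, 4, 4)])
shifted = render([(0, 2, 0, 6, 4)])   # same category, moved 2 cells right
swapped = render([(1, 0, 0, 4, 4)])   # same position, different category

sim_shift = composition_similarity(query, shifted)  # partial overlap: 8 / 24
sim_swap = composition_similarity(query, swapped)   # no shared cells: 0.0

# Identical features for a pair with target similarity 1.0 incur no loss.
loss_same = equivariance_loss([1.0, 0.0], [1.0, 0.0], 1.0)
```

In this toy setting, a small shift keeps the target similarity above zero while a category swap drives it to zero, which is the kind of graded signal a binary match/no-match label would discard.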
