Descriptor transition tables for object retrieval using unconstrained cluttered video acquired using a consumer level handheld mobile device
Warren Rieutort-Louis, Ognjen Arandjelovic

TL;DR
This paper introduces a novel method for recognising objects in unconstrained video sequences using descriptor transition tables, enabling retrieval from large databases of cluttered video acquired with consumer mobile devices.
Contribution
The paper proposes a new recognition approach that models viewpoint changes with descriptor transition tables and Markov chains, suitable for mobile device video retrieval in cluttered environments.
Findings
Effective recognition despite background clutter
Handles large viewpoint variations
Validated on challenging mobile-acquired video dataset
Abstract
Visual recognition and vision-based retrieval of objects from large databases are tasks with a wide spectrum of potential applications. In this paper we propose a novel method for recognition from video sequences, suitable for retrieval from databases acquired in highly unconstrained conditions, e.g. using a consumer-level mobile device such as a phone. On the lowest level, we represent each sequence as a 3D mesh of densely packed local appearance descriptors. While image plane geometry is captured implicitly by a large overlap of neighbouring regions from which the descriptors are extracted, 3D information is extracted by means of a descriptor transition table, learnt from a single sequence for each known gallery object. These tables allow us to connect local descriptors along the third dimension (which corresponds to viewpoint changes), thus resulting in a set of variable length Markov chains for…
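The abstract is truncated here and does not say how the transition table is built, so the following is only a rough illustration of the general idea, not the authors' method. It assumes, hypothetically, that each local descriptor has already been quantised to an integer codeword and tracked across consecutive frames. The table then counts codeword-to-codeword transitions along each track, and a chain is scored with a first-order Markov model. The names build_transition_table and chain_log_likelihood, the add-one smoothing, and the toy data are all assumptions made for this sketch.

```python
import numpy as np

def build_transition_table(tracks, n_words, smoothing=1.0):
    """Count codeword transitions between consecutive frames, then row-normalise.

    tracks: list of integer codeword sequences, one per tracked image location
            (an assumed input format, not taken from the paper).
    """
    counts = np.full((n_words, n_words), smoothing, dtype=float)
    for track in tracks:
        for a, b in zip(track[:-1], track[1:]):
            counts[a, b] += 1.0
    # Each row becomes a probability distribution over the next codeword.
    return counts / counts.sum(axis=1, keepdims=True)

def chain_log_likelihood(chain, table):
    """Log-probability of a chain's transitions under a first-order Markov model."""
    return float(sum(np.log(table[a, b]) for a, b in zip(chain[:-1], chain[1:])))

# Toy gallery sequence: two tracks over a vocabulary of three codewords.
gallery_tracks = [[0, 1, 2, 2], [0, 1, 1, 2]]
table = build_transition_table(gallery_tracks, n_words=3)

# A chain that follows the learnt viewpoint ordering scores higher than its reverse.
forward_score = chain_log_likelihood([0, 1, 2], table)
reverse_score = chain_log_likelihood([2, 1, 0], table)
```

In this toy setting, a query chain would be compared against the table of each gallery object and assigned to the object giving the highest score. Variable-length chains, as mentioned in the abstract, would need a richer model than this fixed first-order one.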
