Ensembling object detectors for image and video data analysis

Kateryna Chumachenko; Jenni Raitoharju; Alexandros Iosifidis; Moncef Gabbouj

arXiv:2102.04798·cs.CV·February 10, 2021

TL;DR

This paper introduces an ensembling method for object detectors that enhances detection accuracy and bounding box precision in images and videos, with applications in annotation and tracking.

Contribution

It presents a novel ensembling approach for image and video object detection, including a tracking-based refinement scheme for videos.

Findings

1. Improved detection performance through ensembling.

2. Enhanced bounding box precision in images and videos.

3. Effective as a standalone detection or annotation framework.

Abstract

In this paper, we propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data. We further extend it to video data by proposing a two-stage tracking-based scheme for detection refinement. The proposed method can be used as a standalone approach for improving object detection performance, or as a part of a framework for faster bounding box annotation in unseen datasets, assuming that the objects of interest are those present in some common public datasets.
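To make the idea of ensembling detector outputs concrete, here is a minimal sketch of one generic way to fuse boxes from several detectors: cluster same-label boxes by IoU, keep clusters that enough detectors agree on, and average the coordinates weighted by confidence. The abstract does not specify the fusion rule, so this is not the authors' algorithm. The function names (`iou`, `ensemble_detections`), the thresholds (`iou_thr`, `min_votes`), and the score-weighted averaging are all assumptions chosen for illustration, and the tracking-based video refinement is not covered.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, float, str]        # (box, confidence score, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble_detections(
    per_detector: List[List[Detection]],
    iou_thr: float = 0.5,   # assumed threshold, not taken from the paper
    min_votes: int = 2,     # assumed: at least two detectors must agree
) -> List[Detection]:
    """Hypothetical fusion rule: greedy IoU clustering plus weighted averaging."""
    # Tag each detection with the index of the detector that produced it,
    # then visit detections from most to least confident.
    tagged = [(idx, det) for idx, dets in enumerate(per_detector) for det in dets]
    tagged.sort(key=lambda item: item[1][1], reverse=True)

    clusters: List[List[Tuple[int, Detection]]] = []
    for idx, det in tagged:
        for cluster in clusters:
            seed = cluster[0][1]  # highest-scoring member of the cluster
            if det[2] == seed[2] and iou(det[0], seed[0]) >= iou_thr:
                cluster.append((idx, det))
                break
        else:
            clusters.append([(idx, det)])

    fused: List[Detection] = []
    for cluster in clusters:
        # Count distinct detectors so one detector cannot vote twice.
        votes = len({idx for idx, _ in cluster})
        if votes < min_votes:
            continue
        total = sum(det[1] for _, det in cluster)
        box = tuple(
            sum(det[0][i] * det[1] for _, det in cluster) / total for i in range(4)
        )
        score = total / len(cluster)
        fused.append((box, score, cluster[0][1][2]))
    return fused


if __name__ == "__main__":
    detector_a = [((10.0, 10.0, 50.0, 50.0), 0.9, "car")]
    detector_b = [
        ((12.0, 12.0, 52.0, 52.0), 0.6, "car"),
        ((100.0, 100.0, 120.0, 120.0), 0.4, "dog"),
    ]
    for box, score, label in ensemble_detections([detector_a, detector_b]):
        print(label, round(score, 3), [round(v, 2) for v in box])
```

In this toy input both detectors find the car, so the two boxes are merged into one, while the dog box reported by a single detector is dropped by the `min_votes` filter. Requiring agreement trades recall for precision; setting `min_votes=1` would keep every cluster.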
