Fast and accurate object detection in high resolution 4K and 8K video using GPUs

Vít Růžička; Franz Franchetti

arXiv:1810.10551·cs.CV·February 1, 2019

TL;DR

This paper introduces a GPU-accelerated attention pipeline that detects objects in 4K and 8K video by evaluating each frame in two stages with YOLO v2, first at rough and then at refined resolution, so that fewer evaluations are needed while accuracy stays high.

Contribution

The paper presents a two-stage attention pipeline that limits the number of model evaluations needed per high-resolution frame and distributes the work across GPUs, maintaining high accuracy at an average of 3-6 fps on 4K video and 2 fps on 8K video.

Findings

01 Achieves 3-6 fps on 4K videos

02 Achieves 2 fps on 8K videos

03 Maintains high detection accuracy
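The reported frame rates can also be read as per-frame time budgets. The short sketch below only performs that conversion; the function name `frame_budget_ms` is a hypothetical helper introduced for illustration, and the fps values are the averages quoted above.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Average frame rates quoted in the findings (illustrative labels).
for label, fps in [("4K, lower bound", 3), ("4K, upper bound", 6), ("8K", 2)]:
    print(f"{label}: about {frame_budget_ms(fps):.0f} ms per frame")
```

Read this way, a 4K frame takes roughly 167 to 333 ms on average and an 8K frame roughly 500 ms.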

Abstract

Machine learning has celebrated a lot of achievements on computer vision tasks such as object detection, but the traditionally used models work with relatively low-resolution images. The resolution of recording devices is gradually increasing and there is a rising need for new methods of processing high-resolution data. We propose an attention pipeline method which uses a two-stage evaluation of each image or video frame under rough and refined resolution to limit the total number of necessary evaluations. For both stages, we make use of the fast object detection model YOLO v2. We have implemented our model in code, which distributes the work across GPUs. We maintain high accuracy while reaching the average performance of 3-6 fps on 4K video and 2 fps on 8K video.
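The two-stage idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tile size of 608, the grid layout, the rule "refine only tiles that a rough detection touches", and all function names are hypothetical, and the detectors are stand-in stubs rather than YOLO v2.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in full-resolution pixels


def grid_tiles(width: int, height: int, tile: int) -> List[Box]:
    """Cover the frame with tile-sized crops; edge tiles are clipped to the frame."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def attention_pipeline(width: int, height: int,
                       rough_detect: Callable[[int, int], List[Box]],
                       fine_detect: Callable[[Box], List[Box]],
                       tile: int = 608) -> Tuple[List[Box], List[Box]]:
    # Stage 1 (assumed): a cheap pass over a rough view of the whole frame.
    rough_boxes = rough_detect(width, height)
    # Keep only the tiles that some rough detection touches.
    active = [t for t in grid_tiles(width, height, tile)
              if any(intersects(t, b) for b in rough_boxes)]
    # Stage 2 (assumed): run the detector at refined resolution on active tiles only.
    detections: List[Box] = []
    for t in active:
        detections.extend(fine_detect(t))
    return active, detections


# Stand-in detectors so the sketch runs without a real model.
def stub_rough_detect(width: int, height: int) -> List[Box]:
    return [(100, 100, 700, 300)]  # one hypothetical rough detection


def stub_fine_detect(tile: Box) -> List[Box]:
    return [tile]  # pretend each active tile yields one detection


all_tiles = grid_tiles(3840, 2160, 608)
active, detections = attention_pipeline(3840, 2160, stub_rough_detect, stub_fine_detect)
print(f"{len(active)} of {len(all_tiles)} tiles evaluated at refined resolution")
```

With these stub inputs, only 2 of the 28 tiles covering a 3840x2160 frame reach the second stage, which is the sense in which a rough pass can limit the total number of evaluations.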

Code & Models

Repositories

previtus/AttentionPipeline (tf, Official)
