Physical Representation-based Predicate Optimization for a Visual Analytics Database

Michael R. Anderson; Michael Cafarella; German Ros; Thomas F. Wenisch

arXiv:1806.04226·cs.DB·April 23, 2019

TL;DR

This paper introduces Tahoma, a method that optimizes both CNN architectures and input data representations, significantly accelerating visual content queries with minimal accuracy loss.

Contribution

Tahoma jointly optimizes CNN models and input transformations, yielding up to a 98x speedup over ResNet50 in visual content classification with no accuracy loss.

Findings

01. Up to 35x speedup from input transformations

02. Up to 98x speedup over ResNet50 with no accuracy loss

03. 280x speedup with some accuracy trade-off

Abstract

Querying the content of images, video, and other non-textual data sources requires expensive content extraction methods. Modern extraction techniques are based on deep convolutional neural networks (CNNs) and can classify objects within images with astounding accuracy. Unfortunately, these methods are slow: processing a single image can take about 10 milliseconds on modern GPU-based hardware. As massive video libraries become ubiquitous, running a content-based query over millions of video frames is prohibitive. One promising approach to reduce the runtime cost of queries of visual content is to use a hierarchical model, such as a cascade, where simple cases are handled by an inexpensive classifier. Prior work has sought to design cascades that optimize the computational cost of inference by, for example, using smaller CNNs. However, we observe that there are critical factors besides…
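
The cascade idea mentioned in the abstract, where simple cases are handled by an inexpensive classifier and only uncertain ones reach a costlier model, can be illustrated with a minimal sketch. This is not Tahoma's implementation: the function `cascade_predict`, the toy models `cheap_model` and `expensive_model`, the thresholds 0.2 and 0.8, and the use of plain floats as stand-ins for images are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical types: a classifier maps an input to a score in [0, 1];
# a stage pairs a classifier with a low and a high decision threshold.
Classifier = Callable[[float], float]
Stage = Tuple[Classifier, float, float]


def cascade_predict(
    x: float, stages: Sequence[Stage], final: Classifier
) -> Tuple[bool, int]:
    """Return (label, number_of_models_run) for input x.

    Each stage answers only when its score is confidently low or high;
    otherwise the input falls through to the next, costlier model.
    """
    for models_run, (model, low, high) in enumerate(stages, start=1):
        score = model(x)
        if score <= low:
            return False, models_run  # confident negative: stop early
        if score >= high:
            return True, models_run  # confident positive: stop early
    # No stage was confident, so the final model decides.
    return final(x) >= 0.5, len(stages) + 1


def cheap_model(x: float) -> float:
    # Toy stand-in for a small, fast CNN: the input is already a score.
    return x


def expensive_model(x: float) -> float:
    # Toy stand-in for a large, slow CNN such as ResNet50.
    return 1.0 if x > 0.5 else 0.0


# One cheap stage with assumed thresholds, followed by the expensive model.
STAGES = [(cheap_model, 0.2, 0.8)]

if __name__ == "__main__":
    for x in (0.1, 0.4, 0.6, 0.9):
        print(x, cascade_predict(x, STAGES, expensive_model))
```

In this toy setup, inputs the cheap model scores at or below 0.2 or at or above 0.8 never reach the expensive model. Widening the gap between the two thresholds sends more inputs to the final model, which is the cost and accuracy trade-off a cascade exposes.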
