Scalable Pooled Time Series of Big Video Data from the Deep Web

Chris Mattmann; Madhav Sharan

arXiv:1610.06669·cs.CV·October 24, 2016

TL;DR

This paper presents a scalable Hadoop-based implementation of the Pooled Time Series algorithm, enabling analysis of large-scale video datasets from the deep web, with applications in human trafficking investigations.

Contribution

The paper introduces a parallelized, Hadoop-based version of Pooled Time Series that scales the algorithm to large video datasets.

Findings

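1. The Hadoop-based algorithm works well on a dataset of approximately 6800 videos.

2. The implementation shares all of the properties of the original algorithm described in the CVPR 2015 paper.

3. Issues encountered when running Pooled Time Series on larger datasets are identified, and solutions are discussed.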

Abstract

We contribute a scalable implementation of Ryoo et al.'s Pooled Time Series algorithm from CVPR 2015. The updated algorithm has been evaluated on a large and diverse dataset of approximately 6800 videos collected from a crawl of the deep web related to human trafficking on DARPA's MEMEX effort. We describe the properties of Pooled Time Series and the motivation for using it to relate videos collected from the deep web. We highlight issues that we found while running Pooled Time Series on larger datasets and discuss solutions for those issues. Our solution centers on re-imagining Pooled Time Series as a Hadoop-based algorithm in which we compute portions of the eventual solution in parallel on large commodity clusters. We demonstrate that our new Hadoop-based algorithm works well on the 6800-video dataset and shares all of the properties described in the CVPR 2015 paper. We suggest…
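
The idea of computing portions of the solution in parallel and then combining them can be illustrated with a small sketch. This is not the authors' Hadoop implementation: it uses a local thread pool as a stand-in for a cluster, and the function names (`pool_time_series`, `cosine`, `similarity_matrix`), the pooling operators, and the cosine similarity measure are all assumptions chosen for illustration, not the operators from the CVPR 2015 paper. Each video is assumed to be already reduced to a non-empty list of per-frame numbers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import sqrt


def pool_time_series(series):
    """Map step (illustrative): collapse one video's per-frame values
    into a fixed-length feature vector. Assumes a non-empty series."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return [
        sum(series),                      # sum pooling
        max(series),                      # max pooling
        sum(d for d in diffs if d > 0),   # total frame-to-frame increase
        sum(-d for d in diffs if d < 0),  # total frame-to-frame decrease
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(videos, workers=4):
    """Compute per-video features in parallel, then combine them into
    pairwise similarities (the 'reduce' side of the sketch)."""
    names = sorted(videos)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        vectors = pool.map(pool_time_series, (videos[n] for n in names))
        features = dict(zip(names, vectors))
    return {
        (a, b): cosine(features[a], features[b])
        for a, b in combinations(names, 2)
    }
```

Because each video's feature vector depends only on that video, the map step can be distributed freely; only the pairwise comparison needs the results gathered in one place. Calling `similarity_matrix({"a": [1, 2, 3], "b": [2, 4, 6], "c": [3, 2, 1]})` returns one score per unordered pair of video names.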

Taxonomy

Topics: Time Series Analysis and Forecasting · Data Visualization and Analytics · Video Analysis and Summarization