Extract-Transform-Load for Video Streams

Ferdinand Kossmann; Ziniu Wu; Eugenie Lai; Nesime Tatbul; Lei Cao; Tim Kraska; Samuel Madden

arXiv:2310.04830·cs.DB·June 21, 2024

TL;DR

This paper introduces Skyscraper, a system for cost-effective, scalable video data transformation and querying, which adaptively optimizes ingestion pipelines to reduce costs while maintaining throughput and quality guarantees.

Contribution

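The paper presents Skyscraper, a system that manages large-scale video ingestion efficiently through adaptive pipeline tuning and cloud bursting. It addresses a gap in existing systems, none of which both reduces the cost of a V-ETL job and guarantees the throughput needed to keep up with incoming video.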

Findings

01 Skyscraper reduces V-ETL ingestion costs significantly.

02 It maintains throughput and quality guarantees during variable workloads.

03 The system effectively balances on-premises and cloud resources.

Abstract

Social media, self-driving cars, and traffic cameras produce video streams at large scales and cheap cost. However, storing and querying video at such scales is prohibitively expensive. We propose to treat large-scale video analytics as a data warehousing problem: Video is a format that is easy to produce but needs to be transformed into an application-specific format that is easy to query. Analogously, we define the problem of Video Extract-Transform-Load (V-ETL). V-ETL systems need to reduce the cost of running a user-defined V-ETL job while also giving throughput guarantees to keep up with the rate at which data is produced. We find that no current system sufficiently fulfills both needs and therefore propose Skyscraper, a system tailored to V-ETL. Skyscraper can execute arbitrary video ingestion pipelines and adaptively tunes them to reduce cost at minimal or no quality degradation,…

Code & Models

Repositories

ferdiko/vetl (PyTorch, official)
