TransNet V2: An effective deep network architecture for fast shot transition detection

Tomáš Souček; Jakub Lokoč

arXiv:2008.04838·cs.CV·August 12, 2020·55 cites

Open Access · 4 Repos · 1 Dataset

TL;DR

TransNet V2 is a deep network for fast shot transition detection that reaches state-of-the-art performance on established benchmarks. A trained model is provided for efficient analysis of large video archives, together with details of the architecture and training.

Contribution

The paper introduces TransNet V2, a deep network architecture that improves shot transition detection accuracy, and provides practical implementation details and a trained model for community use.

Findings

01 Achieves state-of-the-art performance on benchmark datasets

02 Provides a trained model for instant application

03 Details architecture and training process for reproducibility

Abstract

Although automatic shot transition detection approaches have been investigated for more than two decades, an effective, universal, human-level model has not yet been proposed. Even for common shot transitions like hard cuts or simple gradual changes, the potential diversity of analyzed video content may still lead to both false hits and false dismissals. Recently, deep learning-based approaches have significantly improved the accuracy of shot transition detection using 3D convolutional architectures and artificially created training data. Nevertheless, one hundred percent accuracy is still an unreachable ideal. In this paper, we share the current version of our deep network TransNet V2, which reaches state-of-the-art performance on respected benchmarks. A trained instance of the model is provided so it can be instantly utilized by the community for a highly efficient analysis of large video…

Code & Models

Repositories

Datasets

sprited/sprite-dx-data · dataset · 68 dl

Taxonomy

Topics: Video Analysis and Summarization · Human Pose and Action Recognition · Advanced Vision and Imaging