Semi-supervised 3D Video Information Retrieval with Deep Neural Network and Bi-directional Dynamic-time Warping Algorithm
Yintai Ma, Diego Klabjan

TL;DR
This paper introduces a semi-supervised deep neural network approach combined with bi-directional dynamic time warping for efficient and accurate retrieval of similar 2D and 3D videos from large datasets, outperforming existing methods.
Contribution
It proposes a novel semi-supervised deep learning algorithm integrating autoencoders and bi-directional dynamic time warping for improved video retrieval accuracy.
Findings
Outperforms state-of-the-art video retrieval models
Effective on multiple public datasets including CC_WEB_VIDEO and YouTube-8M
Handles large-scale video datasets efficiently
Abstract
This paper presents a novel semi-supervised deep learning algorithm for retrieving similar 2D and 3D videos based on visual content. The proposed approach combines the power of deep convolutional and recurrent neural networks with dynamic time warping as a similarity measure. The proposed algorithm is designed to handle large video datasets and retrieve the most related videos to a given inquiry video clip based on its graphical frames and contents. We split both the candidate and the inquiry videos into a sequence of clips and convert each clip to a representation vector using an autoencoder-backed deep neural network. We then calculate a similarity measure between the sequences of embedding vectors using a bi-directional dynamic time-warping method. This approach is tested on multiple public datasets, including CC_WEB_VIDEO, YouTube-8M, S3DIS, and Synthia, and showed good results…
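The retrieval pipeline described in the abstract (split into clips, embed each clip, compare the embedding sequences with time warping) can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: clip embeddings are already computed (the autoencoder is omitted), the local cost is Euclidean distance, and the alignment is classic dynamic time warping. The paper's bi-directional variant is not specified in this summary and is not reproduced here. The names `split_into_clips`, `dtw_distance`, and `rank_candidates` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def split_into_clips(frames: Sequence, clip_len: int) -> List[Sequence]:
    """Cut a frame sequence into consecutive, non-overlapping clips.

    The last clip may be shorter than clip_len. (Assumed clip scheme.)
    """
    return [frames[i:i + clip_len] for i in range(0, len(frames), clip_len)]


def dtw_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Classic DTW cost between two sequences of embedding vectors.

    Uses Euclidean distance as the local cost (an assumption) and the
    standard three-way recurrence with fixed start and end points.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def rank_candidates(
    query: Sequence[Vector], candidates: Dict[str, Sequence[Vector]]
) -> List[str]:
    """Return candidate names sorted from most to least similar to the query."""
    return sorted(candidates, key=lambda name: dtw_distance(query, candidates[name]))


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real clip embeddings would come from the encoder.
    query = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
    candidates = {
        "slower_copy": [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
        "unrelated": [[5.0, 5.0], [6.0, 5.0], [7.0, 5.0]],
    }
    print(rank_candidates(query, candidates))
```

Because the warping path may repeat elements, a candidate that shows the same content at a different pace (here `slower_copy`) still aligns with zero cost, which is the property that makes time warping attractive for comparing videos of different lengths.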
Taxonomy
Topics: Video Analysis and Summarization · Music and Audio Processing · Advanced Image and Video Retrieval Techniques
