ViP: Video Platform for PyTorch
Madan Ravi Ganesh, Eric Hofesmann, Nathan Louis, Jason Corso

TL;DR
ViP is a flexible, modular deep learning platform built on PyTorch that simplifies video model development, supports large-batch processing, and promotes cross-domain research in video understanding.
Contribution
Introduces ViP, a unified, extensible platform for video deep learning that enhances prototyping, efficiency, and reproducibility across various video problem domains.
Findings
Supports all video problem domains with a single interface
Enables quick prototyping of video models
Reduces memory usage for large-batch operations
Abstract
This work presents the Video Platform for PyTorch (ViP), a deep learning framework designed to handle, and extend to, any video-based problem domain. ViP supports (1) a single unified interface applicable to all video problem domains, (2) quick prototyping of video models, (3) large-batch operations with reduced memory consumption, and (4) easy and reproducible experimental setups. ViP's core functionality is built with flexibility and modularity in mind, allowing smooth data flow between different parts of the platform and benchmarking against existing methods. By providing a software platform that supports multiple video-based problem domains, we enable more cross-pollination of models and ideas, and stronger generalization, in the video understanding research community.
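The abstract does not say how large-batch operations are run with reduced memory. One common technique for this, assumed here purely for illustration, is gradient accumulation: a large batch is split into micro-batches, gradients are summed across them, and the result is normalized once. The sketch below uses a toy linear model in plain Python to show that the accumulated gradient matches the full-batch gradient. All function names are hypothetical and are not part of ViP.

```python
# Illustrative sketch only: gradient accumulation on a toy model y ~ w*x + b.
# None of these names come from ViP; they are assumptions for this example.

def sse_grad(w, b, xs, ys):
    """Gradient of the SUM of squared errors over one batch (not averaged)."""
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys))
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys))
    return gw, gb


def full_batch_grad(w, b, xs, ys):
    """Mean-squared-error gradient computed on the whole batch at once."""
    gw, gb = sse_grad(w, b, xs, ys)
    n = len(xs)
    return gw / n, gb / n


def accumulated_grad(w, b, xs, ys, micro_batch_size):
    """Same gradient, but only one micro-batch is processed at a time."""
    gw_total, gb_total = 0.0, 0.0
    for start in range(0, len(xs), micro_batch_size):
        chunk_x = xs[start:start + micro_batch_size]
        chunk_y = ys[start:start + micro_batch_size]
        gw, gb = sse_grad(w, b, chunk_x, chunk_y)
        gw_total += gw  # accumulate instead of updating the parameters
        gb_total += gb
    n = len(xs)  # normalize once by the full (effective) batch size
    return gw_total / n, gb_total / n
```

Only one micro-batch needs to be held in memory at a time, so peak memory depends on the micro-batch size while the parameter update corresponds to the full batch. In a real training loop the same idea applies to the activations stored for backpropagation.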
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
