TennisVid2Text: Fine-grained Descriptions for Domain Specific Videos

Mohak Sukhwani; C.V. Jawahar

arXiv:1511.08522·cs.CV·November 30, 2015·1 cites

TL;DR

This paper presents TennisVid2Text, a system for generating detailed, human-like textual descriptions of lawn tennis match videos by analyzing actions and interactions, leveraging a large corpus of human descriptions.

Contribution

The paper introduces a domain-specific approach to detailed video description in sports, using a large corpus of human-written descriptions and low-level video analysis to improve semantic richness and readability.

Findings

01. Effective in generating semantically rich descriptions

02. Addresses both correctness and readability

03. Evaluated on a new tennis video dataset

Abstract

Automatically describing videos has long been fascinating. In this work, we attempt to describe videos from a specific domain: broadcast videos of lawn tennis matches. Given a video shot from a tennis match, we intend to generate a textual commentary similar to what a human expert would write on a sports website. Unlike many recent works that focus on generating short captions, we are interested in generating semantically richer descriptions. This demands a detailed low-level analysis of the video content, especially the actions and interactions among subjects. We address this by limiting our domain to the game of lawn tennis. Rich descriptions are generated by leveraging a large corpus of human-created descriptions harvested from the Internet. We evaluate our method on a newly created tennis video data set. Extensive analysis demonstrates that our approach addresses both semantic…

Taxonomy

Topics: Video Analysis and Summarization · Human Pose and Action Recognition · Multimodal Machine Learning Applications