UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos

Yuting Mei; Linli Yao; Qin Jin

arXiv:2406.16301 · cs.CV · June 25, 2024

TL;DR

UBiSS introduces a unified framework for bimodal video summarization that generates both visual and textual summaries simultaneously, leveraging a large-scale dataset and a novel evaluation metric to improve semantic content preservation.

Contribution

The paper presents a new large-scale dataset BIDS and a unified model UBiSS for bimodal semantic video summarization, advancing beyond traditional unimodal and multi-stage methods.

Findings

01. UBiSS outperforms multi-stage pipelines in summarization quality.

02. The BIDS dataset effectively captures salient content in long videos.

03. The proposed NDCG_MS metric provides comprehensive evaluation of bimodal summaries.
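This page does not give the definition of NDCG_MS, so the following is only a minimal sketch of the standard NDCG measure that the metric's name suggests it builds on. It is not the paper's metric. The function names `dcg` and `ndcg`, and the idea of scoring video segments against annotated saliency values, are illustrative assumptions.

```python
import math
from typing import Optional, Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of relevances listed in ranked order."""
    # Rank 0 gets discount log2(2) = 1, rank 1 gets log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(
    predicted_scores: Sequence[float],
    true_relevance: Sequence[float],
    k: Optional[int] = None,
) -> float:
    """Standard NDCG@k (hypothetical helper, NOT the paper's NDCG_MS).

    predicted_scores: model scores per item (e.g. per video segment, assumed).
    true_relevance:   ground-truth relevance per item (e.g. saliency, assumed).
    """
    if len(predicted_scores) != len(true_relevance):
        raise ValueError("score and relevance lists must have equal length")
    # Order items by predicted score, highest first.
    order = sorted(
        range(len(predicted_scores)),
        key=lambda i: predicted_scores[i],
        reverse=True,
    )
    ranked = [true_relevance[i] for i in order][:k]
    # The ideal ranking sorts items by their true relevance.
    ideal = sorted(true_relevance, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A ranking that orders items exactly by their true relevance scores 1.0, and any ranking that places a lower-relevance item above a higher-relevance one scores below that.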

Abstract

With the surge in video data, video summarization techniques, including visual-modal (VM) and textual-modal (TM) summarization, are attracting increasing attention. However, unimodal summarization inevitably loses the rich semantics of the video. In this paper, we focus on a more comprehensive video summarization task named Bimodal Semantic Summarization of Videos (BiSSV). Specifically, we first construct a large-scale dataset, BIDS, in (video, VM-Summary, TM-Summary) triplet format. Unlike traditional processing methods, our construction procedure contains a VM-Summary extraction algorithm that aims to preserve the most salient content within long videos. Based on BIDS, we propose a unified framework, UBiSS, for the BiSSV task, which models the saliency information in the video and generates a TM-Summary and a VM-Summary simultaneously. We further optimize our model with a…
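The (video, VM-Summary, TM-Summary) triplet format described in the abstract can be pictured with a small record type. This is a sketch under stated assumptions, not the BIDS schema: the class name `BiSSVSample`, its field names, and the choice to represent the VM-Summary as time segments are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BiSSVSample:
    """One hypothetical (video, VM-Summary, TM-Summary) triplet."""

    video_id: str
    # Assumed representation: selected (start_sec, end_sec) segments.
    vm_summary: List[Tuple[float, float]]
    # Textual summary of the video.
    tm_summary: str

    def vm_duration(self) -> float:
        """Total length in seconds of the selected visual segments."""
        return sum(end - start for start, end in self.vm_summary)


sample = BiSSVSample(
    video_id="example_video",
    vm_summary=[(0.0, 2.0), (10.0, 13.5)],
    tm_summary="A short textual summary of the video.",
)
print(sample.vm_duration())
```

Keeping both summaries in one record reflects the task setup in the abstract, where a single model produces the two outputs together instead of in separate stages.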

Code & Models

Repositories

meiyutingg/ubiss
PyTorch · Official
