LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos

Jielin Qiu; Franck Dernoncourt; Trung Bui; Zhaowen Wang; Ding Zhao; Hailin Jin

arXiv:2210.05840 · cs.CV · October 13, 2022

TL;DR

This paper introduces LiveSeg, an unsupervised multimodal method that segments long Livestream videos by topic. It reports a 16.8% F1-score improvement over the state of the art, and the resulting segments help viewers grasp lengthy tutorial content quickly.

Contribution

The paper presents MultiLive, a new large Livestream video dataset, and LiveSeg, a novel unsupervised segmentation method that leverages multimodal features and outperforms existing methods.

Findings

1. Achieved a 16.8% F1-score improvement over the state of the art.
2. Created the MultiLive dataset for Livestream video analysis.
3. Demonstrated the effectiveness of multimodal features for temporal segmentation.

Abstract

Livestream videos have become a significant part of online learning, where experienced experts teach design, digital marketing, creative painting, and other skills in live sessions, making the recordings valuable learning materials. However, Livestream tutorial videos are usually hours long and are recorded and uploaded to the Internet directly after the live sessions, making it hard for other people to catch up quickly. An outline would be a beneficial solution, which requires the video to be temporally segmented according to topics. In this work, we introduce a large Livestream video dataset named MultiLive and formulate the task of temporal segmentation of long Livestream videos (TSLLV). We propose LiveSeg, an unsupervised Livestream video temporal Segmentation solution, which takes advantage of multimodal features from different domains. Our method achieved a 16.8% F1-score performance…

Taxonomy

Topics: Video Analysis and Summarization · Image and Video Quality Assessment · Visual Attention and Saliency Detection