LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos
Jielin Qiu, Franck Dernoncourt, Trung Bui, Zhaowen Wang, Ding Zhao, Hailin Jin

TL;DR
This paper introduces LiveSeg, an unsupervised multimodal approach for segmenting long Livestream videos into topics. It reports a 16.8% F1-score improvement over the prior state of the art and is intended to help viewers quickly grasp lengthy tutorial content.
Contribution
The paper presents a new large Livestream video dataset MultiLive and a novel unsupervised segmentation method that leverages multimodal features, outperforming existing methods.
Findings
Achieved a 16.8% F1-score improvement over the state of the art.
Created the MultiLive dataset for Livestream video analysis.
Demonstrated effectiveness of multimodal features in segmentation.
Abstract
Livestream videos have become a significant part of online learning, where design, digital marketing, creative painting, and other skills are taught by experienced experts in the sessions, making them valuable materials. However, Livestream tutorial videos are usually hours long, recorded, and uploaded to the Internet directly after the live sessions, making it hard for other people to catch up quickly. An outline would be a beneficial solution, which requires the video to be temporally segmented according to topics. In this work, we introduce a large Livestream video dataset named MultiLive and formulate the temporal segmentation of long Livestream videos (TSLLV) task. We propose LiveSeg, an unsupervised Livestream video temporal Segmentation solution, which takes advantage of multimodal features from different domains. Our method achieved an F1-score performance…
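To make the F1-score mentioned above concrete, here is a minimal sketch of how an F1-score over topic boundaries can be computed for a temporal segmentation. It is not taken from the paper: the function name `boundary_f1`, the greedy one-to-one matching, and the `tolerance` window in seconds are all assumptions made for illustration, and the paper's actual evaluation protocol may differ.

```python
def boundary_f1(predicted, reference, tolerance=0.0):
    """F1 between predicted and reference boundary times, in seconds.

    Hypothetical helper for illustration only. A predicted boundary counts
    as a true positive if it lies within `tolerance` seconds of a reference
    boundary that has not already been matched.
    """
    unmatched = sorted(reference)
    true_pos = 0
    for p in sorted(predicted):
        # Find the nearest still-unmatched reference boundary within tolerance.
        best = None
        for r in unmatched:
            if abs(p - r) <= tolerance and (
                best is None or abs(p - r) < abs(p - best)
            ):
                best = r
        if best is not None:
            unmatched.remove(best)  # each reference boundary matches once
            true_pos += 1

    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up boundary times (seconds) for a long video, not data from the paper.
predicted = [60, 300, 905]
reference = [58, 310, 900, 1500]
score = boundary_f1(predicted, reference, tolerance=10)
```

With these made-up inputs, all three predicted boundaries fall within 10 seconds of a reference boundary, while one reference boundary is missed, so precision is 1.0 and recall is 0.75.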
Taxonomy
Topics: Video Analysis and Summarization · Image and Video Quality Assessment · Visual Attention and Saliency Detection
