A Large-scale Dataset with Behavior, Attributes, and Content of Mobile Short-video Platform
Yu Shang, Chen Gao, Nian Li, Yong Li

TL;DR
This paper introduces a comprehensive large-scale dataset from a mobile short-video platform, capturing user behavior, attributes, and content, to facilitate research in recommendation systems, social science, and human behavior analysis.
Contribution
The paper provides a large-scale dataset that addresses three gaps in existing short-video datasets: inadequate user-video feedback, limited user attributes, and missing video content. The dataset is validated through four technical assessments.
Findings
The dataset covers 10,000 users and 153,561 videos.
Benchmarking of recommendation algorithms demonstrates the dataset's utility.
The dataset is used to analyze the filter bubble phenomenon.
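To make the filter bubble finding concrete, here is a minimal sketch of one common proxy for it: the diversity of video categories in a user's watch history. The summary does not say which measure the authors use, so the two functions, the category labels, and the toy histories below are all illustrative assumptions, not the paper's method or the dataset's schema.

```python
import math
from collections import Counter


def category_entropy(categories):
    """Shannon entropy (in bits) of the category distribution in a watch history.

    Lower entropy means the history is concentrated in fewer categories,
    which is one possible signal of a filter bubble.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def category_coverage(categories, n_categories):
    """Fraction of all available categories that appear in a watch history."""
    return len(set(categories)) / n_categories


# Hypothetical toy data: one user's watched-video categories in two periods.
early_history = ["food", "sports", "music", "travel"]
late_history = ["food", "food", "food", "sports"]

early_entropy = category_entropy(early_history)
late_entropy = category_entropy(late_history)
late_coverage = category_coverage(late_history, n_categories=4)
```

Under this proxy, entropy or coverage that falls over time for a user would suggest narrowing exposure. Whether that matches the paper's analysis would need to be checked against the full text.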
Abstract
Short-video platforms show an increasing impact on people's daily lives nowadays, with billions of active users spending plenty of time each day. The interactions between users and online platforms give rise to many scientific problems across computational social science and artificial intelligence. However, despite the rapid development of short-video platforms, currently there are serious shortcomings in existing relevant datasets on three aspects: inadequate user-video feedback, limited user attributes and lack of video content. To address these problems, we provide a large-scale dataset with rich user behavior, attributes and video content from a real mobile short-video platform. This dataset covers 10,000 voluntary users and 153,561 videos, and we conduct four-fold technical validations of the dataset. First, we verify the richness of the behavior and attribute data. Second, we…
Taxonomy
Topics: Advanced Computing and Algorithms
