Multi-Modal Machine Learning for Assessing Gaming Skills in Online Streaming: A Case Study with CS:GO
Longxiang Zhang, Wenping Wang

TL;DR
This paper explores multi-modal machine learning techniques to assess gaming skills from online streaming videos, focusing on vision, audio, and text modalities, with a case study on CS:GO.
Contribution
It introduces new end-to-end models for joint multi-modal representation and addresses dataset flaws, demonstrating their effectiveness in skill assessment.
Findings
Proposed models improve multi-modal representation learning.
Dataset flaws significantly impact model performance.
Models tend to identify users rather than learn meaningful skills.
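The last finding suggests that evaluation splits should keep each streamer's clips together, so a model cannot score well simply by recognizing who is playing. The snippet below is a minimal sketch of such a user-disjoint split. It is not the authors' procedure. The function name `user_disjoint_split`, the `(user_id, clip)` sample layout, and the 20% test fraction are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def user_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split (user_id, clip) pairs so that no user appears in both sets.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Group every sample under the user who produced it.
    by_user = defaultdict(list)
    for user_id, clip in samples:
        by_user[user_id].append((user_id, clip))

    # Shuffle users (not clips) with a fixed seed for reproducibility.
    users = sorted(by_user)
    random.Random(seed).shuffle(users)

    # Hold out a fraction of the users, at least one.
    n_test = max(1, round(len(users) * test_fraction))
    test_users = set(users[:n_test])

    train = [s for u in users if u not in test_users for s in by_user[u]]
    test = [s for u in users if u in test_users for s in by_user[u]]
    return train, test


# Toy data: 10 hypothetical users with 3 clips each.
samples = [(f"user{u}", f"clip{u}_{i}") for u in range(10) for i in range(3)]
train, test = user_disjoint_split(samples)
```

If accuracy drops sharply when moving from a random clip-level split to a split like this one, that would be consistent with the model relying on user identity rather than on skill-related cues.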
Abstract
Online streaming is an emerging market that attracts much attention. Assessing gaming skills from videos is an important task for streaming service providers to discover talented gamers. Service providers require this information to offer customized recommendations and service promotions to their customers. Meanwhile, this is also an important multi-modal machine learning task, since online streaming combines vision, audio and text modalities. In this study we begin by identifying flaws in the dataset and proceed to clean it manually. Then we propose several variants of the latest end-to-end models to learn a joint representation of multiple modalities. Through extensive experimentation, we demonstrate the efficacy of our proposals. Moreover, we find that our proposed models are prone to identifying users instead of learning meaningful representations. We propose future work to address the…
Taxonomy
Topics: Digital Games and Media · Video Analysis and Summarization · Gambling Behavior and Treatments
