Live or Lie: Action-Aware Capsule Multiple Instance Learning for Risk Assessment in Live Streaming Platforms

Yiran Qiao; Jing Chen; Xiang Ao; Qiwei Zhong; Yang Liu; Qing He

arXiv:2602.03520·cs.LG·February 12, 2026

Open Access

TL;DR

This paper introduces AC-MIL, an action-aware capsule multiple instance learning framework for risk assessment on live streaming platforms. It detects coordinated malicious behaviors, provides interpretable evidence, and reports state-of-the-art accuracy.

Contribution

The paper formulates live stream risk detection as a weakly supervised MIL problem and proposes AC-MIL, an interpretable framework that models both individual and group behaviors.

Findings

01

AC-MIL outperforms existing MIL and sequential models on large-scale datasets.

02

It provides interpretable evidence at the behavior segment level.

03

It achieves state-of-the-art accuracy in room-level risk assessment.

Abstract

Live streaming has become a cornerstone of today's internet, enabling massive real-time social interactions. However, it faces severe risks arising from sparse, coordinated malicious behaviors among multiple participants, which are often concealed within normal activity and are difficult to detect promptly and accurately. In this work, we provide a pioneering study on risk assessment in live streaming rooms, a setting characterized by weak supervision in which only room-level labels are available. We formulate the task as a Multiple Instance Learning (MIL) problem, treating each room as a bag and defining structured user-timeslot capsules as instances. These capsules represent subsequences of user actions within specific time windows, encapsulating localized behavioral patterns. Based on this formulation, we propose AC-MIL, an Action-aware Capsule MIL framework that models both individual behaviors…
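The bag-and-capsule formulation in the abstract can be illustrated with a minimal sketch. It covers only the data layout (room as bag, user-timeslot capsule as instance), not the AC-MIL model itself. The `Action` record, its field names, the `build_bags` helper, and the 60-second window are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict
from typing import NamedTuple


class Action(NamedTuple):
    """Hypothetical action log record; field names are assumptions."""
    room_id: str
    user_id: str
    timestamp: float  # seconds since the stream started
    action: str


def build_bags(actions, window=60.0):
    """Group actions into bags: room -> {(user, timeslot): [action, ...]}.

    Each (user, timeslot) key is one capsule, i.e. one MIL instance.
    The window length is an assumed value, not taken from the paper.
    """
    bags = defaultdict(lambda: defaultdict(list))
    # Sort by time so each capsule holds its actions in chronological order.
    for a in sorted(actions, key=lambda a: a.timestamp):
        slot = int(a.timestamp // window)
        bags[a.room_id][(a.user_id, slot)].append(a.action)
    return {room: dict(capsules) for room, capsules in bags.items()}


actions = [
    Action("r1", "u1", 5.0, "comment"),
    Action("r1", "u1", 30.0, "gift"),
    Action("r1", "u2", 42.0, "follow"),
    Action("r1", "u1", 75.0, "comment"),
    Action("r2", "u3", 10.0, "comment"),
]
bags = build_bags(actions)

# Weak supervision: labels exist per room (bag), never per capsule.
room_labels = {"r1": 1, "r2": 0}
```

In this layout a model only ever sees `room_labels` during training, so any capsule-level evidence has to be inferred from how instances contribute to the bag-level prediction.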

Taxonomy

Topics: Human Pose and Action Recognition · Digital Mental Health Interventions · Mobile Crowdsensing and Crowdsourcing