Machine learning and semantic analysis of in-game chat for cyberbullying

Shane Murnion; William J. Buchanan; Adrian Smales; Gordon Russell

arXiv:1907.10855·cs.CR·July 26, 2019

TL;DR

This study develops an automated system to collect and analyze in-game chat data from World of Tanks, identifying cyberbullying patterns and proposing mitigation strategies, with findings on the effectiveness of simple classification methods and player behavior insights.

Contribution

Introduces a novel automated data collection and classification framework for cyberbullying detection in online gaming, comparing SQL-based and AI sentiment analysis methods.

Findings

01 SQL classification effectively detects toxic language

02 AI sentiment analysis underperforms in this context

03 Player behavior analysis suggests new players are less likely to cyberbully

Abstract

One major problem with cyberbullying research is the lack of data, since researchers are traditionally forced to rely on survey data where victims and perpetrators self-report their impressions. In this paper, an automatic data collection system is presented that continuously collects in-game chat data from one of the most popular online multi-player games: World of Tanks. The collected data was combined with other information about the players from available online data services. The paper presents a scoring scheme, based on current research, to enable identification of cyberbullying. The collected data was classified using simple feature detection with SQL database queries. These results were compared with classifications from recently available AI-based sentiment text analysis services, and further against data classified manually using a custom-built classification client…
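
To make "simple feature detection with SQL database queries" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a hypothetical `chat` table, a hypothetical `toxic_terms` table, and an illustrative term list. The paper's actual schema, features, and scoring scheme are not given in this summary.

```python
import sqlite3

# Hypothetical term list for illustration only; not taken from the paper.
TOXIC_TERMS = ["idiot", "noob", "uninstall"]

# Hypothetical sample rows: (message id, player name, chat message).
SAMPLE_ROWS = [
    (1, "player_a", "good game everyone"),
    (2, "player_b", "You IDIOT, uninstall the game"),
    (3, "player_c", "push the left flank"),
]


def classify_messages(rows, terms=TOXIC_TERMS):
    """Return {message_id: flagged} using a substring-matching SQL query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: one row per chat message.
        conn.execute(
            "CREATE TABLE chat (id INTEGER PRIMARY KEY, player TEXT, message TEXT)"
        )
        conn.executemany(
            "INSERT INTO chat (id, player, message) VALUES (?, ?, ?)", rows
        )
        # Assumed schema: one row per term treated as a toxicity feature.
        conn.execute("CREATE TABLE toxic_terms (term TEXT)")
        conn.executemany(
            "INSERT INTO toxic_terms (term) VALUES (?)", [(t,) for t in terms]
        )
        # A message is flagged when it contains any listed term as a substring.
        query = """
            SELECT c.id,
                   EXISTS (
                       SELECT 1 FROM toxic_terms t
                       WHERE LOWER(c.message) LIKE '%' || LOWER(t.term) || '%'
                   ) AS flagged
            FROM chat c
            ORDER BY c.id
        """
        return {msg_id: bool(flagged) for msg_id, flagged in conn.execute(query)}
    finally:
        conn.close()


if __name__ == "__main__":
    for msg_id, flagged in classify_messages(SAMPLE_ROWS).items():
        print(msg_id, "flagged" if flagged else "clean")
```

Substring matching of this kind is cheap and transparent, but it also flags benign uses of a listed term (for example, a player asking how to uninstall a mod), so a comparison against manually classified data, as the abstract describes, remains necessary.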
