CHAMP: Characterizing Undesired App Behaviors from User Comments based on Market Policies
Yangyu Hu, Haoyu Wang, Tiantong Ji, Xusheng Xiao, Xiapu Luo, Peng Gao, and Yao Guo

TL;DR
This paper introduces CHAMP, a text mining approach that analyzes user comments to detect and characterize app policy violations across large-scale app markets, revealing widespread issues despite vetting efforts.
Contribution
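The paper presents the first large-scale study of the correlations between user comments and market policies. It is built on CHAMP, which uses text mining and NLP techniques to classify user comments into undesired behaviors and link them to app policy violations.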
Findings
Achieves over 0.9 precision and recall in classifying undesired comments.
Identifies widespread policy violations across multiple app markets.
Curates a dataset of over 3 million comments from various Android app markets.
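To make the precision and recall figures concrete, here is a minimal sketch of how a rule-based comment classifier could be scored. The rule patterns, behavior labels, and sample comments are hypothetical illustrations, not CHAMP's actual semantic rules or data.

```python
import re

# Hypothetical behavior labels and patterns, for illustration only;
# these are NOT the semantic rules extracted by CHAMP.
RULES = {
    "ads_disruption": [r"\btoo many ads\b", r"\bpop-?up ads?\b"],
    "privacy_leak": [r"\bsteals? (my )?(data|contacts)\b"],
}

def classify(comment):
    """Return the set of behavior labels whose patterns match the comment."""
    text = comment.lower()
    return {
        label
        for label, patterns in RULES.items()
        if any(re.search(p, text) for p in patterns)
    }

def precision_recall(predicted, actual):
    """Compute precision and recall for parallel lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up comments with hand-assigned "undesired behavior" labels.
comments = [
    ("Too many ads, a popup ad every minute", True),
    ("This app steals my contacts!", True),
    ("Great game, love it", False),
    ("Crashes and drains battery", True),  # no rule covers this one
]

predicted = [bool(classify(text)) for text, _ in comments]
actual = [label for _, label in comments]
precision, recall = precision_recall(predicted, actual)
```

In this toy sample every flagged comment is truly undesired, so precision is perfect, while the uncovered crash complaint lowers recall. This is the trade-off that the reported figures of over 0.9 on both metrics summarize.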
Abstract
Millions of mobile apps are available through various app markets. Although most app markets enforce a number of automated or even manual mechanisms to vet each app before it is released, thousands of low-quality apps still exist in different markets, some of which violate explicitly specified market policies. To identify these violations accurately and promptly, we turn to user comments, which give app market maintainers immediate feedback, and use them to identify undesired behaviors that violate market policies, including security-related user concerns. Specifically, we present the first large-scale study to detect and characterize the correlations between user comments and market policies. First, we propose CHAMP, an approach that adopts text mining and natural language processing (NLP) techniques to extract semantic rules through a semi-automated…
Taxonomy
Topics: Advanced Malware Detection Techniques · Software Engineering Research · Spam and Phishing Detection
