TL;DR
This paper presents SkillVet, a machine learning-based tool for analyzing the privacy practices of over 199,000 Amazon Alexa skills, revealing significant privacy risks and developer behaviors that bypass permission systems.
Contribution
The paper introduces SkillVet, a novel scalable methodology combining NLP and machine learning to identify privacy issues and broken permissions in the Alexa skill ecosystem.
Findings
Around 43% of skills that request permissions follow bad privacy practices
Around 50% of developers that request these permissions follow bad privacy practices
13% of issues were resolved after disclosure
Abstract
Third-party software, or skills, are essential components in Smart Personal Assistants (SPA). The number of skills has grown rapidly, dominated by a changing environment that has no clear business model. Skills can access personal information and this may pose a risk to users. However, there is little information about how this ecosystem works, let alone the tools that can facilitate its study. In this paper, we present the largest systematic measurement of the Amazon Alexa skill ecosystem to date. We study developers' practices in this ecosystem, including how they collect and justify the need for sensitive information, by designing a methodology to identify over-privileged skills with broken privacy policies. We collect 199,295 Alexa skills and uncover that around 43% of the skills (and 50% of the developers) that request these permissions follow bad privacy practices, including…
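To make the idea of a "broken privacy policy" concrete, here is a minimal sketch of a traceability check: given the permissions a skill requests and the text of its privacy policy, it flags permissions the policy never mentions. This is a simple keyword heuristic for illustration only, not SkillVet's NLP and machine learning pipeline. The permission labels, the keyword lists, and the function name `undisclosed_permissions` are all assumptions.

```python
import re

# Hypothetical mapping from a permission label to keywords that a privacy
# policy would plausibly use when disclosing that data type. These labels and
# keywords are illustrative assumptions, not taken from the paper.
PERMISSION_KEYWORDS = {
    "device_address": ["address", "postal code", "zip code"],
    "email": ["email", "e-mail"],
    "phone_number": ["phone", "telephone"],
    "location": ["location", "geolocation"],
}


def undisclosed_permissions(requested, policy_text):
    """Return the requested permissions that the policy text never mentions."""
    text = policy_text.lower()
    missing = []
    for perm in requested:
        keywords = PERMISSION_KEYWORDS.get(perm, [])
        # A permission counts as disclosed if any keyword appears as a whole word or phrase.
        mentioned = any(
            re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in keywords
        )
        if not mentioned:
            missing.append(perm)
    return missing


policy = "We collect your email address to send receipts."
print(undisclosed_permissions(["email", "location"], policy))
```

A keyword match like this is crude: "address" also matches "email address", so a policy can appear to disclose a device address when it does not. A learned classifier over policy sentences, as the paper's description suggests, would be the natural way to handle such cases.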