Building cross-language corpora for human understanding of privacy policies

Francesco Ciclosi, Silvia Vidor, and Fabio Massacci

arXiv:2302.05355·cs.CR·February 13, 2023·1 cites

Open Access

TL;DR

This paper presents a methodology for creating comparable cross-language privacy policy corpora to improve understanding across different languages, demonstrated through an English-Italian comparison.

Contribution

It introduces a novel methodology for building cross-language privacy policy corpora and applies it to English and Italian, facilitating multilingual user understanding studies.

Findings

01 Extended privacy policy corpus for English and Italian

02 Identified challenges in replicating privacy understanding studies across languages

03 Provided a framework for cross-language corpus construction

Abstract

Making sure that users understand the privacy policies that affect them is a key challenge for a real GDPR deployment. Research studies are mostly carried out in English, but in Europe and elsewhere, users speak a language that is not English. Replicating studies in different languages requires the availability of comparable cross-language privacy policy corpora. This work provides a methodology for building comparable cross-language corpora in a national language and a reference study language. We provide an application example of our methodology comparing English and Italian, extending the corpus of one of the first studies about users' understanding of technical terms in privacy policies. We also investigate other open issues that can make replication harder.

Taxonomy

Topics: Privacy, Security, and Data Protection · Ethics and Social Impacts of AI · Privacy-Preserving Technologies in Data