Minimizing Risk Through Minimizing Model-Data Interaction: A Protocol For Relying on Proxy Tasks When Designing Child Sexual Abuse Imagery Detection Models
Thamiris Coelho, Leo S. F. Ribeiro, João Macedo, Jefersson A. dos Santos, Sandra Avila

TL;DR
This paper introduces a protocol for designing CSAI detection models using proxy tasks to minimize direct data interaction, enhancing privacy and security.
Contribution
It formalizes the concept of proxy tasks for CSAI detection and demonstrates their effective use in a real-world application without training on sensitive data.
Findings
Achieved promising results on a real-world CSAI dataset.
First application of proxy tasks in CSAI detection.
Demonstrated that models can perform well without direct access to sensitive data.
Abstract
The distribution of child sexual abuse imagery (CSAI) is an ever-growing concern of our modern world; children who suffered from this heinous crime are revictimized, and the growing amount of illegal imagery distributed overwhelms law enforcement agents (LEAs) with the manual labor of categorization. To ease this burden, researchers have explored methods for automating data triage and detection of CSAI, but the sensitive nature of the data imposes restricted access and minimal interaction between real data and learning algorithms, so that leaks are avoided at all costs. In observing how these restrictions have shaped the literature, we formalize a definition of "Proxy Tasks", i.e., the substitute tasks used for training models for CSAI without making use of CSA data. Under this new terminology, we review current literature and present a protocol for making conscious use of Proxy Tasks together with…
