Mayfly: Private Aggregate Insights from Ephemeral Streams of On-Device User Data
Christopher Bian, Albert Cheu, Stanislav Chiknavaryan, Zoe Gong, Marco Gruteser, Oliver Guinan, Yannis Guzman, Peter Kairouz, Artem Lagzdin, Ryan McKenna, Grace Ni, Edo Roth, Maya Spivak, Timon Van Overveldt, Ren Yi

TL;DR
Mayfly is a federated analytics system that enables privacy-preserving aggregate insights from ephemeral on-device data streams, using differential privacy and in-memory cross-device aggregation, demonstrated on a large-scale transportation emissions use case.
Contribution
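Introduces Mayfly, a federated analytics approach for ephemeral on-device data streams that combines on-device windowing and contribution bounding, streaming differential privacy (including a new mechanism for Group-By-Sum workloads), and immediate in-memory cross-device aggregation on the server.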
Findings
Computed over 4 million statistics across more than 500 million devices.
Provided a per-device, per-week differential privacy guarantee of ε=2.
Estimated transportation carbon emissions from private location data while meeting strict data utility requirements.
Abstract
This paper introduces Mayfly, a federated analytics approach enabling aggregate queries over ephemeral on-device data streams without central persistence of sensitive user data. Mayfly minimizes data via on-device windowing and contribution bounding through SQL-programmability, anonymizes user data via streaming differential privacy (DP), and mandates immediate in-memory cross-device aggregation on the server -- ensuring only privatized aggregates are revealed to data analysts. Deployed for a sustainability use case estimating transportation carbon emissions from private location data, Mayfly computed over 4 million statistics across more than 500 million devices with a per-device, per-week DP guarantee of ε=2 while meeting strict data utility requirements. To achieve this, we designed a new DP mechanism for Group-By-Sum workloads leveraging statistical properties of location data,…
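To make the pipeline in the abstract concrete, the sketch below shows contribution bounding followed by a noisy Group-By-Sum. It uses the textbook Laplace mechanism as a stand-in; it is not the new mechanism the paper designs, and it does not model streaming or SQL. All names (`bound_contribution`, `private_group_by_sum`, `max_groups`, `max_value`), the group labels, and the numbers are hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict


def bound_contribution(rows, groups, max_groups, max_value):
    """Bound one device's report: keep at most `max_groups` public groups,
    each with a per-group total clipped to [0, max_value]."""
    totals = defaultdict(float)
    for key, value in rows:
        if key in groups:  # drop keys outside the public group list
            totals[key] += value
    kept = sorted(totals)[:max_groups]  # deterministic choice of groups to keep
    return {key: min(max(totals[key], 0.0), max_value) for key in kept}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_group_by_sum(reports, groups, epsilon, max_groups, max_value, rng):
    """Sum bounded reports per group, then add Laplace noise to every group.

    Noise is added to every public group, including empty ones, so the set
    of released keys does not depend on the data.
    """
    sums = {key: 0.0 for key in groups}
    for rows in reports:
        bounded = bound_contribution(rows, groups, max_groups, max_value)
        for key, value in bounded.items():
            sums[key] += value
    # Adding or removing one device changes the sums by at most this much in L1.
    scale = (max_groups * max_value) / epsilon
    return {key: total + laplace_noise(scale, rng) for key, total in sums.items()}


# Hypothetical reports from two devices: (group, value) pairs.
reports = [
    [("car", 12.0), ("bus", 3.0)],
    [("car", 40.0), ("rail", 8.0), ("bus", 1.0)],
]
noisy = private_group_by_sum(
    reports, groups=["bus", "car", "rail"], epsilon=2.0,
    max_groups=2, max_value=20.0, rng=random.Random(0),
)
```

The value `epsilon=2.0` mirrors the reported budget only as an example. How Mayfly spends that budget across the queries in a week is not described in this summary. In this sketch the bounding step would run on the device and only the final noisy sums would leave the aggregator's memory.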
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Internet Traffic Analysis and Secure E-voting
