Enhancing Data Provenance and Model Transparency in Federated Learning Systems -- A Database Approach
Michael Gu, Ramasoumya Naraparaju, Dongfang Zhao

TL;DR
This paper introduces a novel database-based approach to improve data provenance and model transparency in federated learning, enhancing accountability and trustworthiness without significant computational overhead.
Contribution
It presents an innovative method combining cryptographic techniques and model management to track data transformations and improve transparency in federated learning systems.
Findings
Enhanced data transparency through cryptographic hashes and model snapshots
Efficient data provenance tracking with minimal computational impact
Improved trust and accountability in privacy-sensitive FL applications
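To make the idea of cryptographic hashes over model snapshots concrete, here is a minimal sketch of a hash-chained provenance log kept in a database. It is an illustration under stated assumptions, not the authors' implementation: the table layout, the use of SHA-256 and SQLite, and the names `open_log`, `append_snapshot`, `verify_log`, and `snapshot_digest` are all hypothetical and do not come from the paper.

```python
import hashlib
import json
import sqlite3

# Hypothetical starting value for the hash chain (not from the paper).
GENESIS = "0" * 64


def snapshot_digest(prev_digest, client_id, weights):
    """SHA-256 over the previous digest plus a canonical JSON encoding of the snapshot."""
    payload = json.dumps({"client": client_id, "weights": weights}, sort_keys=True)
    return hashlib.sha256((prev_digest + payload).encode("utf-8")).hexdigest()


def open_log():
    """Create an in-memory SQLite table holding one row per model snapshot."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE provenance ("
        " round INTEGER PRIMARY KEY,"
        " client_id TEXT NOT NULL,"
        " weights_json TEXT NOT NULL,"
        " digest TEXT NOT NULL)"
    )
    return conn


def append_snapshot(conn, client_id, weights):
    """Store a snapshot and chain its digest to the previous row's digest."""
    row = conn.execute(
        "SELECT round, digest FROM provenance ORDER BY round DESC LIMIT 1"
    ).fetchone()
    next_round, prev = (row[0] + 1, row[1]) if row else (0, GENESIS)
    digest = snapshot_digest(prev, client_id, weights)
    conn.execute(
        "INSERT INTO provenance VALUES (?, ?, ?, ?)",
        (next_round, client_id, json.dumps(weights, sort_keys=True), digest),
    )
    return digest


def verify_log(conn):
    """Recompute every digest in order; return False if any stored row was altered."""
    prev = GENESIS
    rows = conn.execute(
        "SELECT client_id, weights_json, digest FROM provenance ORDER BY round"
    )
    for client_id, weights_json, digest in rows:
        if snapshot_digest(prev, client_id, json.loads(weights_json)) != digest:
            return False
        prev = digest
    return True
```

Because each digest covers the previous one, editing any stored snapshot invalidates every later digest, so an auditor can detect tampering by re-running `verify_log` without retraining anything. A real deployment would presumably store compact weight encodings or references rather than JSON text.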
Abstract
Federated Learning (FL) presents a promising paradigm for training machine learning models across decentralized edge devices while preserving data privacy. Ensuring the integrity and traceability of data across these distributed environments, however, remains a critical challenge. The ability to create transparent artificial intelligence, such as detailing the training process of a machine learning model, has become an increasingly prominent concern due to the large number of sensitive (hyper)parameters it utilizes; thus, it is imperative to strike a reasonable balance between openness and the need to protect sensitive information. In this paper, we propose one of the first approaches to enhance data provenance and model transparency in federated learning systems. Our methodology leverages a combination of cryptographic techniques and efficient model management to track the…
Taxonomy
Topics: Cloud Data Security Solutions · Data Quality and Management · Privacy-Preserving Technologies in Data
