Empowering Data Mesh with Federated Learning
Haoyuan Li, Salman Toor

TL;DR
This paper introduces a novel approach combining federated learning with Data Mesh architecture to enable privacy-preserving, decentralized data analysis across multiple domains, addressing limitations of traditional centralized machine learning.
Contribution
It is the first open-source work integrating federated learning into Data Mesh, advancing privacy-preserving decentralized data analysis methods.
Findings
First open-source implementation of federated learning in Data Mesh
Demonstrates enhanced privacy and decentralization in multi-domain data analysis
Paves the way for secure, scalable data analysis in organizations
Abstract
The evolution of data architecture has seen the rise of data lakes, aiming to solve the bottlenecks of data management and promote intelligent decision-making. However, this centralized architecture is limited by the proliferation of data sources and the growing demand for timely analysis and processing. A new data paradigm, Data Mesh, is proposed to overcome these challenges. Data Mesh treats domains as a first-class concern by distributing data ownership from the central team to each data domain, while keeping federated governance to monitor domains and their data products. Many multi-million dollar organizations like PayPal, Netflix, and Zalando have already transformed their data analysis pipelines based on this new architecture. In this decentralized architecture where data is locally preserved by each domain team, traditional centralized machine learning is incapable of…
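The idea described above, where each domain keeps its data local and only model parameters leave the domain, can be illustrated with a minimal federated averaging sketch. This is not the authors' implementation: the choice of federated averaging, the toy linear model, the synthetic data, and the domain names (`payments`, `logistics`, `marketing`) are all assumptions made for illustration.

```python
import random


def local_update(w, b, data, lr=0.05, epochs=5):
    """Local training inside one domain: gradient descent on y = w*x + b.

    Only the resulting (w, b) leaves the domain; `data` never does.
    """
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates, sizes):
    """Aggregate local models, weighting each by its domain's sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


def make_domain_data(n, seed):
    """Synthetic data for a hypothetical domain: y = 3x + 1 plus small noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.05)) for x in xs]


def train(domains, rounds=100):
    """Run federated rounds: broadcast the global model, train locally, average."""
    w, b = 0.0, 0.0
    sizes = [len(data) for data in domains.values()]
    for _ in range(rounds):
        updates = [local_update(w, b, data) for data in domains.values()]
        w, b = federated_average(updates, sizes)
    return w, b


# Hypothetical domains, each owning its own data product.
domains = {
    "payments": make_domain_data(40, seed=1),
    "logistics": make_domain_data(60, seed=2),
    "marketing": make_domain_data(100, seed=3),
}

w, b = train(domains)
print(f"global model: w={w:.2f}, b={b:.2f}")
```

In this sketch the aggregator sees only each domain's parameters and sample count, never the raw records. A real deployment would add the governance, communication, and security layers that the toy loop leaves out.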
Taxonomy
Topics: Privacy-Preserving Technologies in Data
