Federated Binary Matrix Factorization using Proximal Optimization

Sebastian Dalleiger; Jilles Vreeken; Michael Kamp

arXiv:2407.01776·cs.LG·July 3, 2024

TL;DR

This paper introduces a federated learning approach for binary matrix factorization that preserves data privacy, using proximal optimization and differential privacy guarantees, and demonstrates superior performance over existing methods.

Contribution

It develops a novel federated binary matrix factorization method with proximal optimization, ensuring privacy and convergence, and shows improved results on real and synthetic data.

Findings

01

Outperforms state-of-the-art federated BMF methods in quality and efficacy.

02

Provides convergence guarantees for the proposed algorithm.

03

Ensures differential privacy in federated binary matrix factorization.

Abstract

Identifying informative components in binary data is an essential task in many research areas, including life sciences, social sciences, and recommendation systems. Boolean matrix factorization (BMF) is a family of methods that performs this task by efficiently factorizing the data. In real-world settings, the data is often distributed across stakeholders and required to stay private, prohibiting the straightforward application of BMF. To adapt BMF to this context, we approach the problem from a federated-learning perspective, while building on a state-of-the-art continuous binary matrix factorization relaxation to BMF that enables efficient gradient-based optimization. We propose to only share the relaxed component matrices, which are aggregated centrally using a proximal operator that regularizes for binary outcomes. We show the convergence of our federated proximal gradient descent…
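The central aggregation step described in the abstract (average the relaxed component matrices shared by the clients, then apply a proximal operator that pushes entries toward binary values) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `lam` parameter, and the specific operator (soft-thresholding each entry toward the nearer of 0 and 1) are assumptions chosen for simplicity, and the operator used in the paper may differ.

```python
# Illustrative sketch only: names, parameters, and the proximal operator
# below are assumptions, not taken from the paper.

def prox_toward_binary(x, lam):
    """Move x toward the nearer of 0 and 1 by at most lam.

    This is the proximal step for the penalty lam * min(|x|, |x - 1|),
    used here as a simple stand-in for a binary-inducing regularizer.
    """
    target = 0.0 if x <= 0.5 else 1.0
    d = x - target
    if abs(d) <= lam:
        return target  # close enough: snap exactly onto 0 or 1
    # otherwise shrink the distance to the target by lam
    return target + (d - lam if d > 0 else d + lam)


def aggregate(client_matrices, lam):
    """Average the clients' relaxed matrices entrywise, then apply the prox.

    client_matrices: list of equally shaped matrices (lists of lists of floats),
    one per client.
    """
    n = len(client_matrices)
    rows = len(client_matrices[0])
    cols = len(client_matrices[0][0])
    aggregated = []
    for i in range(rows):
        row = []
        for j in range(cols):
            mean = sum(m[i][j] for m in client_matrices) / n
            row.append(prox_toward_binary(mean, lam))
        aggregated.append(row)
    return aggregated


if __name__ == "__main__":
    # Two hypothetical clients, each holding a 1 x 2 relaxed matrix.
    clients = [[[1.0, 0.0]], [[0.5, 0.25]]]
    print(aggregate(clients, lam=0.125))
```

In this sketch, only the relaxed matrices leave the clients; the raw binary data never reaches the server, which matches the sharing scheme the abstract proposes. A larger `lam` snaps more entries exactly to 0 or 1 in each round.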

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

- The authors bring a novel algorithm designed for binary matrix factorization. Previous federated factorization algorithms were not designed for binary/Boolean matrices.
- The paper is well written and easy to follow.

Weaknesses

- The contribution is limited to binary matrices unless a further performance analysis on non-binary matrices is provided. (Most of the selected datasets are not Boolean matrices by default but were converted to Boolean.)
- The potential disadvantages of having a proximal operator are not discussed.
- The compared methods could be discussed further. For instance, in Figure 6 the loss values for most of the models seem to be stuck at 1. What is the reason for that? What are the pa

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

Presentation and clarity: The paper is overall well-motivated, the ideas are simple and easy to follow, and adequate background is given about binary matrix factorization, federated learning and differential privacy. The method seems like a reasonable extension of existing methods to the federated setting.

Weaknesses

The contribution of the paper seems limited. This is a straightforward extension of the method of Dalleiger et al. to the federated setting. Each client applies the same method, then the $V_i$ updates are aggregated on the server. The main algorithmic novelty seems to be how aggregation is done, and the idea was to apply the proximal operator (the *same one used by Dalleiger et al.*) to the mean of the updates. The theoretical result (convergence of the iterates in Proposition 4.1) has sever

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

The algorithm is the first of its kind in the field of federated BMF. It establishes a new benchmark for future studies in this area. Additionally, the algorithm works well empirically, which could be useful in many real-world applications. The experimental results are comprehensive and clearly stated in this paper.

Weaknesses

The technical novelty of this paper is limited. The idea of using proximal gradient descent has been previously used by Dalleiger and Vreeken (2022). This paper merely transforms the previous algorithm into a federated version. The non-trivial part is the aggregation procedure. However, it is a heuristic and also increases the time complexity for the server. The paper does not theoretically show the convergence rate of the algorithm. Furthermore, the analysis relies on a strong assumption of the

Taxonomy

Topics: graph theory and CDMA systems

Methods: Sparse Evolutionary Training