Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining

Xidong Wu; Zhengmian Hu; Jian Pei; Heng Huang

arXiv:2308.03035 · cs.LG · August 8, 2023



TL;DR

This paper introduces what the authors present as the first multi-party federated learning algorithms that directly maximize AUPRC on imbalanced data, with the stated aims of lower communication cost and faster convergence.

Contribution

The paper proposes the SLATE and SLATE-M algorithms for serverless multi-party AUPRC maximization, a problem not previously studied in federated learning.

Findings

01. First multi-party AUPRC maximization algorithm.

02. SLATE-M achieves convergence rates matching single-machine methods.

03. Reduces communication costs in federated learning.

Abstract

Multi-party collaborative training, such as distributed learning and federated learning, is used to address big data challenges. However, traditional multi-party collaborative training algorithms were designed mainly for balanced data mining tasks and are intended to optimize accuracy (e.g., cross-entropy). The data distribution in many real-world applications is skewed, and classifiers trained to improve accuracy perform poorly on imbalanced data tasks because the models can be significantly biased toward the primary class. Therefore, the Area Under the Precision-Recall Curve (AUPRC) was introduced as an effective metric. Although single-machine AUPRC maximization methods have been designed, multi-party collaborative algorithms have never been studied. The change from the single-machine to the multi-party setting poses critical challenges. To address the…
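The abstract's point that accuracy can hide poor performance on skewed data can be illustrated with a small sketch. It assumes average precision as the finite-sample estimate of AUPRC (a common choice, not necessarily the surrogate the paper optimizes), and the labels, scores, and function names are hypothetical toy values made up for illustration. It is not an implementation of SLATE or SLATE-M.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def average_precision(labels, scores):
    """Non-interpolated average precision: the mean of precision@k
    taken at the rank k of each positive example."""
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical imbalanced toy set: 2 positives among 20 examples.
labels = [1, 1] + [0] * 18

# Model A (assumed): scores everything below 0.5, so it predicts all negatives
# and ranks six negatives above both positives.
biased_scores = [0.10, 0.05] + [0.30] * 6 + [0.02] * 12

# Model B (assumed): ranks the positives near the top, with one false positive.
better_scores = [0.90, 0.40] + [0.60] + [0.10] * 17

for name, scores in [("A", biased_scores), ("B", better_scores)]:
    print(f"model {name}: accuracy={accuracy(labels, scores):.3f} "
          f"AP={average_precision(labels, scores):.3f}")
# model A: accuracy=0.900 AP=0.196
# model B: accuracy=0.900 AP=0.833
```

Both toy models reach the same accuracy, yet their average precision differs by a wide margin, which is the kind of gap that motivates optimizing AUPRC directly on imbalanced tasks.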

Code & Models

Repositories

xidongwu/d-auprc (PyTorch, official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Face and Expression Recognition · Imbalanced Data Classification Techniques