Fair Diversity Maximization with Few Representatives

Florian Adriaens; Nikolaj Tatti

arXiv:2506.08110·cs.DS·June 11, 2025


TL;DR

This paper introduces a randomized algorithm for fair diversity maximization that improves approximation ratios, especially when selecting few representatives per group, and demonstrates its effectiveness on large datasets.

Contribution

The paper presents a novel randomized algorithm with improved approximation guarantees for fair diversity maximization with few representatives, using padded decompositions and clustering techniques.

Findings

01

Improved approximation ratio of $\log(m) / (3m)$ for the problem, where $m$ is the number of labels.

02

Algorithm effectively handles large datasets with fair representation constraints.

03

Experimental results confirm the algorithm's practical efficiency.

Abstract

The diversity maximization problem is a well-studied problem where the goal is to find $k$ diverse items. Fair diversity maximization aims to select a diverse subset of $k$ items from a large dataset, while requiring that each group of items be well represented in the output. More formally, given a set of items with labels, our goal is to find $k$ items that maximize the minimum pairwise distance in the set, while maintaining that each label is represented within some budget. In many cases, one is only interested in selecting a handful (say, a constant number) of items from each group. In such a scenario we show that a randomized algorithm based on padded decompositions improves the state-of-the-art approximation ratio to $\log(m) / (3m)$, where $m$ is the number of labels. The algorithm works in several stages: ($i$) a preprocessing pruning which ensures that points with the same label…
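To make the objective in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm: it enumerates every size-$k$ subset, so it is only usable on tiny inputs. The function names, the Euclidean metric, and the representation of each label's budget as a `(low, high)` pair are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import dist, inf


def min_pairwise_distance(points):
    """Diversity score: smallest Euclidean distance between any two points."""
    return min((dist(p, q) for p, q in combinations(points, 2)), default=inf)


def is_fair(labels, budgets):
    """Assumed fairness rule: each label's count lies within its (low, high) budget.

    Labels that do not appear in `budgets` are left unconstrained.
    """
    counts = Counter(labels)
    return all(low <= counts.get(label, 0) <= high
               for label, (low, high) in budgets.items())


def brute_force_fair_diversity(items, k, budgets):
    """Exhaustive search over all size-k subsets (exponential time).

    `items` is a list of (point, label) pairs. Returns (best_score, best_subset),
    or (-inf, None) if no subset satisfies the budgets.
    """
    best_score, best_subset = -inf, None
    for subset in combinations(items, k):
        if not is_fair([label for _, label in subset], budgets):
            continue
        score = min_pairwise_distance([point for point, _ in subset])
        if score > best_score:
            best_score, best_subset = score, subset
    return best_score, best_subset


# Hypothetical toy data: three points labeled "a", two labeled "b".
items = [((0, 0), "a"), ((1, 0), "a"), ((5, 5), "a"),
         ((10, 0), "b"), ((11, 0), "b")]
# Select k = 2 items with exactly one representative per label.
score, subset = brute_force_fair_diversity(items, 2, {"a": (1, 1), "b": (1, 1)})
```

An approximation ratio such as the one stated above means the returned set's minimum pairwise distance is at least that fraction of the optimum that this exhaustive search would find.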
