# Implicit Diversity in Image Summarization

**Authors:** L. Elisa Celis, Vijay Keswani

arXiv: 1901.10265 · 2020-08-18

## TL;DR

This paper introduces a method for implicitly diversifying image search results: it takes a visibly diverse control set of images as input and selects results that emulate the diversity of that set, without requiring attribute labels at any point.

## Contribution

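The authors propose a label-free approach that uses a visibly diverse control set to select images of people in response to a query, so that the selected set emulates the diversity of the control set. Because no labels are needed, the approach avoids the additional biases that automated label inference could incur. It comes in two variants: a modification of the MMR algorithm that incorporates diversity scores, and a more efficient variant that does not consider within-list redundancy.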

## Key findings

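- Significantly improves the visible diversity of results compared to current Google search and other diverse image summarization algorithms.
- Achieves this at a minimal cost to accuracy.
- Evaluated on two datasets: a new dataset of top Google image results for 96 occupations (gender and skin-tone diversity with respect to occupations) and CelebA (gender diversity with respect to facial features).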

## Abstract

Studies have shown that the people depicted in image search results tend to be of majority groups with respect to socially salient attributes. This skew goes beyond that which already exists in the world; e.g., Kay et al. showed that although 28% of CEOs in the US are women, only 10% of the top 100 results for CEO in Google Image Search are women. Most existing approaches to correct for this kind of bias assume that the images of people include socially salient attribute labels. However, such labels are often unknown. Further, using automated techniques to infer these labels may often not be possible within acceptable accuracy ranges, and may not be desirable due to the additional biases this process could incur. We develop a novel approach that takes as input a visibly diverse control set of images and uses this set to select a set of images of people in response to a query. The goal is to have a resulting set that is more visibly diverse in a manner that emulates the diversity depicted in the control set. Importantly, this approach does not require images to be labelled at any point; effectively, it gives a way to implicitly diversify the set of images selected. We provide two variants of our approach: the first is a modification of the MMR algorithm to incorporate the diversity scores, and the second is a more efficient variant that does not consider within-list redundancy. We evaluate these approaches empirically on two datasets: 1) a new dataset containing top Google image results for 96 occupations, for which we evaluate gender and skin-tone diversity with respect to occupations, and 2) the CelebA dataset, for which we evaluate gender diversity with respect to facial features. Our approaches produce image sets that significantly improve the visible diversity of the results, compared to current Google search and other diverse image summarization algorithms, at a minimal cost to accuracy.
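
The abstract does not give the scoring functions, so the following is only a rough sketch of how a control set could steer selection without labels, not the authors' algorithm. It assumes each image is represented by a feature vector, uses cosine similarity, and cycles through the control images so that each one is emulated in turn. The function name `select_images` and the parameters `alpha` and `beta` are hypothetical. Setting `beta > 0` adds an MMR-style penalty for similarity to already selected images, while `beta = 0` corresponds loosely to a variant that ignores within-list redundancy.

```python
import numpy as np


def cosine_sim(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def select_images(relevance, feats, control_feats, k, alpha=0.5, beta=0.0):
    """Greedily pick k candidate indices (illustrative sketch only).

    relevance:     (n,) query relevance score per candidate image
    feats:         (n, d) candidate feature vectors (assumed representation)
    control_feats: (m, d) feature vectors of the visibly diverse control set
    alpha:         hypothetical weight trading relevance against control similarity
    beta:          hypothetical weight of the MMR-style redundancy penalty
    """
    relevance = np.asarray(relevance, dtype=float)
    feats = np.asarray(feats, dtype=float)
    control_feats = np.asarray(control_feats, dtype=float)

    sim_to_control = cosine_sim(feats, control_feats)  # shape (n, m)
    sim_within = cosine_sim(feats, feats)              # shape (n, n)

    selected = []
    for step in range(min(k, len(relevance))):
        c = step % control_feats.shape[0]  # control image emulated at this step
        score = alpha * relevance + (1.0 - alpha) * sim_to_control[:, c]
        if selected:
            if beta > 0:
                # Penalize candidates close to something already chosen.
                score = score - beta * sim_within[:, selected].max(axis=1)
            score[selected] = -np.inf  # never pick the same image twice
        selected.append(int(np.argmax(score)))
    return selected


# Toy data: two visually distinct clusters; the first is ranked as more relevant.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
relevance = [0.9, 0.8, 0.5, 0.4]
control = [[1.0, 0.0], [0.0, 1.0]]  # one control image per cluster

print(select_images(relevance, feats, control, k=2, alpha=1.0))  # relevance only
print(select_images(relevance, feats, control, k=2, alpha=0.5))  # control-guided
```

On this toy data, ranking by relevance alone returns two images from the same cluster, whereas the control-guided run returns one image from each cluster, which mirrors the intent of emulating the control set's diversity without ever assigning a label.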

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.10265/full.md

## Figures

65 figures with captions in the complete paper: https://tomesphere.com/paper/1901.10265/full.md

## References

97 references — full list in the complete paper: https://tomesphere.com/paper/1901.10265/full.md

---
Source: https://tomesphere.com/paper/1901.10265