Measuring Societal Biases from Text Corpora with Smoothed First-Order Co-occurrence

Navid Rekabsaz; Robert West; James Henderson; Allan Hanbury

arXiv:1812.10424·cs.CL·April 28, 2021

TL;DR

This paper introduces a novel bias measurement method based on smoothed first-order co-occurrence relations, which better correlates with real-world gender bias statistics in occupational words than traditional embedding similarity methods.

Contribution

The study proposes an alternative bias measurement approach using first-order co-occurrence, improving correlation with actual societal biases over existing similarity-based methods.

Findings

01

First-order co-occurrence approach shows higher correlation with real-world gender bias statistics.

02

The method reveals more severe female bias in specific occupations.

03

Compared to traditional methods, the new approach reduces irrelevant concept influence.

Tables

Table 1: Spearman $\rho$ and Pearson’s $r$ correlation results of the gender bias values, calculated with word representations, to the statistics of the portion of women in occupations.

Order	Representation	Method	Labor Data: Spearman $\rho$	Labor Data: Pearson’s $r$	Census Data: Spearman $\rho$	Census Data: Pearson’s $r$
High-Order	PMI	Directional	0.28	0.07	0.18	0.02
High-Order	PMI	Centroid	0.14	0.21	0.35	0.40
High-Order	PMI	$\textsc{Average}_{\textsc{High}}$	0.33	0.24	0.27	0.19
High-Order	PMI-SVD	Directional	0.05	0.07	0.00	0.00
High-Order	PMI-SVD	Centroid	0.41	0.47	0.46	0.53
High-Order	PMI-SVD	$\textsc{Average}_{\textsc{High}}$	0.41	0.49	0.49	0.56
First-Order	PMI	$\textsc{Average}_{\textsc{First}}$	0.53	0.51	0.57	0.62
High-Order	PPMI	Directional	0.45	0.49	0.39	0.47
High-Order	PPMI	Centroid	0.43	0.46	0.45	0.50
High-Order	PPMI	$\textsc{Average}_{\textsc{High}}$	0.43	0.46	0.45	0.52
High-Order	PPMI-SVD	Directional	0.05	0.07	0.00	0.00
High-Order	PPMI-SVD	Centroid	0.41	0.47	0.46	0.53
High-Order	PPMI-SVD	$\textsc{Average}_{\textsc{High}}$	0.41	0.49	0.49	0.56
First-Order	PPMI	$\textsc{Average}_{\textsc{First}}$	0.59	0.58	0.64	0.64
High-Order	SPPMI	Directional	0.26	0.37	0.26	0.28
High-Order	SPPMI	Centroid	0.39	0.45	0.45	0.48
High-Order	SPPMI	$\textsc{Average}_{\textsc{High}}$	0.32	0.40	0.44	0.48
High-Order	SPPMI-SVD	Directional	0.17	0.29	0.11	0.03
High-Order	SPPMI-SVD	Centroid	0.28	0.35	0.39	0.43
High-Order	SPPMI-SVD	$\textsc{Average}_{\textsc{High}}$	0.26	0.38	0.36	0.46
First-Order	SPPMI	$\textsc{Average}_{\textsc{First}}$	0.57	0.49	0.52	0.48
High-Order	GloVe	Directional	0.53	0.56	0.34	0.46
High-Order	GloVe	Centroid	0.58	0.60	0.39	0.51
High-Order	GloVe	$\textsc{Average}_{\textsc{High}}$	0.60	0.60	0.39	0.51
First-Order	initGlove	$\textsc{Average}_{\textsc{First}}$	0.38	0.42	0.40	0.51
First-Order	eGloVe	$\textsc{Average}_{\textsc{First}}$	0.56	0.57	0.42	0.52
High-Order	SG	Directional	0.50	0.54	0.58	0.64
High-Order	SG	Centroid	0.55	0.57	0.60	0.65
High-Order	SG	$\textsc{Average}_{\textsc{High}}$	0.55	0.57	0.59	0.65
First-Order	eSG	$\textsc{Average}_{\textsc{First}}$	0.66	0.61	0.67	0.70

Table 2: Context-words with the highest effects on the calculated gender bias with the $\textsc{Average}_{\textsc{High}}$ method. Gender-neutral context-words are shown with underlines.

manicurist: businesswoman, nurse, Filipina, seamstress, matron

midwife: midwife, nurse, feminist, matron, suffragist

nurse: midwife, nurse, matron, nursing, Filipina

socialite: businesswoman, Filipina, suffragist, feminist, hostess

housekeeper: matron, midwife, nurse, maid, governess

captain: commanded, capt, quartermaster, enlisted, Hugh

colonel: commanded, Hugh, Ernest, guards, quartermaster

mechanician: apprenticed, Cyril, Ernest, Messrs, surveyor

lieutenant: commanded, Ernest, Hugh, enlisted, quartermaster

engineer: Jagmal, surveyor, apprenticed, draughtsman, engineer

Table 3: Mean of the absolute values of bias changes in each step of the corpus augmentation experiments.

Corpus Augmentation	$\textsc{Average}_{\textsc{High}}$ with SG	$\textsc{Average}_{\textsc{First}}$ with eSG
First Step	0.037	0.044
Second Step	0.012	0.016

Equations

$${\psi}(w)=\frac{{\bm{v}}_{d}\cdot{\bm{v}}_{w}}{\lVert{\bm{v}}_{d}\rVert}$$

$${\bm{v}}_{Z}=\sum_{x\in Z}\frac{{\bm{v}}_{x}}{\left|Z\right|}$$

$${\psi}(w)=\text{cosine}({\bm{v}}_{Z},{\bm{v}}_{w})-\text{cosine}({\bm{v}}_{Z^{\prime}},{\bm{v}}_{w})$$

$$\textsc{Average}_{\textsc{High}}(w,Z)=\frac{1}{\left|Z\right|}\sum_{x\in Z}\text{cosine}({\bm{v}}_{x},{\bm{v}}_{w})$$

$${\psi}(w)=\textsc{Average}_{\textsc{High}}(w,Z)-\textsc{Average}_{\textsc{High}}(w,Z^{\prime})$$

$$e_{w:c}=\sigma({\bm{v}}_{w}{\bm{u}}_{c}^{\top}),\qquad{\bm{e}}_{w}=\sigma({\bm{v}}_{w}{\bm{U}}^{\top})\in\mathbb{R}^{\left|{\mathbb{V}}\right|}$$

$${\bm{v}}_{w}{\bm{u}}_{c}^{\top}+b_{w}+\tilde{b}_{c}\approx\log\#\langle w,c\rangle$$

$$e_{w:c}={\bm{v}}_{w}{\bm{u}}_{c}^{\top},\qquad{\bm{e}}_{w}={\bm{v}}_{w}{\bm{U}}^{\top}\in\mathbb{R}^{\left|{\mathbb{V}}\right|}$$

$$\textsc{Average}_{\textsc{First}}(w,Z)=\frac{1}{\left|Z\right|}\sum_{c\in Z}e_{w:c}$$

$${\psi}(w)=\textsc{Average}_{\textsc{First}}(w,Z)-\textsc{Average}_{\textsc{First}}(w,Z^{\prime})$$

$${\bm{e}}_{\psi}=\frac{1}{\left|Z\right|}\sum_{z\in Z}\frac{{\bm{e}}_{z}\odot{\bm{e}}_{w}}{\lVert{\bm{e}}_{z}\rVert\lVert{\bm{e}}_{w}\rVert}-\frac{1}{\left|Z^{\prime}\right|}\sum_{z^{\prime}\in Z^{\prime}}\frac{{\bm{e}}_{z^{\prime}}\odot{\bm{e}}_{w}}{\lVert{\bm{e}}_{z^{\prime}}\rVert\lVert{\bm{e}}_{w}\rVert}$$


Full text

Abstract

Text corpora are widely used resources for measuring societal biases and stereotypes. The common approach to measuring such biases using a corpus is by calculating the similarities between the embedding vector of a word (like nurse) and the vectors of the representative words of the concepts of interest (such as genders). In this study, we show that, depending on what one aims to quantify as bias, this commonly-used approach can introduce non-relevant concepts into bias measurement. We propose an alternative approach to bias measurement utilizing the smoothed first-order co-occurrence relations between the word and the representative concept words, which we derive by reconstructing the co-occurrence estimates inherent in word embedding models. We compare these approaches by conducting several experiments on the scenario of measuring gender bias of occupational words, according to an English Wikipedia corpus. Our experiments show higher correlations of the measured gender bias with the actual gender bias statistics of the U.S. job market – on two collections and with a variety of word embedding models – using the first-order approach in comparison with the vector similarity-based approaches. The first-order approach also suggests a more severe bias towards female in a few specific occupations than the other approaches.

Introduction

Text data has been widely utilized for studying and monitoring societal phenomena – such as biases and stereotypes – commonly by exploiting co-occurrence statistics of words in text. In these approaches, a societal bias construct (an unobservable abstraction that we aim to characterize) is quantified using measures of word association. A word such as nurse is considered to be stereotypically biased towards the female concept when a significant imbalance is observed between the associations of nurse to the female versus the male concept. Each of the concepts is commonly defined by a group of words, referred to as concept-representative words. The focus of the present work is on the computational methods for measuring biases from text corpora – an essential component in various social studies.

The common approach to calculating word associations for bias measurement is to adopt word embedding models trained on text corpora, as in preceding studies (Lenton, Sedikides, and Bruder 2009; Hoyle et al. 2019; Zhou et al. 2019; Chang and McKeown 2019; Zhao et al. 2019; Garg et al. 2018; Caliskan, Bryson, and Narayanan 2017; Bolukbasi et al. 2016). In these studies, the associations of words to concepts are measured based on some form of vector similarity, for instance by using the cosine metric. The present study discusses what is captured as bias by these vector similarity-based approaches, and proposes a complementary bias measurement approach based on a smoothed variant of direct (first-order) co-occurrences. Let us first have a closer look at what is measured by vector similarity, or more generally by similarity metrics applied to distributional representations.

In distributional representations, two vectors are more similar if the corresponding words both frequently co-occur with a set of context-words (second-order co-occurrence). Figure 1 elaborates this using a toy example. In the example, the association of the word nurse to the female concept, represented by the word she, is calculated using cosine similarity between their vectors. The word vectors in the example are high-dimensional representations, such that each dimension of a vector is defined explicitly with a specific context-word, and each value of the vector represents the first-order co-occurrence relation between the word and the corresponding context-word. We refer to such vectors as explicit representations. As shown, nurse and she are similar since they co-occur with several common context-words, depicted with the blue circles in both vectors.

Let us assume that in this example our objective of gender bias measurement can be formulated as “to quantify the extent to which nurse is perceived as female versus male”. Considering this objective, we observe that the nurse-to-she similarity is influenced by common context-words such as woman and girl, which are typically considered good representatives of the female concept (Garg et al. 2018; Bolukbasi et al. 2016). However, this similarity is also affected by several other context-words, such as midwife and matron. According to the literal definitions of these words (as defined in a dictionary), midwife is gender-neutral, and matron is a mixture of the female concept and the concept of “being in charge of medical arrangements”. Considering this, one can argue that such common context-words can introduce concepts that are partially or completely irrelevant to female into the measured association, and hence into the calculated gender bias.
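To make the argument above concrete, the following minimal sketch splits the dot product behind a cosine similarity into per-context-word shares. The context-words and all co-occurrence values are made up for illustration and are not taken from the paper or from Figure 1; the sketch only shows how context-words other than woman and girl can dominate the nurse-to-she similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical context-words labelling the dimensions of the explicit vectors.
contexts = ["woman", "girl", "midwife", "matron", "hospital"]

# Made-up first-order co-occurrence strengths, for illustration only.
nurse = [0.4, 0.3, 0.9, 0.8, 0.7]
she = [0.9, 0.8, 0.5, 0.6, 0.1]

# Share of the dot product contributed by each context-word.
products = [n * s for n, s in zip(nurse, she)]
total = sum(products)
shares = {c: p / total for c, p in zip(contexts, products)}

similarity = cosine(nurse, she)
for context, share in shares.items():
    print(f"{context}: {share:.3f}")
```

With these invented values, midwife and matron together account for a larger share of the similarity than woman and girl, which is the effect the paragraph describes.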

While we discuss this issue on explicit representations, it is also present in low-dimensional embeddings, although such an explicit division of dimensions into groups is not per se possible. Another difference between explicit representations and embeddings is that, since word embeddings are defined in low dimensions, the similarity of embedding vectors does not only capture second-order co-occurrences, but also other orders of co-occurrence, such as first-order as well as higher orders. (For instance, Kontostathis and Pottenger (2006) found that Latent Semantic Analysis (Deerwester et al. 1990) can take into account up to fifth-order co-occurrences.) We therefore refer to the approaches that use word embeddings for quantifying bias as high-order bias measurements.

We approach the discussed issue in high-order bias measurement by revisiting the utilization of first-order co-occurrences for measuring bias. Our proposed approach, referred to as first-order bias measurement, estimates the association of a concept and a word by averaging the first-order co-occurrences between the word and the concept-representative context-words. The first-order bias measurement introduces an alternative to the high-order approach, and has the advantage of only taking into account the context-words which are strongly related to the concept of interest.

First-order co-occurrence of words has been widely used to quantify societal phenomena, particularly through counting and weighting words (Monroe, Colaresi, and Quinn 2008; Kirchler 1992; Rekabsaz et al. 2017). Among various metrics, the ones based on Pointwise Mutual Information (PMI) (Church and Hanks 1990) are commonly used to measure word co-occurrence in local contexts. As mentioned by Lenton, Sedikides, and Bruder (2009), a drawback of such count-based co-occurrence metrics is the high sparsity of their resulting vectors, as many related words never appear in the same local context.

We address the issue of sparsity in count-based metrics by proposing two novel explicit representations, created from pre-trained word2vec Skip-Gram (Mikolov et al. 2013), and GloVe (Pennington, Socher, and Manning 2014) models. The proposed explicit variants exploit the word and context embedding vectors of word2vec and GloVe to estimate forms of the co-occurrence relations. Such co-occurrence relations, achieved from the reconstruction of explicit vectors from low-dimensional embeddings, provide smoothed variants of the count-based co-occurrence estimations.

We use the discussed word representations, trained on an English Wikipedia corpus, to study the characteristics of the first- and high-order bias measurement approaches. We conduct several experiments on the gender bias of occupations. We first revisit the experiments conducted in previous studies (Garg et al. 2018; Caliskan, Bryson, and Narayanan 2017) on the correlations of the gender bias of some occupations, measured using the discussed methods, to the actual statistics of gender bias in the U.S. job market. We use two collections, provided by Zhao et al. (2018a) and Garg et al. (2018). We observe that, in all studied word representation models and the two collections, the results of our proposed first-order bias measurement show higher correlations in comparison with the high-order approaches.

Next, we analyze the measured gender bias of around 500 occupations using first- and high-order approaches, observing several cases of the influence of non-relevant context-words on the results of the high-order bias measurement method. Overall, our results suggest the existence of a more severe degree of bias towards female in the underlying corpus, previously undetected by high-order approaches.

Finally, we study how each bias measurement approach reacts to (hypothetical) changes in the corpus, particularly when the corpus moves towards a more balanced representation of genders. To this end, we manipulate the corpus using the Counterfactual Data Augmentation (CDA) method (Zhao et al. 2018a; Lu et al. 2018; Zmigrod et al. 2019), such that the genders are represented in the augmented corpora in a more balanced way. Our observations show that, while both bias measurement methods report a decrease of gender bias, the first-order bias measurement is more sensitive (reacts faster) to the changes in the corpus.

Limitations of the study. Gender is treated in this study as a binary construct, and the definition of gender bias is limited to the disparity between female and male. We acknowledge that this choice neglects the broad meaning of gender, but the decision is necessary for taking practical steps. Our study is also limited to the English language. Our introduced method is however generic and can be applied to other languages as well as other forms of societal biases, such as those related to race, age and ethnicity.

Outline of the paper. We first discuss related work, followed by the relevant previous methods. Our bias measurement approach is introduced next. Finally, the gender bias experiments are described and their results are presented.

Related Work

Various aspects of word embeddings and distributional representations in areas such as social sciences and psychology are studied in previous work. Lenton, Sedikides, and Bruder (2009) discuss the use of Latent Semantic Analysis (Deerwester et al. 1990) for measuring gender bias, highlighting the sparsity issue of the first-order method based on count-based metrics. The present work complements this study by exploring the benefits of a smoothed first-order bias measurement approach as an alternative to previous approaches. In a recent study, Günther, Rinaldi, and Marelli (2019) discuss the applications and common misconceptions of distributional semantic models in psychology.

Several pieces of work exploit word embeddings to study societal aspects. Garg et al. (2018) investigate the changes in gender- and race-related stereotypes over decades using historical text data. Caliskan, Bryson, and Narayanan (2017) and more recently Chang and McKeown (2019) study the patterns of language use, indicating accurate imprints of historical biases. Bolukbasi et al. (2016) show the reflection of gender stereotypes in word analogies derived from word embeddings. Zhou et al. (2019) propose methods to measure societal biases in languages with grammatical gender. Our work directly contributes to these studies by proposing a more accurate approach for measuring bias in corpora. In this line, Hoyle et al. (2019) propose a method to measure the differences between descriptions of men and women. In contrast to our bag-of-words-based method, they measure bias using a parsed corpus.

Gender bias is also studied in various downstream tasks, such as sentiment analysis (Kiritchenko and Mohammad 2018), visual semantic role labeling (Zhao et al. 2017), coreference resolution (Zhao et al. 2018a; Rudinger et al. 2018), information retrieval (Rekabsaz and Schedl 2020), and text classification (Dixon et al. 2018; Barrett et al. 2019; Elazar and Goldberg 2018; De-Arteaga et al. 2019), as well as in language generation models (Sheng et al. 2019).

Mitigating societal biases in data and models has been the topic of several studies. Some pieces of work propose the debiasing of word embeddings by identifying and removing the gender subspace (Ethayarajh, Duvenaud, and Hirst 2019b; Bolukbasi et al. 2016; Kaneko and Bollegala 2019; Zhao et al. 2018b). As pointed out by Gonen and Goldberg (2019), these methods successfully remove the explicit bias, while the implicit bias, examined through the ability of a classifier or a clustering algorithm to retrieve the gender of vectors after debiasing, still remains. Recently, Lauscher et al. (2020) propose a general framework for mitigating both forms of bias.

Another approach to addressing bias, and one related to this paper, is bias reduction in a corpus via data augmentation. Counterfactual Data Augmentation (CDA) is a common method, which, in its basic form, extends a corpus by adding new sentences, obtained by swapping the indicative words of the concept of interest in the corpus. Zhao et al. (2018a) use CDA in the context of coreference resolution, and Lu et al. (2018) show the effectiveness of combining CDA and embedding debiasing. Zmigrod et al. (2019) and later Maudslay et al. (2019) extend CDA to address bias in morphologically rich languages, as well as in names. In this work, we use the basic CDA method to study the reaction of the bias measurement methods to the changes in the corpus.
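The basic form of CDA described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the three gender pairs are a hypothetical subset (the paper lists its pairs in the Supplemental Material), sentences are assumed to be pre-tokenized, and the one-to-one swap ignores ambiguities such as her mapping to either his or him.

```python
# Hypothetical subset of gender pairs, used only for illustration.
PAIRS = [("she", "he"), ("woman", "man"), ("her", "his")]

# Build a symmetric swap table: each word maps to its counterpart.
SWAP = {}
for female, male in PAIRS:
    SWAP[female] = male
    SWAP[male] = female

def swap_sentence(tokens):
    """Replace every pair word in a tokenized sentence with its counterpart."""
    return [SWAP.get(token, token) for token in tokens]

def augment(corpus):
    """Return the corpus plus a swapped copy of each sentence containing a pair word."""
    augmented = list(corpus)
    for tokens in corpus:
        swapped = swap_sentence(tokens)
        if swapped != tokens:
            augmented.append(swapped)
    return augmented

corpus = [["she", "is", "a", "nurse"], ["the", "engineer", "left"]]
augmented = augment(corpus)
print(augmented)
```

Sentences without any pair word are left as they are, so only gendered contexts are duplicated in swapped form.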

High-Order Bias Measurements

We define bias as the discrepancy between the associations of a concept and its counterpart concept to a word. High-order bias measurement methods use vector similarity to quantify the associations. We define the concept $Z$ (and similarly its counterpart concept $Z^{\prime}$ ) as a set, containing a group of representative words. In general, three approaches to high-order bias measurement are proposed in the literature, explained in what follows.

Directional: In this method, first a matrix of directional vectors ${\bm{D}}$ is created using a set of word pairs ${{\mathbb{P}}_{Z,Z^{\prime}}=\{(x,x^{\prime})|x\in Z,x^{\prime}\in Z^{\prime}\}}$, such that ${{\bm{D}}=\{{\bm{v}}_{x}-{\bm{v}}_{x^{\prime}}|(x,x^{\prime})\in{\mathbb{P}}_{Z,Z^{\prime}}\}}$, where ${\bm{v}}_{x}$ is the vector of the word $x$. Using ${\bm{D}}$, the first principal component of the directional vectors is then calculated, which we refer to as ${\bm{v}}_{d}$.

Bolukbasi et al. (2016) define the bias of the word $w$ using the cosine similarity of ${\bm{v}}_{w}$ and ${\bm{v}}_{d}$ . Ethayarajh, Duvenaud, and Hirst (2019b) propose to normalize only ${\bm{v}}_{d}$ to avoid the overestimation of the association degree, resulting in the following bias measurement:

$${\psi}(w)=\frac{{\bm{v}}_{d}\cdot{\bm{v}}_{w}}{\lVert{\bm{v}}_{d}\rVert}$$

where ${\psi}(w)$ denotes the degree of bias of the word $w$, and its sign defines whether the word is biased towards the concept $Z$ or $Z^{\prime}$. This definition is applicable to all the bias measurements discussed throughout the paper.

Centroid: This method, used in several studies such as Garg et al. (2018) and Dev and Phillips (2019), first defines the representative vector ${\bm{v}}_{Z}$ as the mean of the embeddings of the representative words:

$${\bm{v}}_{Z}=\sum_{x\in Z}\frac{{\bm{v}}_{x}}{\left|Z\right|}$$

The association of the concept $Z$ to the word $w$ is then defined using the cosine metric of ${\bm{v}}_{Z}$ and ${\bm{v}}_{w}$ . Finally, the bias is calculated as follows:

$${\psi}(w)=\text{cosine}({\bm{v}}_{Z},{\bm{v}}_{w})-\text{cosine}({\bm{v}}_{Z^{\prime}},{\bm{v}}_{w})$$

$\textsc{Average}_{\textsc{High}}$: Caliskan, Bryson, and Narayanan (2017) introduce the Word Embedding Association Test (WEAT), a statistical test to examine the existence of bias using vector similarity. Ethayarajh, Duvenaud, and Hirst (2019b) criticize WEAT, showing that the conclusion of the test can be manipulated by swapping gender-related concept words. We therefore only study the method used by Caliskan, Bryson, and Narayanan (2017) to measure the associations of words to concepts. This method calculates the average of the cosine similarities between the vector of the target word and the vectors of the concept words. We refer to this method as $\textsc{Average}_{\textsc{High}}$, formulated as follows:

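$$\textsc{Average}_{\textsc{High}}(w,Z)=\frac{1}{\left|Z\right|}\sum_{x\in Z}\text{cosine}({\bm{v}}_{x},{\bm{v}}_{w})$$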

The bias using $\textsc{Average}_{\textsc{High}}$ is calculated as follows:

$${\psi}(w)=\textsc{Average}_{\textsc{High}}(w,Z)-\textsc{Average}_{\textsc{High}}(w,Z^{\prime})$$
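As an illustration of two of the high-order measures described in this section, the sketch below implements the Centroid and $\textsc{Average}_{\textsc{High}}$ bias on plain Python lists. The function names and the two-dimensional vectors are hypothetical and chosen only for readability; the Directional method is left out because it additionally requires a principal component analysis.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mean_vector(vectors):
    """Component-wise mean of a list of vectors (the centroid v_Z)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def centroid_bias(v_w, Z, Z_prime):
    """Centroid bias: cosine to the centroid of Z minus cosine to the centroid of Z'."""
    return cosine(mean_vector(Z), v_w) - cosine(mean_vector(Z_prime), v_w)

def average_high(v_w, Z):
    """Average of the cosine similarities between v_w and each concept vector."""
    return sum(cosine(v_x, v_w) for v_x in Z) / len(Z)

def average_high_bias(v_w, Z, Z_prime):
    """Average_High bias: difference of the two average similarities."""
    return average_high(v_w, Z) - average_high(v_w, Z_prime)

# Made-up 2-d embeddings of concept-representative words and a target word.
female = [[1.0, 0.0], [0.8, 0.2]]
male = [[0.0, 1.0], [0.2, 0.8]]
nurse = [0.9, 0.3]

print(centroid_bias(nurse, female, male))
print(average_high_bias(nurse, female, male))
```

With $Z$ set to the female vectors, a positive value indicates an inclination towards female, and swapping the two concept sets flips the sign.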

Novel Bias Measurement

As discussed in the Introduction, the first-order bias measurement requires the estimation of co-occurrence relations, which we provide using explicit representations. A well-known method to create explicit representations is based on the Pointwise Mutual Information (PMI) metric. The PMI representation uses count-based probabilities, where the co-occurrence relation between a word and a context-word is calculated by $\log\frac{p(w,c)}{p(w)\,p(c)}$. Positive PMI (PPMI) is a commonly-used variation, where negative values are replaced with zero. Levy and Goldberg (2014) draw the relation between word2vec Skip-Gram (SG) embeddings and PMI representations, showing that SG can be seen as a factorization of the PMI matrix shifted by $\log k$. Based on this idea, they propose the Shifted Positive PMI (SPPMI) representation by subtracting $\log k$ from PMI vector representations and setting the negative values to zero.
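The three count-based variants can be illustrated with a short sketch that computes PMI, PPMI, and SPPMI from a small co-occurrence table. The counts are invented, the function name is hypothetical, and the default shift `k=5` is an assumed value for illustration; subsampling and context distribution smoothing are not included here.

```python
import math

def pmi_variants(counts, k=5):
    """Compute PMI, PPMI, and SPPMI for every observed (word, context) pair.

    counts[w][c] is the number of co-occurrences of word w with context-word c.
    k is the shift parameter of SPPMI (assumed value, for illustration).
    """
    total = sum(n for row in counts.values() for n in row.values())
    word_totals = {w: sum(row.values()) for w, row in counts.items()}
    context_totals = {}
    for row in counts.values():
        for c, n in row.items():
            context_totals[c] = context_totals.get(c, 0) + n

    pmi, ppmi, sppmi = {}, {}, {}
    for w, row in counts.items():
        for c, n in row.items():
            if n == 0:
                continue  # unobserved pairs have no finite PMI value
            joint = n / total
            independent = (word_totals[w] / total) * (context_totals[c] / total)
            value = math.log(joint / independent)
            pmi[(w, c)] = value
            ppmi[(w, c)] = max(value, 0.0)
            sppmi[(w, c)] = max(value - math.log(k), 0.0)
    return pmi, ppmi, sppmi

# Invented co-occurrence counts.
counts = {"nurse": {"she": 8, "he": 2}, "engineer": {"she": 2, "he": 8}}
pmi, ppmi, sppmi = pmi_variants(counts, k=5)
print(pmi[("nurse", "she")], ppmi[("nurse", "he")], sppmi[("nurse", "she")])
```

In this toy table the shift by $\log k$ zeroes out even the positive nurse-she entry, which shows how aggressively SPPMI sparsifies small vocabularies.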

In the remainder of this section, we first explain our approach to creating the explicit variations of the SG and GloVe vectors, referred to as explicit Skip-Gram (eSG) and explicit GloVe (eGloVe). Our approach reconstructs explicit representations from embedding vectors, and is related to previous studies such as Ethayarajh, Duvenaud, and Hirst (2019a), and Levy and Goldberg (2014). We then describe our first-order bias measurement method, which can be applied to any explicit vector.

Smoothed Explicit Representations

explicit Skip-Gram (eSG)

The original SG model consists of two parameter matrices: the word (${\bm{V}}$) and context (${\bm{U}}$) matrices, both of size $\left|{\mathbb{V}}\right|\times d$, where ${\mathbb{V}}$ is the set of words in the collection and $d$ is the embedding dimensionality. Given the word $c$, appearing in a context of word $w$, the model calculates $p(y=1|w,c)=\sigma({\bm{v}}_{w}{\bm{u}}_{c}^{\top})$, where ${\bm{v}}_{w}$ is the vector representation of $w$, ${\bm{u}}_{c}$ is the context-vector of $c$, and $\sigma$ denotes the sigmoid function. The SG model is optimized by maximizing the difference between $p(y=1|w,c)$ and $p(y=1|w,\check{c})$ for $k$ negative samples $\check{c}$, randomly drawn from a noise distribution $\mathcal{N}$. The $\mathcal{N}$ distribution is set to the unigram distribution of the corpus, downsampled by the context distribution smoothing parameter.

The $p(y=1|w,c)$ term in the SG model measures the probability that the co-occurrence of two words $w$ and $c$ comes from the genuine co-occurrence distribution, derived from the training corpus. The model uses this probability to learn the embedding vectors, by separating these genuine co-occurrence relations from the sampled negative ones. We therefore use this estimation of the co-occurrence relations to define the vectors of the eSG representation, resulting in the following definition of the eSG vector:

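$$e_{w:c}=\sigma({\bm{v}}_{w}{\bm{u}}_{c}^{\top}),\qquad{\bm{e}}_{w}=\sigma({\bm{v}}_{w}{\bm{U}}^{\top})\in\mathbb{R}^{\left|{\mathbb{V}}\right|}\tag{5}$$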

where $e_{w:c}$ denotes the value of the dimension of the vector of $w$ that corresponds to the context-word $c$, and ${\bm{e}}_{w}$ in $\left|{\mathbb{V}}\right|$ dimensions is the explicit variation of the SG vector of word $w$. (An alternative formulation of eSG is to normalize it by dividing its values by the square root of the expectations of the co-occurrence relations for each word and context-word, namely $\mathop{{}\mathbb{E}}_{c^{\prime}\sim\mathcal{N}}{\sigma({\bm{v}}_{w}{\bm{u}}_{c^{\prime}}^{\top})}$ and $\mathop{{}\mathbb{E}}_{w^{\prime}\sim\mathcal{N}}{\sigma({\bm{v}}_{w^{\prime}}{\bm{u}}_{c}^{\top})}$. We studied this variation in pilot experiments, observing similar results to the introduced variation, and therefore stay with the less complex formulation of Eq. 5.)

We should note that the eSG representation is considerably different from the shifted PMI representation (Levy and Goldberg 2014): shifted PMI assumes very high embedding dimensions during the model training, while eSG draws the co-occurrence relations after the model is trained on low-dimensional embeddings. In fact, as SG is an implicit factorization of shifted PMI (Levy and Goldberg 2014; Ethayarajh, Duvenaud, and Hirst 2019a), eSG provides a smoothed variation of shifted PMI.
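The eSG definition above amounts to applying the sigmoid to the dot product of a word vector with every context vector. The following is a minimal sketch of that computation on plain lists; the vectors are made up, the function names are hypothetical, and a real setting would read ${\bm{V}}$ and ${\bm{U}}$ from a trained Skip-Gram model.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def esg_vector(v_w, U):
    """Explicit Skip-Gram vector: one sigmoid(v_w . u_c) entry per context-word c."""
    return [sigmoid(sum(a * b for a, b in zip(v_w, u_c))) for u_c in U]

# Made-up 2-d word vector and a context matrix for a 3-word vocabulary.
v_nurse = [1.0, -0.5]
U = [[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0]]

e_nurse = esg_vector(v_nurse, U)
print(e_nurse)
```

Every entry lies strictly between 0 and 1, and no entry is exactly zero, which is the sense in which the representation is smoothed compared to sparse count-based vectors.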

explicit GloVe (eGloVe)

The GloVe model first defines an explicit matrix (size $\left|{\mathbb{V}}\right|\times\left|{\mathbb{V}}\right|$ ), where the corresponding co-occurrence value of each word and context-word is set to ${\log p(w|c)}$ . This log probability is calculated based on the number of co-occurrences (denoted by $\#\langle,\rangle$ ), such that ${\log p(w|c)=\log\#\langle w,c\rangle-\log\#\langle\cdot,c\rangle}$ . We refer to this sparse explicit matrix as initGlove.

The GloVe model then implicitly factorizes the initGlove matrix, yielding two low-dimensional matrices of size $\left|{\mathbb{V}}\right|\times d$, as well as two bias vectors of size $\left|{\mathbb{V}}\right|$, where one assigns a bias value to each word, and the other to each context-word. Using the same notation as SG, the factorization is done such that the dot products of the vectors of the matrices ${\bm{V}}$ and ${\bm{U}}$ plus the corresponding bias values estimate the logarithm of the co-occurrence values, as defined in the following:

$${\bm{v}}_{w}{\bm{u}}_{c}^{\top}+b_{w}+\tilde{b}_{c}\approx\log\#\langle w,c\rangle\tag{6}$$

where $b_{w}$ and $\tilde{b}_{c}$ denote the bias value of word $w$ and context-word $c$ , respectively.

Similar to eSG, the eGloVe representation estimates the co-occurrence relations using the word and context vectors, after training the GloVe model. Considering Eq. 6, we define the co-occurrence relations of eGloVe as the dot product of the word and context vectors. As in eSG, eGloVe does not need extra normalization, since the bias terms $b_{w}$ and $\tilde{b}_{c}$ in Eq. 6, learned during training, act as normalizers to the co-occurrence estimation ${\log\#\langle w,c\rangle}$. The definition is as follows:

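$$e_{w:c}={\bm{v}}_{w}{\bm{u}}_{c}^{\top},\qquad{\bm{e}}_{w}={\bm{v}}_{w}{\bm{U}}^{\top}\in\mathbb{R}^{\left|{\mathbb{V}}\right|}$$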

The eGloVe representation in fact reconstructs initGlove, providing a smoothed variation.
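As a small illustration of the eGloVe definition, the sketch below computes the explicit vector as plain dot products and, separately, the log co-occurrence estimate of the GloVe factorization that additionally includes the two bias terms. All numbers and function names are invented for illustration; trained GloVe word vectors, context vectors, and biases would be used in practice.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def eglove_vector(v_w, U):
    """Explicit GloVe vector: one v_w . u_c entry per context-word c (no bias terms)."""
    return [dot(v_w, u_c) for u_c in U]

def log_count_estimate(v_w, u_c, b_w, b_c):
    """GloVe's estimate of log #<w, c>: dot product plus word and context biases."""
    return dot(v_w, u_c) + b_w + b_c

# Made-up parameters for one word and a 2-word context vocabulary.
v_w = [1.0, 2.0]
U = [[0.5, 0.25], [-1.0, 0.5]]
b_w = 0.5
context_biases = [0.25, 0.75]

e_w = eglove_vector(v_w, U)
estimates = [log_count_estimate(v_w, u_c, b_w, b_c) for u_c, b_c in zip(U, context_biases)]
print(e_w, estimates)
```

The explicit vector keeps only the part of the estimate that depends on the specific word-context pair, leaving the word-level and context-level offsets to the bias terms.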

First-Order Bias Measurement

The main difference between the first-order bias measurement method and the high-order approaches is in the estimation of the associations of a word to the concepts. We define our bias measurement method based on the $\textsc{Average}_{\textsc{High}}$ method, by replacing the cosine metric with co-occurrence relations, and therefore refer to it as $\textsc{Average}_{\textsc{First}}$ . Given an explicit vector denoted as ${\bm{e}}$ , $\textsc{Average}_{\textsc{First}}$ is defined as the mean of the co-occurrence values of $w$ with the representative words $Z$ , formulated as follows:

$$\textsc{Average}_{\textsc{First}}(w,Z)=\frac{1}{\left|Z\right|}\sum_{c\in Z}e_{w:c}$$

The bias toward $Z$ is then defined as the difference between the associations of $w$ to $Z$ and $Z^{\prime}$:

$${\psi}(w)=\textsc{Average}_{\textsc{First}}(w,Z)-\textsc{Average}_{\textsc{First}}(w,Z^{\prime})$$

As shown, $\textsc{Average}_{\textsc{First}}$ only considers the context-words related to the $Z$ and $Z^{\prime}$ concepts. This avoids the influence of other non-relevant concepts as in the high-order bias measurements.

Let us review the required calculations for $\textsc{Average}_{\textsc{First}}(w,Z)$ when using the introduced smoothed explicit representations, namely eSG or eGloVe. In this setting, the main computation of the bias measure is the dot products of the vector of $w$ to the context-vectors of the words in $Z$ . We should note that the computational complexity of this calculation is the same as the one of $\textsc{Average}_{\textsc{High}}(w,Z)$ on SG/GloVe. In fact, considering the dot product of two embedding vectors as computation unit, the complexity of both bias measurement methods is $\mathcal{O}(\left|Z\right|)$ .

Finally, a practical consideration in calculating eSG and eGloVe is that this computation requires the context vectors in addition to word vectors. These context vectors are commonly stored in the libraries used for SG and GloVe embeddings alongside the word vectors, mainly for the purpose of continuous training. In eSG/eGloVe, these context vectors are exploited to estimate smoothed first-order relations.
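The calculation reviewed above can be sketched as follows for the eSG case: the bias of a word needs only the dot products of its word vector with the context vectors of the words in $Z$ and $Z^{\prime}$, so the full $\left|{\mathbb{V}}\right|$-dimensional explicit vector is never materialized. The vectors and function names are hypothetical, and the context vectors are assumed to have been read from a trained model.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def esg_entry(v_w, u_c):
    """eSG co-occurrence estimate e_{w:c} = sigmoid(v_w . u_c)."""
    return sigmoid(sum(a * b for a, b in zip(v_w, u_c)))

def average_first(v_w, concept_context_vectors):
    """Mean eSG co-occurrence of w with the concept words: |Z| dot products."""
    entries = [esg_entry(v_w, u_c) for u_c in concept_context_vectors]
    return sum(entries) / len(entries)

def first_order_bias(v_w, U_Z, U_Z_prime):
    """Average_First bias: association to Z minus association to Z'."""
    return average_first(v_w, U_Z) - average_first(v_w, U_Z_prime)

# Made-up word vector and context vectors of the concept-representative words.
v_nurse = [1.0, 0.0]
U_female = [[2.0, 0.0], [1.0, 1.0]]
U_male = [[-2.0, 0.0], [-1.0, 1.0]]

bias = first_order_bias(v_nurse, U_female, U_male)
print(bias)
```

For eGloVe, the same structure applies with the sigmoid removed, since its entries are the raw dot products.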

Gender Bias Experiment Design

In this section, we explain the design of our experiment and the resources. Our source code together with all resources, including the lists of occupational and gender-representative words, is publicly available at https://github.com/navid-rekabsaz/SmoothedFirstOrderBias.

Word Representations. We conduct our experiments on the PMI, PPMI, SPPMI, SG, eSG, GloVe, and eGloVe representations. In addition, we create the low-dimensional vectors of the PMI-based representations using Singular Value Decomposition (SVD), referred to as PMI-SVD, PPMI-SVD, and SPPMI-SVD. These word representation models are created on the English Wikipedia corpus of August 2017. We project all characters to lower case, and remove numbers and punctuation marks. For all models, we use a window size of 5, and filter out the words with frequencies lower than 200, resulting in 197,549 unique words. The number of dimensions of the embeddings is set to 300. The rest of the parameters are set using the default parameter setting of the word2vec Skip-Gram model in the Gensim library (Rehurek and Sojka 2010), and the GloVe model in the tool provided by its authors. As suggested by Levy, Goldberg, and Dagan (2015), we apply subsampling and context distribution smoothing on all PMI-based models with the same parameter values as the SG model.
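The text normalization and frequency filtering described above can be approximated with a short sketch. It is an assumption-laden illustration rather than the authors' pipeline: the tokenization by whitespace, the choice to replace removed characters with a space, and the function names are all hypothetical, and only the lower-casing, the removal of numbers and punctuation, and the minimum frequency of 200 come from the text.

```python
from collections import Counter

def normalize(text):
    """Lower-case the text and drop digits and punctuation, then split on whitespace.

    Removed characters are replaced by a space (an assumption of this sketch).
    """
    cleaned = "".join(ch if ch.isalpha() or ch.isspace() else " " for ch in text.lower())
    return cleaned.split()

def build_vocabulary(sentences, min_count=200):
    """Keep the words that occur at least min_count times across all sentences."""
    counts = Counter(token for sentence in sentences for token in normalize(sentence))
    return {word for word, n in counts.items() if n >= min_count}

sentences = ["The nurse, aged 42, left.", "A nurse arrived!"]
tokens = normalize(sentences[0])
vocabulary = build_vocabulary(sentences, min_count=2)  # small threshold for the toy input
print(tokens, vocabulary)
```

The toy call uses a threshold of 2 only so that the two example sentences produce a non-empty vocabulary.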

Gender-Representative Words. In all bias measurements, the concepts $Z$ and $Z^{\prime}$ are assumed as female and male, respectively. Therefore, a positive bias value indicates the inclination towards female, and a negative one towards male. The concepts are defined using two sets, each with 28 words, containing words like she, her, woman for the female and he, his, man for the male concept. These sets are taken from previous studies (Bolukbasi et al. 2016; Garg et al. 2018). From the same sets, we form the gender pairs, used in the CDA method, and listed in Supplemental Material.

Occupational Words. We provide a list of 496 occupations, among which 17 are female-specific (e.g. congresswoman), 9 male-specific (e.g. congressman), and the rest are gender-neutral (e.g. nurse and dancer). The set and assigned genders are listed in Supplemental Material.

Bibliography

Barrett, M.; Kementchedjhieva, Y.; Elazar, Y.; Elliott, D.; and Søgaard, A. 2019. Adversarial Removal of Demographic Attributes Revisited. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Bolukbasi, T.; Chang, K.-W.; Zou, J. Y.; Saligrama, V.; and Kalai, A. T. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Proceedings of the 30th International Conference on Neural Information Processing Systems.
Caliskan, A.; Bryson, J. J.; and Narayanan, A. 2017. Semantics derived automatically from language corpora contain human-like biases. Science.
Chang, S.; and McKeown, K. 2019. Automatically Inferring Gender Associations from Language. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Church, K. W.; and Hanks, P. 1990. Word association norms, mutual information, and lexicography. Computational Linguistics.
De-Arteaga, M.; Romanov, A.; Wallach, H.; Chayes, J.; Borgs, C.; Chouldechova, A.; Geyik, S.; Kenthapadi, K.; and Kalai, A. T. 2019. Bias in bios: A case study of semantic representation bias in a high-stakes setting. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM.
Deerwester, S.; Dumais, S. T.; Furnas, G. W.; Landauer, T. K.; and Harshman, R. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science.
Dev, S.; and Phillips, J. 2019. Attenuating Bias in Word Vectors. In The International Conference on Artificial Intelligence and Statistics.