How does a recent gender norms scale perform? Exploratory factor analyses among adolescents in Ethiopia and Bangladesh

Anita Alaze; John Grosser; Oliver Razum; Céline Miani; Ifunanya Agu

PMC · DOI:10.1371/journal.pgph.0005190·October 16, 2025

Open Access

TL;DR

This paper evaluates a gender norms scale among adolescents in Ethiopia and Bangladesh, finding that a five-domain structure works better than a two-factor model in these contexts.

Contribution

The study adapts and validates a gender norms scale for adolescents in Ethiopia and Bangladesh, proposing a five-domain structure over the original two-factor model.

Findings

01

The five-domain structure was more suitable than the two-factor individual-community distinction in both Ethiopia and Bangladesh.

02

Only 17 and 15 of the original 30 items were retained in Ethiopia and Bangladesh, respectively.

03

Two of the factors in the refined structure included only two variables each, suggesting limitations in the scale.

Abstract

Inequitable gender norms shape adolescents’ perceptions and behaviours, increasing the risk for adverse health outcomes as adults. However, there is a lack of reliable scales to measure these norms. The Gender and Adolescence: Global Evidence (GAGE) project proposes a scale for adolescents aged 10–19 years considered vulnerable, (i) distinguishing between individual-level gender attitudes and community-level gender norms (2 factors), and (ii) categorising items into five domains (e.g., education; 5 factors). As part of validating this scale, we analyse the two- and five-factor structure using GAGE datasets from Ethiopia and Bangladesh. We performed Exploratory Factor Analyses (EFA) using Principal Axis Factoring and oblique rotation. We tested sampling adequacy using Bartlett’s test of sphericity and the Kaiser-Meyer-Olkin measure. In the EFA, we tested the two-factor structure and…

Taxonomy

Topics: Gender Roles and Identity Studies · Sex and Gender in Healthcare · Adolescent Sexual and Reproductive Health

Full text

1 Introduction

Gender inequity hinders the realisation of human rights and health for all. While Sustainable Development Goal (SDG) 5 drives global efforts to reduce gender inequities, persisting gender inequity has been identified as one of the main obstacles hindering further progress on the other SDGs [1]. Among the biggest barriers to gender equity are societal gender norms [1]. Gender norms are defined as the “spoken and unspoken rules of societies about the acceptable behaviours of girls and boys […] how they should act, look, and even think or feel” [2].

Gender norms have gained recognition as a public health priority, which has increased scientific efforts to measure them adequately [3]. While qualitative methods are typically used (i) to gain deeper insights into attitudes toward gender norms and (ii) to inform scale development [4], quantitative gender norms scales serve to measure gender norms as an outcome and as an important factor in behaviour change and health outcomes [3]. They are also used to identify and advance interventions that effectively tackle inequitable gender norms [5]. However, in the translation of qualitative findings to quantitative scales, theoretical constructs (e.g., social norms theory) have so far not been adequately incorporated [4,6].

Moreover, a key research gap around the measurement of gender norms concerns their conceptualisation and validity [7,8]. Various quantitative surveys, such as the World Value Survey and the International Social Survey Programme [8], contain unintentional gender bias in question phrasing and translation, mostly rely on proxies for gender norms, such as social aggregates for individual behaviours and attitudes, and lack information on gender identity and individuals’ gendered experiences (e.g., socially constructed gender roles, gender practices and relationship dynamics) [9]. In addition, gender norms scales use many different conceptualisations of gender norms. This can be seen from the different terminologies used in scales such as the Gender Attitudes Scale (GAS) by Lundgren et al. [10], the International Men and Gender Equality Survey (IMAGES) by Levtov et al. [11], and the G-NORM scale by Sedlander et al. [6]. The conceptualisation is particularly inconsistent when it comes to distinguishing gender norms from related concepts such as gender roles or gender stereotypes. The former refers to “beliefs about appropriate roles for males and females regarding the division of paid labor, homework, and childcare” (Davis & Greenstein, 2009) and the latter “are widely held, generalised assumptions regarding common traits (including strengths and weaknesses), based on group categorisation” [12,13].

Adolescence has been recognised as a window of opportunity to change inequitable gender norms. It is a critical and intensified period of life [4] in which adolescents’ perspectives towards gender norms are shaped. Gender norms also have a notable effect on morbidity and mortality during adolescence. For example, body dissatisfaction as part of gender norm concepts can be associated with gender-related motives of self-harm [14–16], such as exposure to bullying, victimisation, and sexual/physical abuse [14]. Self-harm, in turn, is associated with an increased risk for suicide [17]. Gender norms influence the health trajectories of adolescents throughout a lifetime [18]. Adolescents who adopt gender inequitable attitudes and roles are at greater risk of experiencing adverse gender-related health outcomes as adults [19]. Examples of this include body dissatisfaction, which can lead to eating or mental disorders [16]. Beliefs in male superiority/entitlement or the justification of violence in the partnership can lead to the perpetration of intimate partner violence and the reinforcement of power imbalances in society [19]. This makes adolescence an influential period in which health-related outcomes are predetermined and individual attitudes towards gender norms are formed.

Although some gender norms scales have been specifically developed for or adapted to adolescents, such as the Gender Equitable Men (GEM) Scale by Vu et al. [20], Attitudes Towards Women Scale (for Adolescents) by Galambos et al. [21], and the Adolescent Femininity Ideology Scale by Tolman and Porche [22], the age group of adolescents has so far been greatly overlooked in research [18].

A conceptual challenge with those scales is that younger adolescents in particular (10–14 years) find it difficult to answer the questions. This is because many of them may not yet have engaged in sexual relations or may have difficulties imagining abstract situations [4]. For example, the GEM scale was originally developed for individuals aged 15–24 years and covers the four domains of violence, reproductive health and disease prevention, sexuality, and domestic chores and daily life [4,23]. A study in Uganda showed that an adaptation which includes only 16 of the original 24 items across the four domains is needed when validating the GEM scale for younger adolescents [4,20]. Although the Gender Norm Instrument (GNI) of the Global Early Adolescent Study [24], and the Male Role Norms Inventory (MNI) – Adolescent, revised by Levant et al. [25] are gender norms scales developed specifically for early adolescence, further research is needed, in particular because interventions to tackle inequitable gender norms may be more effective at a younger age [26].

Another concern is that most of those scales have been developed in the Global North. This highlights a research gap for adolescents in Lower and Middle-Income (LMI) countries with different cultural and socioeconomic contexts. Moreover, this not only influences who and what is measured, but also risks perpetuating neocolonial biases and assumptions [9,27]. Additionally, the validity of gender norms scales among marginalized or minoritized adolescents is even less understood [7]. This is because socio-structural disadvantages, such as social class, race, ethnicity, migrant or refugee status, (dis)ability, religion, and area of residence, are areas affected by gender norms [28]. Thus, evidence on the exact pathways of gender norms to health is scarce and the understanding of how adolescents perceive gender norms in cross-cultural settings is limited [4].

The Gender and Adolescence: Global Evidence (GAGE) project, the largest longitudinal study on adolescents, aims to address these research gaps, both for adolescents considered vulnerable in the Global South (e.g., adolescents with disabilities, out-of-school youth, refugees, and adolescents married under age 18) and in the conceptualisation of gender norms. The GAGE project has developed a gender norms scale which incorporates the aforementioned social norms theory and draws on items from multiple scales: the GAS, the GEM, the IMAGES, and the GNI scale. Some items in the GAGE gender norms scale were also newly introduced. They stem from previously conducted qualitative interviews in the GAGE project [29]. Further details of the individual items and the scales from which they were drawn can be found in Table A.5 in the publication by Baird et al. [29].

As part of the GAGE project, Baird et al. (2019) formed two separate gender norm scores from the GAGE gender norms scale, one for individual-level gender attitudes and the other for perceived community-level gender norms. In their publication, they used these scores to predict physical and mental health in Bangladesh and Ethiopia [29]. Individual-level gender attitudes measure what the adolescent thinks about the gendered attitude while perceived community-level gender norms measure descriptive norms as to “what the [adolescent] believes others do” and injunctive norms as to “what the [adolescent] believes others think that s/he should do” [29,30]. Importantly, the gender norms items in the GAGE gender norms scale are further divided into five domains: (1) education; (2) time use; (3) financial inclusion and economic empowerment; (4) relationships and marriage; and (5) sexual and reproductive health [29].

Our analysis draws on the paper of Baird et al. [29] in which the GAGE gender norms scale was introduced. This scale has not yet been adequately statistically validated and makes implicit claims about the factor structure of gender norms that have not been empirically verified. Our aim in this study is therefore to explore the factor structure of the GAGE gender norms scale using Exploratory Factor Analysis (EFA). More specifically, we investigate possible factor structures with two or five factors, which correspond to (i) the distinction between individual-level gender attitudes and perceived community-level gender norms, and (ii) the categorisation of the gender norms items into the five domains present in the GAGE data, respectively. Thus, we compute two- and five-factor structures in our EFA, and compare the results between two countries.

2 Materials and methods

2.1 Ethical approval

Our study did not require ethical approval, as it is based on the analysis of publicly available secondary panel data that does not contain personal identifiers.

2.2 Data

Our statistical analysis draws on quantitative data from the GAGE project, a mixed-methods longitudinal research and evaluation programme. GAGE follows 18,000 adolescents aged 10–19 years as well as other community members over 10 years from 2015 to 2024 [31]. The project aims to discover ‘what works’ [32] for adolescents considered the most vulnerable. The quantitative survey includes adolescents with disabilities, adolescents who were married before the age of 18, refugees, adolescent mothers and out-of-school youth in the six LMI countries included in the project to address the paucity of data on adolescents considered vulnerable [32]. Besides Rwanda, Nepal, Jordan and Lebanon, the survey also includes Ethiopia and Bangladesh [29,32].

Following Baird et al.’s analysis [29], our study draws on the baseline findings collected by the GAGE project in Ethiopia and Bangladesh in 2017 and 2018. Data collection took place in six locations in Ethiopia and in three locations in Bangladesh. While in Ethiopia and one Bangladesh location, a household census was used to identify potential participants, a school-based census was employed in the other two Bangladesh locations [29]. Further details on data collection procedures are described in Baird et al. [29].

The data on which the results of our study are based are third-party data that are publicly archived by the data provider, the UK Data Service. They are available free of charge following registration with the data provider.

2.3 Measures

The GAGE gender norms scale in the Ethiopia and Bangladesh datasets used by Baird et al. [29] consists of individual-level gender attitudes and community-level gender norms items, both of which are embedded in five content-related domains (education; time use; financial inclusion and economic empowerment; relationships and marriage; sexual and reproductive health). An example for the education domain is: “Girls should be sent to school only if they are not needed to help at home”. “Girls and boys should share household tasks equally” is an example for the time use domain and “women who participate in politics or leadership positions cannot also be good wives or mothers” is an example for the financial inclusion and economic empowerment domain. The relationships and marriage domain is captured, for example, by “adolescent girls should marry before the age of 18 years (legal age)” and the sexual and reproductive health domain is captured by questions such as “families should control their daughters’ behaviors more than their sons”. S1 Table in the supplementary material provides an overview of the GAGE gender norms items and their meaning.

The three items in the sexual and reproductive health domain and the one item in the marriage and relationships domain in italics could not be included as they were only answered by older adolescents. The gender norms items are three-level variables with ‘1’ indicating agreement, ‘2’ indicating partial agreement and ‘3’ indicating disagreement with the statement. Following Baird et al., we recoded all reverse-worded items (cr_edu_boysfeelings, cr_fin_girlchance, cr_fin_impwsav, cr_fin_nework, cr_fin_eework, cr_tu_statements1, cr_mar_girlfriend, cr_mar_boyfriend, cr_mar_waitedu, cr_srh_proudbod) to ensure that agreement suggests a gender-unequal response.
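The recoding step can be illustrated with a minimal sketch. It assumes that a reverse-worded item is flipped with `4 - x`, so that codes 1 and 3 swap and the middle category stays at 2; the text does not spell out the exact recoding rule, and the function names and the toy respondent are illustrative only.

```python
# Item names are the reverse-worded items listed in the text.
REVERSE_WORDED = {
    "cr_edu_boysfeelings", "cr_fin_girlchance", "cr_fin_impwsav",
    "cr_fin_nework", "cr_fin_eework", "cr_tu_statements1",
    "cr_mar_girlfriend", "cr_mar_boyfriend", "cr_mar_waitedu",
    "cr_srh_proudbod",
}


def reverse_code(value):
    """Flip a three-level response (assumed rule: 1 <-> 3, 2 unchanged)."""
    if value not in (1, 2, 3):
        raise ValueError(f"unexpected response code: {value}")
    return 4 - value


def recode_row(row):
    """Recode only the reverse-worded items of one respondent's answers."""
    return {
        item: reverse_code(value) if item in REVERSE_WORDED else value
        for item, value in row.items()
    }


# Hypothetical respondent: one reverse-worded item, one regular item.
example = {"cr_edu_boysfeelings": 1, "cr_tu_statements4": 1}
recoded = recode_row(example)
```

Only the reverse-worded item changes; the regular item keeps its original code.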

2.4 Exploratory Factor Analysis

We chose EFA to investigate the underlying factor structures of the GAGE gender norms scale for Ethiopia and Bangladesh. We used Principal Axis Factoring and oblimin rotation (an oblique factor rotation method). Oblique rotation allows non-zero correlation between factors [33], which fits our likely highly correlated gender norms items. We used Bartlett’s test of sphericity (a chi-squared test of whether the correlation matrix differs from an identity matrix) and the Kaiser-Meyer-Olkin (KMO) test (a measure of sampling adequacy) to assess the suitability of the datasets for EFA. Datasets are suitable when the null hypothesis of the first test is rejected and the values of the second test are above 0.5 [34]. We conducted all statistical analyses using the software R (version 4.4.1, R Core Team, 2024).
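The two suitability checks can be sketched from their standard textbook definitions. The code below is not the authors' R code; it is a small Python illustration that takes a correlation matrix as a list of lists, and the 3 × 3 matrix used at the end is a hypothetical toy example. Bartlett's statistic is computed as −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom, and the overall KMO compares squared correlations with squared partial correlations.

```python
import math


def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix (list of lists)."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    n = len(matrix)
    a = [[float(v) for v in row] for row in matrix]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return det


def bartlett_df(n_items):
    """Degrees of freedom: number of distinct off-diagonal correlations."""
    return n_items * (n_items - 1) // 2


def bartlett_sphericity(corr, n_obs):
    """Return (chi-squared statistic, degrees of freedom)."""
    p = len(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    return chi2, bartlett_df(p)


def kmo_overall(corr):
    """Overall KMO: squared correlations vs. squared partial correlations."""
    p = len(corr)
    inv = invert(corr)
    sum_r2 = sum_q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_q2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_q2)


# Hypothetical toy correlation matrix (three items, all correlations 0.5).
toy = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
chi2, df = bartlett_sphericity(toy, n_obs=100)
kmo = kmo_overall(toy)
```

With 30 items, `bartlett_df(30)` gives 435 degrees of freedom, which matches the χ²(435) values reported in the Results.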

In their analysis, Baird et al. form two separate gender norm scores (one for individual-level gender attitudes, the other for perceived community-level gender norms) and use these scores as predictors in regression models for physical and mental health [29]. Although they do not say so explicitly, this approach seems to imply a supposed two-factor (individual- vs. community-level) structure in the gender norm data.

To explore the validity of this supposed two-factor structure, we first calculated a two-factor solution for both datasets. Thereafter, we computed a five-factor solution (in accordance with the five domains present in the GAGE data). After computing the initial two- and five-factor solutions, we further refined the five-factor solution for both datasets. In particular, we excluded variables that failed to load onto a factor or exhibited cross-loadings. We iteratively refined the model until all remaining variables loaded onto a factor. We then applied cutoffs, removing variables with communalities below 0.2 [35,36] and factor loadings below 0.3 [34,35]. We repeated this process until all variables loaded onto a factor and did not produce cross-loadings.
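The refinement loop described above can be sketched as follows. This is a simplified illustration, not the authors' procedure: it merges the staged steps (non-loading items first, cutoffs afterwards) into a single pass per round, and it expects a caller-supplied `fit` function that returns loadings and communalities for the current item set. The `toy_fit` stand-in and its item names and values are hypothetical; a real analysis would refit the EFA in each round.

```python
def flag_items(solution, load_cut=0.3, comm_cut=0.2):
    """Flag items that load on no factor, cross-load, or have low communality.

    `solution` maps item name -> (list of factor loadings, communality).
    """
    flagged = []
    for item, (loadings, communality) in solution.items():
        n_salient = sum(abs(value) >= load_cut for value in loadings)
        if n_salient != 1 or communality < comm_cut:
            flagged.append(item)
    return flagged


def refine(items, fit, max_rounds=20):
    """Refit and drop flagged items until none are flagged."""
    items = list(items)
    for _ in range(max_rounds):
        drop = set(flag_items(fit(items)))
        if not drop:
            return items
        items = [item for item in items if item not in drop]
    raise RuntimeError("no stable solution within max_rounds")


# Hypothetical stand-in for an EFA routine: fixed loadings and communalities.
TOY = {
    "item_a": ([0.62, 0.05], 0.40),
    "item_b": ([0.10, 0.55], 0.33),
    "item_c": ([0.12, 0.08], 0.03),  # loads onto no factor
    "item_d": ([0.41, 0.38], 0.35),  # cross-loading
    "item_e": ([0.31, 0.02], 0.11),  # communality below 0.2
}


def toy_fit(items):
    return {item: TOY[item] for item in items}


kept = refine(TOY, toy_fit)
```

In this toy run only `item_a` and `item_b` survive, because each loads onto exactly one factor and has a communality of at least 0.2.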

3 Results

3.1 Sample characteristics

The Ethiopian sample contained 6,985 and the Bangladeshi sample 2,576 adolescents. After restricting the sample to those adolescents who answered all gender norm questions, the datasets comprise 6,183 individuals for Ethiopia (2,767 boys; 3,416 girls) and 2,245 individuals for Bangladesh (1,104 boys; 1,141 girls). The descriptive statistics of the gender norms items in the two datasets are presented in Table 1.

Table 1: Descriptive statistics of the GAGE gender norms items in the Ethiopia and Bangladesh datasets.

3.2 Exploratory Factor Analysis

Bartlett’s test of sphericity yielded values of χ²(435) = 46514.04, p < 0.001 for the Ethiopia dataset and χ²(435) = 13177.59, p < 0.001 for the Bangladesh dataset, indicating that the inter-item correlations were sufficiently large for EFA in both datasets [34].

The overall KMO of the Ethiopia and Bangladesh datasets was 0.77 and 0.74, respectively, while the KMO of the individual variables ranged from 0.52 to 0.92 and from 0.55 to 0.86, respectively. With values above 0.5, the KMOs suggest adequate samples for conducting an EFA [34]. In the following, we describe the initial two- and five-factor solutions as well as the factor models of the refined five-factor solutions for Ethiopia and Bangladesh and compare the results between datasets.

3.2.1 Two-factor solution.

The two-factor EFA for the Ethiopia dataset is shown in Table 2. Of the 30 variables, 14 loaded onto the first factor (threshold 0.30), while only three variables loaded onto the second factor. Of the former, four variables loaded strongly onto the first factor (threshold 0.50), while all three of the latter loaded strongly onto the second factor. Thirteen variables loaded onto neither factor.
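The thresholds used in this paragraph can be written as a small helper. This is a sketch under the assumption that both cutoffs are inclusive and apply to the absolute loading; the text does not state how boundary values were handled, and the function name is illustrative.

```python
def loading_strength(loading, weak_cut=0.30, strong_cut=0.50):
    """Classify a factor loading as 'strong', 'weak', or 'none'."""
    size = abs(loading)
    if size >= strong_cut:
        return "strong"
    if size >= weak_cut:
        return "weak"
    return "none"


# Counts reported for the Ethiopia two-factor solution.
on_factor_1, on_factor_2, on_neither = 14, 3, 13
total_items = on_factor_1 + on_factor_2 + on_neither
```

The three reported counts add up to the 30 items in the scale.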

Table 2: Two-factor solution for the Ethiopia dataset (N = 6,183).

Notably, the loading patterns do not correspond to the expected two factors (perceived community-level gender norms and individual-level gender attitudes). In particular, the second factor contains only variables on what was defined as sexual and reproductive health, including two community-level variables and one individual-level variable. The first factor also contains both individual- and community-level variables, particularly on education and time use.

Furthermore, the two-factor solution does not fit the data well. Beyond the high number of variables that do not load onto any factor, the communalities of the variables are low across the board. In fact, only two variables have a communality of more than 0.50. A substantial number of variables has a very high uniqueness (higher than 0.80), indicating that they contribute little to the factor structure. Only two variables have uniqueness values below 0.60. The two factors explained 11.62% and 6.93% of the variance, respectively, for a total of only 18.55%.

In the Bangladesh dataset, seven variables loaded onto the first factor, three of them strongly. Another seven variables loaded onto the second factor, only a single one of them strongly. The remaining sixteen variables loaded onto neither factor. The first factor contains most of the education variables, including some (but not all) individual- and community-level variables. The second factor contains most of the variables (both individual- and community-level) on sexual and reproductive health, along with some time-use variables and one education variable. The two-factor solution for the Bangladesh dataset is shown in Table 3.

Table 3: Two-factor solution for the Bangladesh dataset (N = 2,245).

Once again, therefore, the loadings do not correspond well to the individual-community distinction. Furthermore, as in the Ethiopia dataset, the factor solution does not adequately fit the data. Communalities are again low, with no variable having a communality above 0.43. The cumulative proportion of explained variance is only 15.48%, with 8.53% of variance explained by the first and 6.96% by the second factor.
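The variance figures reported for the two-factor solutions can be checked with simple arithmetic. The sketch below uses only the percentages given in the text; the helper names are illustrative. It also shows the standard relationship between communality and uniqueness for standardised variables (uniqueness = 1 − communality).

```python
# Percentages of variance explained, as reported in the text.
reported = {
    "Ethiopia": {"factors": [11.62, 6.93], "total": 18.55},
    "Bangladesh": {"factors": [8.53, 6.96], "total": 15.48},
}


def rounding_gap(entry):
    """Difference between the summed factor shares and the reported total."""
    return round(sum(entry["factors"]) - entry["total"], 2)


def uniqueness(communality):
    """Share of an item's variance not explained by the common factors."""
    return 1.0 - communality


gaps = {country: rounding_gap(entry) for country, entry in reported.items()}
lowest_uniqueness_bangladesh = uniqueness(0.43)
```

The Ethiopia figures sum exactly to the reported total, while the Bangladesh components sum to 0.01 percentage points more than the reported 15.48%, a gap that is compatible with the individual shares having been rounded separately. A maximum communality of 0.43 implies that every Bangladesh item has a uniqueness of at least 0.57.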

In summary, the two-factor solution is unsatisfactory in both datasets. This indicates that it is not a reflection of the data structure to divide the variables into a community-level and an individual-level factor. Instead, the fact that both datasets yield a factor that contains only or mostly variables from a single domain indicates that a five-factor solution based on these domains may be more appropriate.

3.2.2 Initial five-factor solution.

The five-factor solution for the Ethiopia dataset, shown in Table 4, generally supports this notion. Each of the five factors corresponds roughly to one of the five domains. Five out of nine education variables load strongly onto the first factor, with another education variable loading weakly. In contrast, only one non-education variable loads (weakly) onto the first factor. The second factor contains all but one of the time-use variables, with most of these variables loading strongly. Two non-time-use variables also loaded onto the second factor, though only weakly. Three of four sexual and reproductive health variables loaded strongly onto the third factor. The fourth factor includes mostly financial and economic empowerment variables, while the fifth factor includes only two variables, both of which belong to the marriage and relationship domain.

Table 4: Initial five-factor solution for the Ethiopia dataset (N = 6,183).

Although the communalities in this factor solution were somewhat higher than in the two-factor solution, a number of variables with very low communalities remain. Furthermore, although the number of unassigned variables is lower than in the two-factor solution, six variables remain that do not load onto any factor. There is now a higher number of variables with uniqueness values below 0.6, but 13 variables remain with values around 0.8 and higher. At 33.82%, the cumulative proportion of variance explained is higher than for either two-factor solution, but it is still below the desired minimum of 50% in social sciences [37].

In the Bangladesh dataset, a similar picture emerges, with the five factors consisting mainly of education, sexual and reproductive health, marriage and relationship, time-use, and financial and economic empowerment variables, respectively. Ten variables did not load onto any factor. The five-factor solution for the Bangladesh dataset is shown in Table 5. A number of variables with very low communalities remain; 12 variables had a uniqueness above 0.8 and only eight variables had a uniqueness below 0.6. The cumulative proportion of variance explained was only 29.84%.

Table 5: Initial five-factor solution for the Bangladesh dataset (N = 2,245).

These results support the conclusion that dividing the variables by domain is more appropriate than dividing them into two factors with individual-level and community-level variables. However, the five-factor solutions remain far from perfect in both datasets, exhibiting low communalities, non-loading variables and low proportions of explained variance.

These issues could indicate that a number of factors other than two or five may provide a better factor solution. However, they may also reflect issues with the original data, and the need to reconsider the wording or domain assignment of some questions. Therefore, we next investigated whether removing problematic items improves the five-factor solutions in the Ethiopia and Bangladesh datasets.

3.2.3 Refined five-factor solution.

To refine the initial five-factor solution for Ethiopia, we removed six variables that failed to load onto any of the five factors (cr_fin_impwsav, cr_edu_raisingvoice, cr_mar_waitedu, cr_edu_eegirlsout, cr_edu_boysfeelings, cr_srh_proudbod). The exclusion of these items slightly reduced the KMO from 0.77 to 0.75. Thereafter, we removed another three variables that did not load onto any factor in the reduced set of items (cr_tu_statements1, cr_mar_eemarryage, cr_mar_nemarryage). This resulted in all variables loading onto at least one of the five factors. Further exclusion of items did not result in any additional change in the KMO. Finally, we excluded four more variables due to low communalities (<0.2) and factor loadings (<0.3) (cr_fin_notgood, cr_fin_girlchance, cr_edu_culture, cr_tu_statements4), resulting in a KMO of 0.71. The final set of 17 variables is shown in Table 6.

Table 6: Refined five-factor solution for the Ethiopia dataset (N = 6,183).

The refined five-factor solution shows that each factor corresponds to exactly one of the five domains. The first factor contains five education variables, four of which load strongly onto the factor. Three sexual and reproductive health variables load strongly onto the second factor, while five time-use variables load onto the third factor, most of them strongly. The fourth and fifth factors only include two variables each. The two marriage and relationship variables loaded strongly onto the fourth factor and the two financial and economic empowerment variables loaded strongly onto the fifth factor. As a result, each of the five domains in the GAGE gender norms scale loads onto one of the five factors.

Although some communalities remain relatively low in the refined five-factor solution, the lowest value is now 0.23. While six variables have uniqueness values higher than 0.6, none exceed 0.8. The cumulative proportion of variance explained with over 51% is noticeably higher than in the initial five-factor solution.

The results in the Bangladesh dataset, displayed in Table 7, show a similar picture. After removing ten variables that failed to load onto any of the five factors (cr_edu_raisingvoice, cr_edu_boysfeelings, cr_tu_statements1, cr_fin_notgood, cr_fin_impwsav, cr_mar_waitedu, cr_srh_proudbod, cr_edu_culture, cr_tu_statements4, cr_tu_statements6), all variables loaded onto one of the five factors. Thereafter, we further excluded five variables with low communalities and factor loadings (cr_edu_boysch, cr_fin_girlchance, cr_edu_eegirlsout, cr_mar_eemarryage, cr_mar_nemarryage). This process resulted in a final set of 15 variables. The KMO decreased from 0.74 to 0.72 and then to 0.69, which remains an acceptable, but not ideal, value for conducting EFA.
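The item bookkeeping for both refined solutions can be verified with set operations. The sketch below uses only the variable names listed in the text for the Ethiopia and Bangladesh exclusions; the variable names of the Python objects themselves are illustrative.

```python
N_ITEMS = 30  # items in the original GAGE gender norms scale

# Items excluded in the Ethiopia refinement (6 + 3 + 4), as listed in the text.
dropped_ethiopia = {
    "cr_fin_impwsav", "cr_edu_raisingvoice", "cr_mar_waitedu",
    "cr_edu_eegirlsout", "cr_edu_boysfeelings", "cr_srh_proudbod",
    "cr_tu_statements1", "cr_mar_eemarryage", "cr_mar_nemarryage",
    "cr_fin_notgood", "cr_fin_girlchance", "cr_edu_culture",
    "cr_tu_statements4",
}

# Items excluded in the Bangladesh refinement (10 + 5), as listed in the text.
dropped_bangladesh = {
    "cr_edu_raisingvoice", "cr_edu_boysfeelings", "cr_tu_statements1",
    "cr_fin_notgood", "cr_fin_impwsav", "cr_mar_waitedu",
    "cr_srh_proudbod", "cr_edu_culture", "cr_tu_statements4",
    "cr_tu_statements6", "cr_edu_boysch", "cr_fin_girlchance",
    "cr_edu_eegirlsout", "cr_mar_eemarryage", "cr_mar_nemarryage",
}

retained_ethiopia = N_ITEMS - len(dropped_ethiopia)
retained_bangladesh = N_ITEMS - len(dropped_bangladesh)

# Items excluded in only one of the two countries.
only_bangladesh = dropped_bangladesh - dropped_ethiopia
only_ethiopia = dropped_ethiopia - dropped_bangladesh
```

Every item excluded in Ethiopia was also excluded in Bangladesh. The two items excluded only in Bangladesh (`cr_tu_statements6` and `cr_edu_boysch`) account for the difference between the 17 and 15 retained items.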

Table 7: Refined five-factor solution for the Bangladesh dataset (N = 2,245).

The refined five-factor solution contains three sexual and reproductive health variables that loaded strongly onto the first factor. All four education variables loaded onto the second factor (three of them strongly), while two marriage and relationship variables loaded strongly onto the third factor. Four time-use variables loaded onto the fourth factor, with three of them loading strongly, while two financial and economic empowerment variables loaded strongly onto the fifth factor. Similarly to Ethiopia, each of the five domains in the GAGE gender norms scale loads onto one of the five factors.

The communalities and uniqueness values are comparable to the Ethiopia factor solution. No very low communalities and no uniqueness values higher than 0.8 remain. Only five uniqueness values are slightly above 0.6. As in the Ethiopia dataset, the cumulative variance accounts for an acceptable proportion of 50.57%.

A comparison of the refined five-factor solutions in Table 8 shows that each factor corresponds to exactly one of the five domains. With the exception of the two additional variables in the Ethiopia dataset and one variable that loads more weakly in the Bangladesh dataset, the same variables load with the same strength onto the factors. Table 8 also shows that individual-level gender attitudes and community-level gender norms are mixed within the domains (at least where a sufficient number of variables is available per factor). This further supports the notion that the individual-community distinction is not suitable for the available GAGE gender norms items in Ethiopia and Bangladesh.

Table 8: Comparison of the refined five-factor structure with variables loading significantly onto a factor in the Ethiopia and the Bangladesh datasets.

4 Discussion

4.1 Principal findings

To assess whether the items and domains of the GAGE gender norms scale contribute to explaining gender norms and attitudes in Ethiopia and Bangladesh, we explored the implied two- and five-factor structures (both in the GAGE gender norms scale and in Baird et al.’s analysis [29]) using EFA. The two-factor structure, distinguishing individual-level gender attitudes and community-level gender norms, showed an unsatisfactory model fit in both datasets: the pattern of factor loadings did not reflect the individual-community distinction, communalities were low, uniqueness values were high and the cumulative variance explained was low. In contrast, the initial five-factor solution showed indications of a five-domain structure (education; sexual and reproductive health; relationships and marriage; time use; and financial and economic empowerment). After having refined the five-factor solution, we found that the variables captured their respective domains. However, this was only the case for a reduced set of gender norms items in both datasets. Therefore, we propose an adaptation of the GAGE gender norms scale for Ethiopia and Bangladesh.

Concerning the inadequate two-factor solution, we hypothesise that the distinction between gender attitudes and gender norms in this dataset is not substantial enough to yield two factors in the EFA. This may stem from the community-level gender norms being reported by adolescents themselves, rather than other community members. Consequently, their responses likely reflect their own perspective on community norms rather than the actual norms within the community. Another reason may be that the multidomain structure of the GAGE gender norms scale cannot be captured by a two-factor solution as this would oversimplify the complex structure of the gender norms items.

Evidence for a two-factor solution of gender norms items has been observed in the Nepali context, with the G-NORM scale differentiating between descriptive and injunctive norms [3]. The G-NORM scale has been incorporated into the GAGE gender norms scale, which also contains descriptive and injunctive norms [29]. However, Baird et al.’s analysis does not account for the descriptive-injunctive distinction, instead creating separate scores for individual-level gender attitudes and perceived community-level gender norms.

A recent study among adolescents in Bangladesh also successfully validated a multi-domain structure for a ‘gender norms attitude’ scale (M-GNAS) [5]. This four-domain structure includes 13 items and covers the areas of gender attitudes identified as dominant among Bangladeshi youth: gender-appropriate behaviour, family financial decisions, family responsibility, and career choice [5]. Islam et al. (2024) [5] highlighted that the lack of women’s empowerment in decision-making and deeply entrenched gender norms within the family and socioeconomic context are specific to the Bangladeshi context. This makes the time-use variables in the GAGE gender norms scale particularly important.

The exclusion of certain items in the refined five-factor solutions may indicate conceptual issues in the GAGE gender norms scale. One reason could be that these items do not align closely enough with the content represented in the factor structures. The item ‘cr_edu_boysfeelings’, for example, asks whether “boys should be able to show their feelings without the fear of being teased”. This statement does not apply only to a school context, nor is it similar to the other school-related questions. Yet, the item was assigned to the education domain. Furthermore, this item contains two questions in one: whether boys should be able to show their feelings, and whether they should be able to do so without fear of being teased. This makes it challenging to interpret the responses. Another conceptual issue is the discrepancy between the content of the items and the domains. In particular, the sexual and reproductive health domain includes one item about how girls perceive their bodies in the transition to womanhood, but the remaining items ask about control (e.g., “Families should control their daughters’ behaviors more than their sons”), which is not specifically related to sexual and reproductive health. Although sexual and reproductive health questions should be asked with caution when surveying young adolescents, this domain does not cover the content its name suggests. To align the questions on control more closely with the domain of sexual and reproductive health, items could ask about specific behaviours, such as the choice of contraceptive method or the decision of when to have children.

Another limitation of the GAGE gender norms scale could be its focus on a binary understanding of sex, along with gendered questions that are only applicable to one of the two sexes. This conceptual aspect likely applies to most existing gender norms scales. Nevertheless, gender norms could be assessed in a way that makes them applicable to both or all genders. Moreover, gender norms and attitudes in the GAGE gender norms scale are sometimes measured indirectly through other constructs, such as gender roles, gender stereotypes, and traits [5]. For example, items in the time-use domain, such as “girls and boys should share household tasks equally”, may relate more to the construct of gender roles than gender norms. This reflects the inconsistency in the definition of gender norms, the conflation of terms, and the challenges in the conceptualisation of gender-related aspects in the scientific literature.

Comparing the refined five-factor solutions for Ethiopia and Bangladesh, we found that all 15 gender norms items retained in Bangladesh were also retained in Ethiopia, where two additional items were retained. This demonstrates that a subset of gender norms items was retained consistently across two different geographical contexts. We hypothesise that these variables may be more robust across cultures. Although further research is needed, these variables could serve as a basis for improving the GAGE gender norms scale and for facilitating cross-cultural comparisons. In contrast, the exclusion of 13 and 15 of the original 30 items in Ethiopia and Bangladesh, respectively, indicates that up to half of the items may not be well suited for the GAGE gender norms scale in these countries. These items may need to be refined to enhance their alignment with the five-domain structure. This underscores the importance of validating the scale for each cultural context [5,38].

Our results may be generalisable to countries with a similar cultural context to Ethiopia and Bangladesh. Depending on the extent of differences in attitudes to gender norms between different groups of adolescents or regions within a country, it may be useful to conduct EFA separately for subgroups. For example, the Afar and Oromia regions in Ethiopia have been identified for their different and deeply entrenched gender norms [28]. It is possible that similar subgroups of adolescents in different countries could have more similar attitudes to gender norms than different subgroups within a country. While it is recommended to use the list of items that fits best for the specific country or subpopulation (e.g., 15 items for the Bangladesh and 17 items for the Ethiopia dataset), it might be beneficial to use the same set of items when conducting comparisons between countries or subgroups (e.g., 17 items for both datasets) [20].

4.2 Strengths and limitations

In our study, we followed methodological recommendations for EFA in four respects. We used PFA, which is conceptually preferable to PCA [33,39]. We used oblique rather than orthogonal rotation because of the high correlations among the gender norms items [34]. We used relatively large samples, which are particularly important in EFA [34], from two LMI countries. Finally, we estimated several factor models [40]. The latter increases the likelihood that the results can be generalised if the same factor structures are retrieved [34]. This may be the case for 15 gender norms items in our analysis.
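To make the extraction step concrete, the following is a minimal sketch of iterated principal axis factoring on a correlation matrix. It is not the authors' code: the function name `principal_axis_factoring`, the eight synthetic items, and their loadings are assumptions made for illustration, and the sketch returns unrotated loadings only (no oblimin rotation is applied).

```python
import numpy as np

def principal_axis_factoring(R, n_factors, max_iter=1000, tol=1e-8):
    """Iterated principal axis factoring on a correlation matrix R.

    Returns unrotated loadings (items x factors) and communalities.
    """
    R = np.asarray(R, dtype=float)
    # Initial communality estimates: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)            # replace the unit diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        lam = np.clip(eigvals[top], 0.0, None)   # guard against negative eigenvalues
        loadings = eigvecs[:, top] * np.sqrt(lam)
        new_h2 = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2

# Synthetic example (hypothetical): 8 items, two uncorrelated factors.
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0], [0.5, 0.0],
    [0.0, 0.7], [0.0, 0.6], [0.0, 0.5], [0.0, 0.4],
])
true_h2 = (true_loadings ** 2).sum(axis=1)
R = true_loadings @ true_loadings.T
np.fill_diagonal(R, 1.0)                         # correlation matrix has unit diagonal

loadings, h2 = principal_axis_factoring(R, n_factors=2)
uniqueness = 1.0 - h2
```

Unlike PCA, which decomposes the full correlation matrix, this procedure replaces the diagonal with communality estimates so that only shared variance is factored. In practice an established statistics package would normally be used instead of a hand-written routine.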

The limitations of our study are twofold. First, there are limitations in the GAGE datasets on which we rely in our secondary data analysis. Despite being collected by skilled and trained interviewers, the GAGE data may still be affected by social desirability bias and other issues, such as a lack of measurement invariance. Additionally, the GAGE gender norms items reflect perceived rather than objective measures. Also, some items in the scale may need to be rephrased to produce more valid outcomes.

Furthermore, the provided response options (‘agree’, ‘partially agree’ and ‘disagree’) are unbalanced due to the absence of a weak negative response option, such as ‘partially disagree’. Recent gender norms scales employ five-point Likert scales, which provide an additional neutral response option [3,5]. Additionally, two factors in the refined five-factor solution (namely relationships and marriage, and financial and economic empowerment) each consist of only two variables in both datasets. This is usually considered an insufficient number for factor analysis. The GAGE data may benefit from additional variables in both domains.

Second, our analysis has its own methodological limitations. We did not try rotation or extraction methods other than oblimin rotation with principal axis factoring in our EFA. Moreover, we did not compare the factor structures across different subgroups of adolescents considered vulnerable.

Of note, factor structures were interpreted based on their factor loadings, communalities, and uniqueness values, as well as their cumulative variance explained. Gender norm scales should not be developed based on statistical findings alone, but should also incorporate theoretical and conceptual considerations. In particular, the 13 and 15 items excluded from the refined five-factor solutions for Ethiopia and Bangladesh, respectively, should be culturally and linguistically rephrased to increase cultural validity and avoid misinterpretations. Moreover, items with lower communalities may not align well with the intended domain or categorisation, or may share little variance with other items within the same domain. These items may need to be adapted or assigned to another domain, and only items without a strong theoretical rationale may need to be excluded.
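As an illustration of how these interpretation quantities relate to one another, the sketch below derives communalities, uniqueness values, and the share of variance explained from a pattern matrix and a factor correlation matrix, as produced by an oblique rotation. All numbers are invented for illustration, and the cut-off `LOW_COMMUNALITY` is a hypothetical threshold, not one reported in the study.

```python
import numpy as np

# Hypothetical pattern matrix (items x factors) after an oblique rotation.
pattern = np.array([
    [0.70,  0.05],
    [0.65, -0.02],
    [0.10,  0.60],
    [0.02,  0.60],
    [0.20,  0.15],   # weakly related to both factors
])
# Hypothetical factor correlation matrix (Phi).
phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])

# With correlated factors, the communality of item i is the i-th
# diagonal element of P @ Phi @ P.T.
communalities = np.einsum("ij,jk,ik->i", pattern, phi, pattern)
uniqueness = 1.0 - communalities

# Share of total (standardised) item variance accounted for by the factors.
variance_explained = communalities.sum() / pattern.shape[0]

LOW_COMMUNALITY = 0.30   # hypothetical cut-off, for illustration only
flagged = np.where(communalities < LOW_COMMUNALITY)[0].tolist()
```

Here the last item is flagged because it shares little variance with either factor. Following the argument above, such an item would be a candidate for rewording or reassignment to another domain rather than automatic exclusion.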

The rather low cumulative variance explained in the initial factor solutions may be attributed to the cultural complexity, context sensitivity, and multidimensional nature of gender norms. The cultural heterogeneity within Bangladesh and Ethiopia, as well as literacy and interpretation issues among marginalised adolescents in LMI settings, may also have contributed. In the social sciences, the cumulative variance explained tends to be rather low, particularly when attitudes or perceptions are studied [37].

Lastly, we conducted EFA and not Confirmatory Factor Analysis (CFA). Besides further refining the GAGE gender norms items, conducting CFA or applying Item Response Theory could be necessary next steps to validate the scale. Further qualitative validation (e.g., cognitive interviewing) could support the process of refining the GAGE gender norms scale and enhance the age- and culture-appropriate adaptation of items.

4.3 Conclusion

Gender norms research faces a number of challenges in defining, measuring, and conceptualising gender norms. This applies particularly to gender norms scales for adolescents. Our results indicate that separating the GAGE gender norms items into a two-factor structure with an individual-community distinction may not reflect the data structure for the Ethiopia and Bangladesh datasets. Instead, a multi-domain structure seems more promising. In fact, only our refined five-factor solutions yielded promising results, and only with a reduced set of items. Due to issues with the wording or domain assignment of some gender norms items, we propose an adaptation of the GAGE gender norms scale. This could include a cultural and linguistic adaptation, a stronger theoretical foundation, and an improved assignment of items to the different domains. With our EFAs, we contribute to improving the validity of a gender norms scale for adolescents in two LMI countries. A refined gender norms scale could be valuable for developing more effective interventions in Ethiopia and Bangladesh to reduce inequitable gender norms.

Supporting information

S1 Table. Variable names of the GAGE gender norms items in the Ethiopia and Bangladesh datasets. (DOCX)

Bibliography

1. Khan A, Khan S, Khan MA, Zaman K, Khan HUR, Rosman ASB, et al. Economic costs of gender inequality in health and the labor market: India’s untapped potential. Front Public Health. 2023;11:1067940. doi: 10.3389/fpubh.2023.1067940. PMID: 36794076; PMCID: PMC9922756.
2. Weber AM, Cislaghi B, Meausoone V, Abdalla S, Mejía-Guevara I, Loftus P. Gender norms and health: insights from global survey data. The Lancet. 2019;393(10189):2455–68. doi: 10.1016/S0140-6736(19)30765-2. PMID: 31155273.
3. Sedlander E, Dahal M, Bingenheimer JB, Puri MC, Rimal RN, Granovsky R, et al. Adapting and Validating the G-NORM (Gender Norms Scale) in Nepal: An Examination of How Gender Norms Are Associated with Agency and Reproductive Health Outcomes. Stud Fam Plann. 2023;54(1):181–200. doi: 10.1111/sifp.12231. PMID: 36715570.
4. Moreau C, Li M, De Meyer S, Vu Manh L, Guiella G, Acharya R, et al. Measuring gender norms about relationships in early adolescence: Results from the global early adolescent study. SSM Popul Health. 2018;7:014–14. doi: 10.1016/j.ssmph.2018.10.014. PMID: 30581959; PMCID: PMC6293033.
5. Islam A, Anwar Siraji M, Haque M, Salim Chowdhury M. Development of a multidomain gender norm attitude scale for youth in Bangladesh. Prev Med Rep. 2024;45:102848. doi: 10.1016/j.pmedr.2024.102848. PMID: 39205915; PMCID: PMC11350248.
6. Sedlander E, Bingenheimer JB, Long MW, Swain M, Rimal RN. The G-NORM Scale: Development and Validation of a Theory-Based Gender Norms Scale. Sex Roles. 2022;87(5-6):350–63. doi: 10.1007/s11199-022-01319-9. PMID: 36168556; PMCID: PMC9508194. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9508194/
7. Hill AL, Miller E, Switzer GE, Abebe KZ, Chang JC, Pulerwitz J, et al. Gender Equitable Attitudes Among Adolescents: A Validation Study and Associations with Sexual Health Behaviors. Adolesc Res Rev. 2022;7(4):523–36. doi: 10.1007/s40894-021-00171-4. PMID: 38895164; PMCID: PMC11185410.
8. Weziak-Bialowolska D. Differences in Gender Norms Between Countries: Are They Valid? The Issue of Measurement Invariance. Eur J Popul. 2015;31(1):51–76. doi: 10.1007/s10680-014-9329-6. PMID: 25663730; PMCID: PMC4315909.