Dynamics of racial segregation and gentrification in New York City
Felipe G. Operti, André A. Moreira, Andrea Gabrielli, Hernan Makse, José S. Andrade Jr.

TL;DR
This study examines the evolution of racial segregation and gentrification in New York City from 1990 to 2010, revealing patterns of increasing segregation between certain groups and significant displacement linked to gentrification.
Contribution
It provides a detailed analysis of racial segregation dynamics, income inequality, and property value changes, highlighting the spatial displacement associated with gentrification in NYC.
Findings
Segregation between white and Hispanic, and white and Asian increased.
Black-white segregation remained stable.
Gentrified regions showed significant displacement (~1.6 km).

Table 1. Overlap coefficient between the HD zones of each pair of races in New York City in 1990, 2000, and 2010.

| | 1990 | 2000 | 2010 |
|---|---|---|---|
| White and Black | 0.22 | 0.19 | 0.20 |
| White and Hispanic | 0.61 | 0.53 | 0.47 |
| White and Asian | 0.82 | 0.73 | 0.67 |
| Black and Hispanic | 0.52 | 0.52 | 0.61 |
| Black and Asian | 0.27 | 0.24 | 0.26 |
| Hispanic and Asian | 0.58 | 0.48 | 0.29 |
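
Table 2. Area of the four biggest clusters of the HD black zone in 1990 and 2010, and displacement of each cluster's centroid between 1990 and 2010.

| Cluster | Area 1990 | Area 2010 | Displacement 2010-1990 (km) |
|---|---|---|---|
| A | 30.7 | 32.8 | 1.55 |
| B | 38.0 | 54.6 | 0.44 |
| C | 41.8 | 44.1 | 1.57 |
| D | 37.3 | 58.2 | 0.64 |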

Table 3. Dissimilarity index for each pair of races in New York City in 1990, 2000, and 2010.

| | 1990 | 2000 | 2010 |
|---|---|---|---|
| White and Black | 0.81 | 0.80 | 0.79 |
| White and Hispanic | 0.64 | 0.64 | 0.62 |
| White and Asian | 0.47 | 0.50 | 0.51 |
| Black and Hispanic | 0.58 | 0.58 | 0.54 |
| Black and Asian | 0.78 | 0.78 | 0.76 |
| Hispanic and Asian | 0.56 | 0.58 | 0.58 |

Felipe G. Operti¹, André A. Moreira¹, Andrea Gabrielli², Hernan Makse³, and José S. Andrade Jr.¹
¹ Departamento de Física, Campus do Pici, Universidade Federal do Ceará, 60451-970, Fortaleza, Ceará, Brazil
² Istituto dei Sistemi Complessi (ISC) - CNR, UoS Sapienza, Dipartimento di Fisica, Università Sapienza, P.le Aldo Moro 5, 00185, Rome, Italy
³ Levich Institute and Physics Department, City College of New York, 10031, New York, New York, USA
These authors contributed equally to this work.
Abstract
Racial residential segregation is interconnected with several other phenomena such as income inequalities, property value inequalities, and racial disparities in health and in education. Furthermore, recent literature suggests that gentrification is a cause of the perpetuation or increase of racial residential segregation in some American cities. In this paper, we analyze the dynamics of racial residential segregation for white, black, Asian, and Hispanic citizens in New York City in 1990, 2000, and 2010. We observe that segregation between white and Hispanic citizens and between white and Asian citizens has grown, while segregation between white and black citizens is quite stable. Furthermore, we analyze the per capita income and the Gini coefficient in each segregated zone, showing that the highest inequalities occur in the zones where the high-density zones of a pair of races overlap. Focusing on census tracts whose population density changed during these twenty years and, in particular, on the segregation between white and black people, our analysis reveals that a positive flux of white (black) people is associated with a substantial increase (decrease) of the property values, as compared with the city mean. Furthermore, by clustering the region of high density of black citizens, we measure the variation of area and the displacement of the four biggest clusters in the period from 1990 to 2010. The large displacements (approximately 1.6 km) observed for two of these clusters, namely, one in the neighborhood of Harlem and the other inside the borough of Brooklyn, led to the emergence of typically gentrified regions.
Keywords Racial residential segregation; Gentrification; City Clustering Algorithm
Introduction
Although it is not a recent phenomenon, racial residential segregation (RRS) continues to permeate United States metropolitan areas and is still an object of study for scientists from different fields [1, 9, 13, 14, 15, 10, 2, 3, 4, 11, 22, 5, 6, 16, 12, 17, 23, 7, 18, 24, 25, 8, 19, 20, 21]. Whether RRS has decreased in American cities is controversial, and the trend varies drastically from one city to another. Furthermore, it differs according to the races analyzed. For example, several studies show that the segregation between white and black citizens has decreased in the last fifty years [9, 10, 11, 12], whereas segregation between white and Hispanic citizens and between white and Asian citizens has increased [11, 12].
Several indexes have been developed to quantify RRS [1, 13, 14, 15, 16, 17, 18, 19, 20, 21]. The first, and still the most widely used, is the dissimilarity index created by Duncan and Duncan in 1955 [21]. Subsequently, in 1988, Massey and Denton [19] defined five distinct axes of measurement of residential segregation: evenness, exposure, concentration, centralization, and clustering. The authors affirmed that at least five indexes, one for each of these dimensions, are necessary to fully analyze residential segregation. Later, in 2004, Reardon and O’Sullivan developed several measures of multigroup segregation; among them, they consider the Information Theory Index the most conceptually and mathematically satisfactory measure for quantifying residential segregation [17].
RRS is both a cause and an effect of several inequalities. Studies show relations between racial segregation and income inequalities [22] and property value inequalities. Furthermore, RRS causes racial disparities in health and in education [22, 23, 24, 25]. In New York City, for instance, the mortality rates of black citizens vary substantially by locality according to the pattern of racial segregation [25].
In recent years, some studies have also suggested that gentrification is a cause of the perpetuation or even the increase of RRS [26, 27, 28, 29]. Gentrification is defined by The Encyclopedia of Housing [30, 31] as:
The process by which central urban neighborhoods that have undergone disinvestment and economic decline experience a reversal, reinvestment, and the in-migration of a relatively well-off, middle and upper middle-class population.
The main reason for regarding gentrification as a cause of the perpetuation of racial segregation is the presumed displacement of the low-income class, in many cases predominantly black or Hispanic citizens, from their native neighborhoods during the gentrification process [26, 29, 30, 32, 33]. Taking the example of New York City once again, there is an intense debate about the gentrification of regions inside the neighborhood of Harlem and the borough of Brooklyn [34, 35, 36].
The aim of this paper is to study the dynamics of RRS in New York City from 1990 to 2010. We develop a novel method able both to measure RRS and to delimit the segregated zones. Unlike previous measures, our method not only quantifies the phenomenon but also provides a topography of the segregation. Furthermore, in the section Comparison with the Dissimilarity index, we compare our segregation index, the Overlap coefficient, with the dissimilarity index.
With the boundaries of the segregated zones, we analyze the per capita income in each high-density zone of population (defined for each race) and also in the zones of overlap between them. To quantify income inequality, we calculate the Gini coefficient in each zone. Then, we study the variation of the per capita income and of the property values for the census tracts that change zone during these twenty years. Finally, we focus on the segregation between white and black citizens. In particular, we use a simplified version of the City Clustering Algorithm (CCA) [38, 39, 40, 41, 42, 44, 45, 46, 47] to cluster the high-density zone of black citizens and to measure the displacement and the area of the four biggest clusters (one of these clusters includes the neighborhood of Harlem and another is inside the borough of Brooklyn).
The paper is structured as follows: first, we introduce our method. Then, we present the results of applying the method to New York City. Finally, we draw conclusions from the results. In Appendix A we describe how the data were acquired.
Method
The method consists of the following steps. First, we define the limits of the city using the City Clustering Algorithm (CCA) [38, 39, 40, 41, 42, 44, 45, 46, 47]. Second, we find the high-density zones for white, black, Asian, and Hispanic citizens. Finally, we measure the RRS through the Overlap coefficient.
The CCA is an algorithm introduced to define the boundaries of metropolitan areas [38, 39, 40, 41, 42, 44, 45, 46, 47]. Its result depends on two parameters: a population density threshold $D^*$ and a cutoff length $\ell$. The elementary population data are provided at the level of census tracts, which are geographic regions defined by the United States Census Bureau [37] (see Appendix A for more information about the database). For each tract, we have the total area and the total population, given by the sum of the people of each race. Therefore, the population density $D_i$ of each tract $i$ can be calculated. According to the CCA, only the tracts with $D_i > D^*$ are considered populated.
The next step of the algorithm is the clusterization, in which we define the urban center. For each populated tract, we draw a circle of radius $\ell$ centered on the centroid of the tract. All populated tracts whose centroid lies inside the circle belong to the same cluster and, therefore, to the same city. The parameters $D^*$ and $\ell$ are chosen respecting the isometry between area and population of the cities [38, 39, 40]. The algorithm is applied to the entire country and, subsequently, we extract only the cluster corresponding to New York City.
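The two CCA steps described above (density filtering, then grouping by centroid distance) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `cluster_tracts`, the dictionary keys, and the toy coordinates and parameter values are all assumptions, and distances are treated as planar.

```python
from math import hypot

def cluster_tracts(tracts, density_threshold, cutoff_length):
    """Group populated tracts into clusters, CCA-style.

    tracts: list of dicts with hypothetical keys 'x', 'y' (centroid
    coordinates), 'population' and 'area', in consistent units.
    Returns a list of clusters, each a sorted list of tract indices.
    """
    # Step 1: keep only the tracts whose density exceeds the threshold.
    populated = [
        i for i, t in enumerate(tracts)
        if t["population"] / t["area"] > density_threshold
    ]
    unvisited = set(populated)
    clusters = []
    # Step 2: grow each cluster by absorbing populated tracts whose
    # centroid lies within the cutoff length of a member's centroid.
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            current = tracts[frontier.pop()]
            near = [
                j for j in unvisited
                if hypot(tracts[j]["x"] - current["x"],
                         tracts[j]["y"] - current["y"]) <= cutoff_length
            ]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return clusters

# Toy data (made up): three dense tracts and one sparse tract.
example = [
    {"x": 0.0, "y": 0.0, "population": 5000, "area": 1.0},
    {"x": 1.0, "y": 0.0, "population": 4000, "area": 1.0},
    {"x": 9.0, "y": 0.0, "population": 6000, "area": 1.0},
    {"x": 1.5, "y": 0.0, "population": 10, "area": 1.0},
]
clusters = cluster_tracts(example, density_threshold=1000, cutoff_length=2.0)
```

In this toy case the first two tracts fall in one cluster, the third forms its own cluster, and the sparse tract is discarded. Real tract centroids would first need to be projected to planar coordinates.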
Using the CCA to define the urban area of New York City is important because RRS depends strongly on the definition of urban areas [9, 13, 15]. For example, it was shown in [38, 40] that the Metropolitan Statistical Areas (MSA) have large inhabited regions. Instead, the aim of our research is to analyze RRS in a very dense urban area, specifically in New York City.
We define the high-density (HD) zones as regions inside the city with a high population density of a specific race. The HD zone of a specific race $r$ is defined by applying a density threshold $D_r^*$. We consider the tracts with $D_{r,i} > D_r^*$ as populated by that race, where $D_{r,i}$ is the population density of race $r$ in tract $i$. The choice of the parameter $D_r^*$ is made by studying how the fraction of the population of race $r$ inside the HD zone, with respect to the total population of the same race inside the whole city, depends on it. Therefore, for each race $r$, we define a parameter $f_r$ as:
$$f_r(D_r^*) = \frac{\sum_{i \,:\, D_{r,i} > D_r^*} P_{r,i}}{\sum_{i} P_{r,i}}, \tag{1}$$
where $P_{r,i}$ is the population of race $r$ in tract $i$ and the sum in the denominator runs over all the tracts of the city.
To make the analysis as uniform as possible, we choose $D_r^*$ so that both $D_r^*$ and $f_r$ take similar values for all considered races $r$.
In Fig 1 we show how the parameter $f_r$ varies as a function of the parameter $D_r^*$ for each race in New York City. We consider the same fraction of people in three cases using a similar $D_r^*$: when it is next to [math], to [math], and [math]. The first two are trivial; in fact, they show respectively all and none of the population. In the third case, at [math] ([math] of the total population for each race), [math] for each race $r$. The dotted black line in the Figure lies exactly at [math], marking [math] of the total population of each race.
The parameter [math] has been tested in the interval from [math] to [math] without finding significant discrepancies in the results. Therefore, at the end of this step, the method provides well-defined geographic limits of the HD zones for each race.
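As a minimal sketch of the HD zone step, the function below selects the tracts whose density of one race exceeds a threshold and reports the fraction of that race's city-wide population living in them. The function name `hd_zone`, the dictionary layout, and the numbers are illustrative assumptions; the threshold used here is not a value from the paper.

```python
def hd_zone(tracts, race, density_threshold):
    """Return (HD tract indices, fraction of the race's population inside them).

    tracts: list of dicts with a hypothetical 'area' key and one
    population count per race, e.g. {'area': 1.0, 'white': 900}.
    """
    # A tract belongs to the HD zone when the density of the chosen
    # race in that tract exceeds the threshold.
    zone = [
        i for i, t in enumerate(tracts)
        if t[race] / t["area"] > density_threshold
    ]
    total = sum(t[race] for t in tracts)
    inside = sum(tracts[i][race] for i in zone)
    fraction = inside / total if total else 0.0
    return zone, fraction

# Toy data (made up): three tracts with counts for two races.
tracts = [
    {"area": 1.0, "white": 900, "black": 100},
    {"area": 2.0, "white": 200, "black": 1800},
    {"area": 1.0, "white": 100, "black": 100},
]
zone, fraction = hd_zone(tracts, "black", density_threshold=500)
```

Sweeping `density_threshold` over a range of values and recording `fraction` for each race would give curves of the kind the text attributes to Fig 1.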
From the definition of the HD zones, we measure the RRS between two races by computing the shared area (or overlap area) between the two HD zones. We therefore define the Overlap coefficient (or Szymkiewicz-Simpson coefficient [48]) as:
$$O_{rs} = \frac{A_{r \cap s}}{\min(A_r, A_s)}, \tag{2}$$
where $A_r$ and $A_s$ are respectively the HD zone areas of races $r$ and $s$, and $A_{r \cap s}$ is the area shared by the two HD zones. The coefficient $O_{rs}$ is thus the shared area between the HD zone of race $r$ and the HD zone of race $s$ divided by the smaller of the two zone areas. The Overlap coefficient lies between 0 and 1. When it is close to 0 (low overlap), it indicates high segregation, while when it is close to 1 (high overlap), it indicates low segregation (see Table 1).
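The Overlap coefficient can be illustrated with a short sketch. It assumes that each HD zone is a set of whole census tracts, so the shared area is the summed area of the tracts belonging to both zones; the function name, tract identifiers, and areas are made up for illustration.

```python
def overlap_coefficient(zone_a, zone_b, areas):
    """Szymkiewicz-Simpson overlap of two HD zones given as sets of tract ids.

    areas: dict mapping each tract id to its area.
    """
    shared = sum(areas[t] for t in zone_a & zone_b)
    area_a = sum(areas[t] for t in zone_a)
    area_b = sum(areas[t] for t in zone_b)
    smaller = min(area_a, area_b)
    # An empty zone has no meaningful overlap; return 0 by convention.
    return shared / smaller if smaller else 0.0

# Toy data (made up): tract t2 belongs to both zones.
areas = {"t1": 2.0, "t2": 1.0, "t3": 3.0, "t4": 4.0}
white_zone = {"t1", "t2"}
black_zone = {"t2", "t3", "t4"}
coefficient = overlap_coefficient(white_zone, black_zone, areas)
```

Here the shared area is 1.0 and the smaller zone has area 3.0, so the coefficient is one third. Dividing by the smaller area means that a zone fully contained in another gives a coefficient of 1 regardless of how large the other zone is.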
Results
First, we define the limits of New York City by applying the CCA to the population data of 2010 (see Appendix A for more details about the data). Then, we calculate the HD zones for white, black, Asian, and Hispanic citizens for the years 1990, 2000, and 2010. In Fig 2, we show the HD zones for white and black citizens with the respective Overlap zone in the year 2010.
For each pair of races, we calculate the Overlap coefficients; the results are presented in Table 1.
Table 1 shows that the segregation between white and black citizens and between black and Asian citizens remains quite stable during the time interval. Segregation between white and Hispanic, white and Asian, and Hispanic and Asian citizens has increased, whereas the segregation between black and Hispanic citizens has decreased. Black citizens are consistently the most segregated, having a high Overlap coefficient only with Hispanic citizens.
After the definition of the HD zones and the Overlap zones, we calculate the average per capita income of each race inside each zone for the years 1990, 2000, and 2010. The results are presented in Fig 3, where “only” means the HD zone without the Overlap zone. The Figure shows that white citizens earn more than all the other races in all the zones, except in the study of the segregation between white and Asian citizens. Black and Hispanic citizens earn less than whites in all the zones. Moreover, the Figure shows that income inequality between white and black citizens is greater in the Overlap zone than in the white-only zone and the black-only zone.
To study the per capita income inequalities in each study of segregation (white and black, white and Hispanic, and white and Asian), we calculate the Gini coefficient [49] inside each zone. The results are presented in Fig 4. The Gini coefficient varies from 0 to 1. When it is close to 0, there is no inequality, while when it is close to 1, inequality is maximal [49]. The Figure shows that, in all cases, inequality is greater in the Overlap zones, in favor of whites.
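For readers unfamiliar with the Gini coefficient, the sketch below computes it for a plain list of incomes using the standard rank-weighted formula. The text does not state how incomes were weighted or aggregated within a zone, so this unweighted version and its sample values are assumptions for illustration only.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with ranks i = 1..n over the values sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Toy data (made up): identical incomes versus one earner holding everything.
equal_incomes = gini([30000, 30000, 30000, 30000])
concentrated = gini([0, 0, 0, 120000])
```

With a finite sample of `n` values this formula reaches at most `(n - 1) / n`, which is why the fully concentrated toy example gives 0.75 rather than 1.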
Furthermore, we analyze the tracts that migrated from one zone to another from 1990 to 2010 for the studies of segregation between white and black citizens (Fig 5), white and Asian citizens (Fig 6), and white and Hispanic citizens (Fig 7). The colors in the maps in Figs 5, 6, and 7 show the possible migrations of the tracts from one zone to another, which are described in the captions. For each alternative, we calculate the average variation of the per capita income ($\Delta I$) and the average variation of the property values ($\Delta V$), normalized by the average variation in the city ($\langle \Delta I \rangle$ and $\langle \Delta V \rangle$), from 1990 to 2010. The variations are defined as:
$$\Delta I = \frac{1}{N}\sum_{i=1}^{N} \frac{\Delta I_i - \langle \Delta I \rangle}{\langle \Delta I \rangle} \tag{3}$$
and
$$\Delta V = \frac{1}{N}\sum_{i=1}^{N} \frac{\Delta V_i - \langle \Delta V \rangle}{\langle \Delta V \rangle}, \tag{4}$$
where $N$ is the number of tracts of the analyzed pair of races and $\Delta I_i$ and $\Delta V_i$ are the variations of the per capita income and of the property values of tract $i$, respectively. Therefore, a positive $\Delta I$ or $\Delta V$ means growth higher than the city mean, while, conversely, a negative $\Delta I$ or $\Delta V$ means growth lower than the city mean.
Moreover, we focus on the segregation between white and black citizens and on the flux of people from 1990 to 2010 inside the tracts that migrated from one zone to another or to the Overlap zone. The flux of people of a specific race $r$ inside a tract $i$ is the variation $\Delta P_{r,i}$ of the number of people of that race inside the tract, compared with the mean variation of that race in the whole city. Similarly to Eq 3 and 4, the average flux is defined as:
$$F_r = \frac{1}{N}\sum_{i=1}^{N} \frac{\Delta P_{r,i} - \langle \Delta P_r \rangle}{\langle \Delta P_r \rangle}, \tag{5}$$
where $N$ is the number of tracts considered and $\langle \Delta P_r \rangle$ is the mean flux of race $r$ in the whole city.
In Fig 8, still focusing on the segregation between white and black citizens, we show the variation of income, the variation of property values, and the flux of people in the tracts that change zone between the years 1990 and 2010.
For those tracts, in Fig 9 we compare the flux of white and black citizens with the variation of the property values. In Fig 9a, the outgoing white flux is shown in orange, with a red square marking its centroid, and the incoming white flux is shown in blue, with a black circle marking its centroid. In Fig 9b, the outgoing black flux is shown in green, with a red square marking its centroid, and the incoming black flux is shown in red, with a black circle marking its centroid. The figures show that the property values increase more than the mean where the flux of white citizens is on average positive, as well as where the flux of black citizens is on average negative.
To investigate the dynamics and the displacement of black citizens in New York City, we study the HD black zone. With a simplified version of the CCA, we divide the HD black zone into clusters: we ignore the population density threshold and apply only the cutoff length. The cutoff length is chosen by analyzing the distribution of the tract areas. Each tract is treated as a circle with the same area. The mean radius has been found to be [math]; therefore, in order to consider two neighboring tracts as part of the same cluster, we use [math]. The results of the clusterization for the years 1990 and 2010 are shown in Fig 10. In the Figure, we highlight the four biggest clusters A, B, C, and D.
For the four biggest clusters (A, B, C, and D), Table 2 shows the area of each of them in 1990 and 2010 and the displacement of each cluster’s centroid, highlighting the fact that clusters A and C have a displacement about three times larger than clusters B and D. In Fig 11, we show the displacement of clusters A and C from 1990 to 2010. Cluster A includes a region in the neighborhood of Harlem, while cluster C is inside the borough of Brooklyn. In the same Figure, we also show the variation of the per capita income for the tracts that change zone in the analyzed period.
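The displacement of a cluster's centroid can be sketched as the distance between the centroids of the same cluster in two years. The text does not say how the centroid of a cluster is computed, so the area weighting used here is an assumption, as are the function names, the planar coordinates, and the toy tracts.

```python
from math import hypot

def area_weighted_centroid(tracts):
    """Centroid of a cluster, weighting each tract centroid by tract area."""
    total_area = sum(t["area"] for t in tracts)
    x = sum(t["x"] * t["area"] for t in tracts) / total_area
    y = sum(t["y"] * t["area"] for t in tracts) / total_area
    return x, y

def centroid_displacement(tracts_start, tracts_end):
    """Distance between the cluster centroids of two snapshots."""
    x0, y0 = area_weighted_centroid(tracts_start)
    x1, y1 = area_weighted_centroid(tracts_end)
    return hypot(x1 - x0, y1 - y0)

# Toy data (made up): the cluster loses its western tract and gains an eastern one.
cluster_1990 = [
    {"x": 0.0, "y": 0.0, "area": 1.0},
    {"x": 2.0, "y": 0.0, "area": 1.0},
]
cluster_2010 = [
    {"x": 2.0, "y": 0.0, "area": 1.0},
    {"x": 4.0, "y": 0.0, "area": 1.0},
]
shift = centroid_displacement(cluster_1990, cluster_2010)
```

In this example the centroid moves from x = 1 to x = 3, a displacement of 2 in the units of the coordinates.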
Comparison with the Dissimilarity index
In order to verify the robustness of our method, we compare the Overlap coefficient defined in Eq 2 with the dissimilarity index [21]:
$$\mathrm{DI}_{rs} = \frac{1}{2}\sum_{i=1}^{N}\left|\frac{P_{r,i}}{P_r} - \frac{P_{s,i}}{P_s}\right|, \tag{6}$$
where $P_{r,i}$ is the population of race $r$ in tract $i$ and $P_{s,i}$ the population of race $s$ in the same tract. $P_r$ and $P_s$ are the total populations of races $r$ and $s$ in the whole city, where the city is defined using the CCA, and the sum runs over all the $N$ tracts that belong to New York City. The value of $\mathrm{DI}_{rs}$ varies from 0 to 1. When it is close to 1, RRS is high and, vice versa, when it is close to 0, there is no segregation. It gives the fraction of one of the two populations that would have to move in order to reduce segregation to 0 [21]. The results obtained for New York City are shown in Table 3.
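The dissimilarity index is simple to compute from per-tract counts. The sketch below follows the standard Duncan and Duncan definition; the function name and the toy populations are illustrative assumptions.

```python
def dissimilarity_index(pop_a, pop_b):
    """Dissimilarity index between two groups.

    pop_a, pop_b: per-tract population counts of the two groups,
    listed in the same tract order.
    """
    total_a, total_b = sum(pop_a), sum(pop_b)
    # Half the summed absolute difference between the two groups'
    # shares of their own city-wide totals in each tract.
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(pop_a, pop_b)
    )

# Toy data (made up): complete separation versus identical distributions.
fully_segregated = dissimilarity_index([100, 0], [0, 100])
evenly_mixed = dissimilarity_index([50, 50], [10, 10])
```

The two toy cases hit the extremes of the index: 1 when the groups never share a tract, and 0 when both groups are spread across tracts in the same proportions, even if their total sizes differ.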
To analyze the correlation between the two indexes, in Fig 12 we plot the dissimilarity indexes found in New York City as a function of their respective Overlap coefficients (computed from the HD zones of the same pair of races). The red line in the Figure shows the result of the ordinary least squares (OLS) fit. As expected, the relation is inverse, with a linear coefficient [math]. To quantify the correlation between the two indexes, we calculate the Pearson correlation coefficient (PCC), [math]. Its value implies a strong inverse correlation between the two indexes, proving the robustness of our method.
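The comparison can be reproduced approximately from the values listed in Tables 1 and 3. The sketch below pairs the eighteen Overlap coefficients with the eighteen dissimilarity indexes (six pairs of races, three years) and computes the Pearson coefficient and the OLS slope by hand. Whether Fig 12 uses exactly these eighteen points is an assumption, so the resulting numbers should not be read as the values reported by the authors.

```python
from math import sqrt

# Values transcribed from Table 1 (Overlap) and Table 3 (dissimilarity),
# ordered by year (1990, 2000, 2010) and, within each year, by pair:
# WB, WH, WA, BH, BA, HA.
overlap = [
    0.22, 0.61, 0.82, 0.52, 0.27, 0.58,
    0.19, 0.53, 0.73, 0.52, 0.24, 0.48,
    0.20, 0.47, 0.67, 0.61, 0.26, 0.29,
]
dissimilarity = [
    0.81, 0.64, 0.47, 0.58, 0.78, 0.56,
    0.80, 0.64, 0.50, 0.58, 0.78, 0.58,
    0.79, 0.62, 0.51, 0.54, 0.76, 0.58,
]

def pearson_and_slope(x, y):
    """Return (Pearson correlation, OLS slope of y regressed on x)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy), sxy / sxx

r, slope = pearson_and_slope(overlap, dissimilarity)
print(f"Pearson r = {r:.2f}, OLS slope = {slope:.2f}")
```

Both quantities come out negative for these data, consistent with the inverse relation described in the text.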
Discussion
We developed a new method to measure RRS and to define its topography, and we applied it to the metropolitan area of New York City for the years 1990, 2000, and 2010. Although several studies show that, on average, segregation between white and black citizens in the United States has decreased in the last fifty years [9, 10, 11, 12], our results show that in the metropolitan area of New York City it remained quite stable during the time interval 1990-2010, as did the segregation between black and Asian citizens. Instead, segregation between white and Hispanic, white and Asian, and Hispanic and Asian citizens has grown. Only black and Hispanic citizens are less segregated in 2010 than in 1990.
By analyzing the per capita income, we observe that white citizens earn more than the other races in all the regions, except in the analysis of the segregation between white and Asian citizens, where Asian citizens have an income similar to that of white citizens. Regarding the segregation between white and black citizens, we verify that black citizens earn less than white citizens in all the regions. Furthermore, the inequality between white and black citizens is greater in the regions of high population density of both races. This result is confirmed by the Gini coefficient, which is higher in the regions of high population density of two or more races.
Furthermore, we study the segregation between white and black citizens and between white and Hispanic citizens. We analyze the tracts that changed, from 1990 to 2010, from a region of high density of black or Hispanic citizens, or of overlap with white citizens, to a region of high density of white citizens only. In these tracts, we observe that the per capita income and the property values increased more than the city mean. Conversely, in the tracts that migrated from a region of overlap to a region with high population density of only black or Hispanic citizens, we observe that the per capita income and the property values increased less than the city mean.
Focusing on the segregation between white and black citizens, we analyze the flux of white and black citizens in relation to the variation of the property values. Where the flux of white citizens is positive, the property values increased more than the city mean, while, where the flux of black citizens is positive, the property values increased less than the city mean.
Previous studies [34, 35, 36] questioned the effects of gentrification in the neighborhood of Harlem and in the borough of Brooklyn. Here, by clustering the region of high density of black citizens, we show the displacement of the clusters defined as A (which includes a region inside the neighborhood of Harlem) and C (which is inside the borough of Brooklyn). The displacements are 1.55 km and 1.57 km, respectively, in twenty years (Table 2). This result confirms the theory of displacement of black citizens in the neighborhood of Harlem and in the borough of Brooklyn.
Acknowledgments
We gratefully acknowledge CNPq, CAPES, FUNCAP and the National Institute of Science and Technology for Complex Systems in Brazil for the financial support. We especially thank our colleagues and friends of the Complex System group of the Universidade Federal do Ceará for the countless discussions. We thank Samuel Morais da Silva and Saulo D. S. Reis for the valuable discussions.
Appendix A
Dataset
All the data used in this paper were extracted from the National Historical Geographic Information System (NHGIS) [50]. The platform provides population, housing, agricultural, and economic data with GIS-compatible boundary files for geographic units in the United States from 1790 to the present. From the platform, we extracted population data by race, per capita income data, and the number of owner-occupied housing units by value.
Population dataset (TABLE CW7 Persons by Hispanics or Latino origin by race). The data provides the number of people for each race for the years of 1990, 2000, and 2010 divided by Hispanic or Latino and Not Hispanic or Latino. We consider white as Not Hispanic or Latino: white (single race), black as Not Hispanic or Latino: black or African American (single race), Asian as Not Hispanic or Latino: Asian or Pacific Islander (single race), and Hispanic as Hispanic or Latino: white (single race) plus Hispanic or Latino: black or African American (single race) plus Hispanic or Latino: Asian and Pacific Islander (single race). The data table is downloadable with the respective GIS-compatible boundary file formed by census tracts standardized to the 2010 census [37].
Per capita income dataset (BD5 Per capita Income in the Previous Year): The data provide the average per capita income of each American census tract in the year preceding 1980, 1990, and 2000, and in the period between 2008 and 2012. The values are not adjusted for inflation.
Property values dataset (NH23 Specified owner-occupied housing units and B25075 Owner-occupied housing units): The property values data are divided into two tables: table NH23, for the year 1990, and table B25075, for the years between 2006 and 2010. The tables provide the number of houses in each price range. The price ranges, from zero dollars to infinity, are divided into twenty ranges in table NH23 and twenty-four ranges in table B25075. For each tract, the weighted arithmetic mean of the property values has been calculated. Table B25075 is provided in 2012 census tracts and is consistent with the population data and the per capita income data, whereas table NH23 is provided in 1990 tracts. Therefore, through a superimposing process, the 1990 data were recomposed in the 2012 census tracts. The superimposing process consists of assigning all the properties in a 1990 census tract whose centroid lies in a 2012 census tract to that 2012 census tract.
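The weighted arithmetic mean of binned property values can be sketched as below. The text does not say which value represents each price range or how the open-ended top range is handled, so the representative values passed in here are hypothetical, as are the function name and the counts.

```python
def mean_property_value(representative_values, counts):
    """Weighted mean of property values for one tract.

    representative_values: one value per price range (an assumption,
    for example the midpoint of each range).
    counts: number of housing units in each price range.
    Returns None when the tract has no counted units.
    """
    total_units = sum(counts)
    if total_units == 0:
        return None
    weighted_sum = sum(v * c for v, c in zip(representative_values, counts))
    return weighted_sum / total_units

# Toy data (made up): two price ranges represented by 50,000 and 150,000.
tract_mean = mean_property_value([50000, 150000], [3, 1])
```

With three units in the lower range and one in the upper range, the tract mean is 75,000. For the superimposing step, the counts of every 1990 tract assigned to a 2012 tract would be summed per price range before taking this mean.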
References
- [1] Chodrow P S. Structure and information in spatial segregation. Proceedings of the National Academy of Sciences. 2017. 114 (44): 11591-11597. doi: 10.1073/pnas.1708201114.
- [2] Lichter D T, Parisi D, and Taquino M C. Toward a New Macro-Segregation? Decomposing Segregation within and between Metropolitan Cities and Suburbs. American Sociological Review. 2015. 80 (4): 843-873. doi: 10.1177/0003122415588558.
- [3] Fowler C S. Segregation as a multiscalar phenomenon and its implications for neighborhood-scale research: the case of South Seattle 1990–2010. Urban Geography. 2016. 37 (1): 1-25. doi: 10.1080/02723638.2015.1043775.
- [4] Boustan L P. Racial Residential Segregation in American Cities. The Oxford Handbook of Urban Economics and Planning. 2012. doi: 10.1093/oxfordhb/9780195380620.013.0015.
- [5] Reardon S F, Farrell C R, Matthews S A, O’Sullivan D, Bischoff K, and Firebaugh G. Race and space in the 1990s: Changes in the geographic scale of racial residential segregation, 1990–2000. Social Science Research. 2008. 38: 55-70. doi: 10.1016/j.ssresearch.2008.10.002.
- [6] Reardon S F, Matthews S A, O’Sullivan D, Lee B, Firebaugh G, Farrell C R, and Bischoff K. The geographic scale of metropolitan racial segregation. Demography. 2008. 45 (3): 489-514. doi: 10.1353/dem.0.0019.
- [7] Charles C Z. The dynamics of Racial Residential Segregation. Annual Review of Sociology. 2003. 29: 167-207. doi: 10.1146/annurev.soc.29.010202.100002.
- [8] Massey D S and Denton N A. American Apartheid. Harvard Univ Pr. ISBN-10: 0674018214.
