PurpleAir Sensor Deployment Trends and Uncertainties
Chloe S. Chung, Annette C. Rohr

TL;DR
This study examines where and how long PurpleAir air quality sensors are deployed across the U.S., showing their potential for long-term air quality monitoring and analysis.
Contribution
The study provides the first national assessment of PurpleAir sensor deployment longevity and geographic distribution from 2016 to 2025.
Findings
Most publicly shared PurpleAir sensors remained deployed for over three years, especially in the western U.S.
Sensor density and longevity were highest in the western U.S., supporting multi-year exposure assessments.
Descriptive summaries of PM2.5 in four states demonstrate the utility of these networks for urban–rural comparisons.
Abstract
Low-cost air quality sensors, such as PurpleAir monitors, have rapidly expanded fine particulate matter (PM2.5) monitoring across the United States, providing dense, hyper-local measurements. While prior research has focused largely on sensor accuracy and calibration, less is known about where these sensors are deployed and whether they persist long enough to support multi-year analyses relevant to exposure assessment and policy. Using publicly available PurpleAir data, we characterized the geographic distribution, deployment longevity, and persistence of outdoor sensors across the United States from 2016 to 2025. We quantified deployment duration as the time between first and last publicly available observations and summarized patterns nationally, by U.S. Census region, and by state. Most publicly shared sensors remained deployed for more than three years, indicating substantial…
Figure 1
Figure 2
Figure 3
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Air Quality and Health Impacts · Mobile Crowdsensing and Crowdsourcing
1. Background
Low-cost air quality sensors, such as those produced by PurpleAir, have become widely deployed across the United States (U.S.) and internationally. These sensors provide real-time, hyper-local measurements of fine particulate matter (PM2.5), substantially expanding spatial coverage beyond regulatory monitoring networks. A growing body of validation studies has evaluated PurpleAir sensor performance relative to reference-grade monitors, demonstrating moderate-to-strong agreement after calibration while identifying systematic biases related to environmental conditions. Specifically, PurpleAir sensors rely on optical particle counters that are sensitive to relative humidity, temperature, and aerosol composition, often overestimating PM2.5 during wildfire smoke or high-humidity conditions; accordingly, laboratory-based, region-specific, and national corrections—including the U.S. EPA’s equation—have been developed [1,2,3] to address those issues. Beyond technical validation, these networks increasingly inform community engagement and situational awareness: a community-led effort in Waterbury, Connecticut, found strong agreement with reference monitors and supported local action [4], and the U.S. EPA integrates PurpleAir data into the national Fire and Smoke Map to inform real-time decisions during smoke events [5].
Despite these benefits, several constraints limit direct application to exposure assessment and policy. Deployments often lack standardized oversight and technical support; uncalibrated data can mislead users, and sensor response may become nonlinear at very high concentrations (>300 µg/m^3^), complicating interpretation during severe events [6]. Many users also lack resources to deploy and interpret sensors appropriately, and common standards for reporting and metadata are still emerging [7,8]. Beyond performance, access is uneven: sensors cluster in higher-income, higher-education census tracts, leaving disadvantaged communities with fewer monitors despite higher burdens [9,10,11]. A key open question is whether community deployments persist long enough to support long-term, regulatory-relevant analyses, which typically rely on multi-year (e.g., three-year) design values.
This study addresses this gap by quantifying the longevity of sensor operation, geographic coverage, and multi-year persistence of publicly shared PurpleAir sensors across the United States. In doing so, it complements prior work focused on sensor validation and calibration by providing a national-scale characterization of where low-cost sensors are deployed and how consistently they operate over time—information that is foundational for exposure modeling and policy-relevant applications.
Rather than focusing on concentration contrasts, the primary objective of this work is to establish this baseline understanding of sensor availability and temporal continuity, which is a necessary precursor for interpreting pollution patterns captured by these networks. As an illustrative component, we present PM_2.5_ summaries for selected states with high sensor density to demonstrate the types of analyses enabled by such deployment patterns. These summaries are intended to be descriptive and illustrative, rather than comprehensive exposure assessments, and are included to contextualize the deployment findings and inform future, dedicated air quality analyses using low-cost air quality sensor data.
2. Methods
2.1. Data Source and Fields
We obtained publicly available PurpleAir PM_2.5_ data via the PurpleAirAPI package in R. For each sensor, we extracted latitude, longitude, date_created, last_seen, location_type, humidity, temperature, pressure, and pm2.5_cf_1 (Table 1). We restricted the dataset to outdoor sensors (based on location_type) with valid geocoordinates.
To improve comparability across locations, we analyzed data from 2021 to 2024 (a period with broader network penetration).
2.2. Deployment Longevity Metric and Categorization
Our objective was to quantify operational deployment tenure—i.e., how long PurpleAir sensors remain deployed and discoverable on the public network—to assess suitability for multi-year use cases. We defined deployment duration for each outdoor device as Δt = last_seen−date_created from the PurpleAir API and classified sensors into <1 year, 1–2 years, 2–3 years, and >3 years categories. By design, this metric reflects the calendar data-presence window (first-to-last valid observation) and not verified continuous operation or sensor health; we did not infer hourly duty cycle, fill gaps, or relink moved/replaced devices. Results should therefore be interpreted as deployment tenure on the public network.
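The tenure metric and its four categories can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: it assumes `date_created` and `last_seen` arrive as Unix timestamps in seconds, converts seconds to years with a 365.25-day year, and treats the category boundaries as half-open intervals (a sensor at exactly three years falls in ">3 years"). None of these choices is stated in the text, and the function names are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: one year = 365.25 days; the paper does not state its convention.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def tenure_years(date_created: int, last_seen: int) -> float:
    """Deployment tenure (last_seen - date_created) in years.

    Both arguments are assumed to be Unix timestamps in seconds.
    """
    return (last_seen - date_created) / SECONDS_PER_YEAR


def tenure_category(years: float) -> str:
    """Map a tenure in years onto the four categories used in the paper.

    Boundary handling (half-open intervals) is an assumption.
    """
    if years < 1:
        return "<1 year"
    if years < 2:
        return "1-2 years"
    if years < 3:
        return "2-3 years"
    return ">3 years"


def to_unix(year: int, month: int, day: int) -> int:
    """Helper for the example: UTC calendar date to Unix seconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Hypothetical sensors: (date_created, last_seen)
example_sensors = {
    "sensor_a": (to_unix(2020, 1, 1), to_unix(2020, 6, 1)),
    "sensor_b": (to_unix(2019, 1, 1), to_unix(2020, 6, 1)),
    "sensor_c": (to_unix(2018, 1, 1), to_unix(2024, 1, 1)),
}

example_categories = {
    name: tenure_category(tenure_years(created, seen))
    for name, (created, seen) in example_sensors.items()
}
```

Because the metric uses only the first and last timestamps, a sensor that was offline for most of the window receives the same category as one that reported continuously, which is the limitation the paragraph above describes.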
2.3. Regional Stratification and State-Level PM2.5 Characterization
We summarized deployment across U.S. Census regions (Northeast, South, Midwest, West) and by state. For illustrative PM_2.5_ characterization, we examined one state per region—the state with the largest number of outdoor sensors—to ensure sufficient sample size. This design provides high-coverage exemplars rather than region-wide representativeness; therefore, state-level PM_2.5_ summaries are presented descriptively, and we refrain from generalizing those distributions to entire regions. Our national conclusions about deployment longevity are based on the full dataset across all states.
To ensure data quality, we removed negative PM2.5 observations and values > 100 µg/m^3^; concentrations above ~100 µg/m^3^ in the U.S. are typically wildfire-smoke episodes rather than baseline ambient conditions (many locations now experience at least one such day per year), and our deployment-focused comparisons therefore target typical ambient ranges rather than event-driven extremes. Additionally, very high smoke concentrations fall in known nonlinear response regimes for PurpleAir sensors, which would require specialized treatment beyond this study’s scope [12,13].
PM_2.5_ concentrations were corrected using the U.S.-wide calibration equation developed by Barkjohn et al., 2021 [6]:
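The equation itself does not appear in this copy of the text. As published in Barkjohn et al. [6], the U.S.-wide correction is generally given as PM2.5 = 0.524 × PA_cf_1 − 0.0862 × RH + 5.75, where PA_cf_1 is the sensor's pm2.5_cf_1 reading (µg/m^3^) and RH is relative humidity (%). The coefficients below are quoted from that publication rather than from this paper and should be checked against [6] before use. The sketch also assumes the quality filter (negative values and values above 100 µg/m^3^ removed) is applied to raw readings before correction; the text does not specify the order, and the function names are hypothetical.

```python
def barkjohn_correction(pm25_cf_1: float, rh: float) -> float:
    """US-wide PurpleAir correction as published by Barkjohn et al. (2021).

    Coefficients are quoted from the cited paper, not from this text.
    pm25_cf_1: raw pm2.5_cf_1 reading in ug/m3
    rh: relative humidity in percent
    """
    return 0.524 * pm25_cf_1 - 0.0862 * rh + 5.75


def clean_and_correct(records, upper=100.0):
    """Drop implausible raw readings, then apply the correction.

    records: iterable of (pm25_cf_1, rh) pairs; None marks a missing value.
    Assumption: filtering happens on raw values, before correction.
    """
    corrected = []
    for pm, rh in records:
        if pm is None or rh is None:
            continue  # cannot correct without both inputs
        if pm < 0 or pm > upper:
            continue  # negative or above the 100 ug/m3 cutoff
        corrected.append(barkjohn_correction(pm, rh))
    return corrected


# Hypothetical readings: one valid, one negative, one above the cutoff,
# one with missing PM.
example_records = [(20.0, 50.0), (-1.0, 40.0), (150.0, 30.0), (None, 20.0)]
example_corrected = clean_and_correct(example_records)
```

Only the first hypothetical record survives the filter; the other three are dropped before any correction is attempted.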
To classify monitoring sites as urban or rural, we used the U.S. Census Bureau’s 2018 Urbanized Areas shapefile. This dataset delineates urbanized areas (population ≥ 50,000) and urban clusters (population 2500–49,999) at a national scale. We imported the shapefile into R and spatially joined it with sensor coordinates. Sensors located within an urbanized area or urban cluster polygon were categorized as “urban,” while those outside these boundaries were classified as “rural.”
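The urban/rural assignment is a point-in-polygon test repeated for every sensor. The sketch below illustrates the idea in plain Python with a ray-casting check. It is a simplified stand-in for the shapefile-based spatial join the authors ran in R: polygons are hypothetical lists of (longitude, latitude) vertices, coordinates are treated as planar, and holes and multi-part polygons are ignored.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon: list of (lon, lat) vertices, in order, not closed.
    Coordinates are treated as planar, which is a simplification.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def classify_site(lon, lat, urban_polygons):
    """Label a sensor 'urban' if it falls in any urban polygon, else 'rural'."""
    for polygon in urban_polygons:
        if point_in_polygon(lon, lat, polygon):
            return "urban"
    return "rural"


# Hypothetical square "urban area" and two hypothetical sensors
example_urban_areas = [[(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]]
example_labels = {
    "inside_sensor": classify_site(1.0, 1.0, example_urban_areas),
    "outside_sensor": classify_site(3.0, 1.0, example_urban_areas),
}
```

For a national dataset, a spatial-index-backed join in a GIS library would be the practical choice; the loop above only shows the logic.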
3. Results
3.1. Analytic Sample
Publicly available data spanning 2016–2025 were initially downloaded for the U.S. The raw dataset contained n = 28,267 observations. After excluding sensors lacking location information, the dataset comprised n = 28,240 observations. Restricting the dataset to outdoor sensors reduced the total to n = 21,592. After removing sensors with missing geocoordinate information, the sample size was n = 17,511. Finally, removing observations for Puerto Rico and U.S. Virgin Islands yielded a final analytic dataset of n = 17,474 observations.
3.2. Geographic Distribution of PM2.5 Sensors
Figure 1 displays the spatial distribution of publicly shared PurpleAir PM_2.5_ sensors (purple points) and U.S. EPA Federal Reference Method (FRM)/Federal Equivalent Method (FEM) PM_2.5_ monitors (blue triangles) across the United States. The figure contrasts the dense, crowdsourced coverage of PurpleAir with the sparser regulatory-grade FRM/FEM network. This map provides context for our deployment-longevity analysis and the state-level illustrative comparisons presented later in the manuscript.
3.3. Deployment Longevity by Region and State
Sensor deployment longevity varied substantially across U.S. Census regions between 2016 and 2025 (Figure 2). The West region exhibited by far the highest number of PurpleAir sensors, exceeding 12,000 units, with the majority deployed for more than three years. In contrast, the Northeast and Midwest regions had substantially lower sensor counts—each with fewer than 2000 sensors overall—and comparatively more even distributions across longevity categories. The South showed slightly higher deployment than the Northeast and Midwest, with a noticeable share of sensors operating beyond three years but still far fewer than in the West. Across all regions, sensors deployed for more than three years represented the largest longevity category, while short-term deployments lasting less than one year were consistently the smallest group. These patterns indicate that long-term sensor retention is common nationwide, particularly in the West, potentially reflecting stronger adoption, sustained maintenance, and higher levels of community engagement with air quality monitoring in that region.
Table 2 summarizes sensor distribution across U.S. states by deployment duration. Nationwide, most sensors have been active for more than three years (n = 10,347; 59.2%), followed by deployments of 1–2 years (n = 2627; 15.0%), 2–3 years (n = 2491; 14.3%), and less than one year (n = 2009; 11.5%). California, Washington, and Oregon account for the largest sensor counts (n = 7138; 1604; and 1064, respectively), with a strong majority in each exceeding three years of operation. In contrast, states such as Mississippi (n = 10), Rhode Island (n = 29), and South Dakota (n = 23) have the fewest sensors, with deployments more evenly distributed across categories. Overall, sensor coverage and longevity vary widely, highlighting regions with extensive long-term monitoring alongside areas with limited adoption.
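As a quick consistency check, the four national category counts quoted above should sum to the analytic sample (n = 17,474) and reproduce the stated percentages. The snippet uses only numbers from the text; the variable names are illustrative.

```python
# National counts by deployment duration, as quoted in the text
national_counts = {
    ">3 years": 10347,
    "1-2 years": 2627,
    "2-3 years": 2491,
    "<1 year": 2009,
}

# Total should match the final analytic sample reported in Section 3.1
national_total = sum(national_counts.values())

# Share of each category, in percent, rounded to one decimal place
national_shares = {
    category: round(100 * count / national_total, 1)
    for category, count in national_counts.items()
}
```

The counts and percentages are internally consistent with the final analytic sample.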
Sensor longevity patterns differ markedly by region (Figure 3). In the Northeast, Pennsylvania leads with over 700 sensors, most deployed for more than three years, while Massachusetts and New York also show high counts dominated by long-term installations. Vermont, New Hampshire, and Rhode Island have far fewer sensors and shorter deployments. In the Midwest, counts range from sparse networks in South Dakota and North Dakota to larger deployments in Illinois, Ohio, Michigan, and Minnesota, where long-term sensors predominate. The South exhibits moderate adoption with substantial variation: Texas and North Carolina have the highest counts, followed by Florida and Virginia, while many states maintain small networks with few long-duration sensors. The West shows the greatest disparity, driven by California’s more than 7000 sensors—most active for over three years—alongside sizable networks in Washington and Oregon and smaller deployments elsewhere. Across all regions, sensors deployed for more than three years consistently represent the largest category, indicating strong retention once installed.
3.4. Illustrative PM2.5 Summaries by Deployment Duration and Urban–Rural Classification in Four Exemplar States
Mean PM_2.5_ concentrations varied considerably by sensor deployment duration, though patterns differed across states (Table 3). California exhibited relatively consistent concentrations across all deployment durations (6.25–7.58 µg/m^3^), with sensors operating <1 year recording the highest mean (7.58 µg/m^3^) and maximum (23.7 µg/m^3^) values. Minnesota showed a different pattern, with sensors operating for 2–3 years exhibiting the highest mean concentration (8.01 µg/m^3^) compared to both shorter (<1 year: 6.99 µg/m^3^) and longer (>3 years: 7.61 µg/m^3^) deployment durations.
Pennsylvania showed a clear decreasing trend, with mean concentrations declining from 9.73 µg/m^3^ in sensors operating <1 year to 6.57 µg/m^3^ in those operating >3 years, a 32% reduction. Texas exhibited both the highest absolute PM_2.5_ levels and the most pronounced duration-related pattern among all states analyzed. Sensors operating <1 year recorded markedly elevated concentrations (mean: 12.7 µg/m^3^; range: 10.1–18.2 µg/m^3^), nearly double the mean observed in sensors operating >3 years (6.39 µg/m^3^). This substantial variability across deployment durations likely reflects either deployment in response to episodic pollution events or capture of localized emission sources during the monitoring period.
Urban–rural differences in annual average PM_2.5_ concentrations revealed distinct state-specific patterns (Table 4). California and Minnesota showed relatively modest urban–rural differences in mean concentrations. In California, urban sensors recorded slightly higher means than rural sensors (6.91 vs. 6.25 µg/m^3^), with urban sites also exhibiting higher maximum values (16.80 vs. 14.90 µg/m^3^). Minnesota displayed similar mean concentrations between urban and rural locations (7.30 vs. 6.99 µg/m^3^), though urban sites recorded substantially higher maximum values (10.60 vs. 8.08 µg/m^3^).
In contrast, Pennsylvania and Texas exhibited elevated rural PM_2.5_ concentrations. Pennsylvania’s rural sensors reported a mean of 9.73 µg/m^3^ compared to 7.84 µg/m^3^ in urban areas, a 24% difference suggesting potential localized sources outside major metropolitan areas. Texas displayed the most pronounced urban–rural disparity among all states examined: rural sensors averaged 12.70 µg/m^3^—62% higher than the urban mean of 7.86 µg/m^3^—and reached a maximum of 18.20 µg/m^3^. The consistently elevated rural concentrations in Texas, combined with the narrow range observed in urban areas (4.28–10.20 µg/m^3^), suggest persistent rural pollution sources distinct from typical urban emission patterns.
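The relative differences quoted for Pennsylvania and Texas depend on which mean serves as the reference. The sketch below assumes the urban mean is the denominator for the urban–rural contrasts and the <1 year mean is the denominator for the duration contrast. Under those assumptions the reported 24%, 62%, and 32% figures are reproduced; the helper name is hypothetical.

```python
def pct_diff(value: float, reference: float) -> float:
    """Percent difference of value relative to reference."""
    return 100 * (value - reference) / reference


# Urban-rural contrasts (rural mean vs. urban mean, ug/m3), from the text
pa_rural_vs_urban = pct_diff(9.73, 7.84)
tx_rural_vs_urban = pct_diff(12.70, 7.86)

# Duration contrast for Pennsylvania (>3 years vs. <1 year), from the text
pa_long_vs_short = pct_diff(6.57, 9.73)

# Texas ratio of <1 year mean to >3 years mean ("nearly double")
tx_short_to_long_ratio = 12.7 / 6.39
```

If the rural mean were used as the reference instead, the Pennsylvania urban–rural contrast would come out closer to 19%, so the choice of denominator is worth stating whenever such percentages are reported.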
We explored the feasibility of comparing PurpleAir sensor data with EPA monitoring data by identifying sensors that had collected data for at least three years with a completion rate above 75%, a threshold chosen to meet U.S. EPA data completeness standards—meaning the sensor recorded at least 75% of its expected hourly or daily measurements over the period—ensuring that the PurpleAir data could be reliably compared with reference monitors for regulatory purposes [14]. However, only a very small number of sensors met this criterion, limiting our ability to perform a meaningful long-term comparison. This scarcity likely reflects several factors: the voluntary nature of PurpleAir deployments, which often results in intermittent operation or sensor removal; lack of standardized maintenance protocols; and environmental conditions that can cause sensor downtime or data gaps [15,16]. Maintaining high data completeness over multi-year periods is inherently challenging for community-based sensor networks. Wallace et al. [17] noted that strict quality assurance measures substantially reduced usable data in their study. Unlike regulatory monitors, which operate under standardized QA/QC protocols, PurpleAir sensors are privately owned and may be designated as private, lack formal maintenance schedules, and are not subject to oversight, making sustained multi-year completeness difficult [17]. These limitations underscore the need for structured maintenance guidance and automated alerts to improve long-term data continuity.
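The screening rule described above (at least three years of data with completeness above 75%) can be sketched as follows. This is a simplified illustration: completeness is computed as the fraction of calendar days with at least one valid observation over the whole window, whereas regulatory completeness rules are more detailed. The three-year minimum is expressed as 1,095 days, and the function names and example data are hypothetical.

```python
from datetime import date, timedelta


def daily_completeness(observed_days, start: date, end: date) -> float:
    """Fraction of calendar days in [start, end] with a valid observation."""
    expected = (end - start).days + 1
    present = {d for d in observed_days if start <= d <= end}
    return len(present) / expected


def qualifies(observed_days, start: date, end: date,
              threshold: float = 0.75, min_days: int = 3 * 365) -> bool:
    """True if the window spans at least min_days and completeness
    meets the threshold. Both defaults are simplifying assumptions."""
    window_days = (end - start).days + 1
    if window_days < min_days:
        return False
    return daily_completeness(observed_days, start, end) >= threshold


window_start = date(2021, 1, 1)
window_end = date(2023, 12, 31)  # 1,095 days inclusive

# Hypothetical sensor missing every fifth day (80% complete)
steady_days = [window_start + timedelta(days=i)
               for i in range(1095) if i % 5 != 0]

# Hypothetical sensor reporting only every other day (about 50% complete)
patchy_days = [window_start + timedelta(days=i)
               for i in range(1095) if i % 2 == 0]

steady_ok = qualifies(steady_days, window_start, window_end)
patchy_ok = qualifies(patchy_days, window_start, window_end)
```

Both hypothetical sensors would land in the ">3 years" tenure category under the first-to-last metric, yet only the first passes the completeness screen, which illustrates why so few sensors qualified.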
As an exploratory analysis, we compared daily mean PM_2.5_ from one PurpleAir sensor with an EPA regulatory monitor located ≈100 m away near an airport and observed a positive but weak association (Pearson r = 0.231). The divergence likely reflects a combination of micro-environmental differences (the sites were near an airport with rapidly changing emissions and wind/turbulence), micro-siting/placement effects over tens of meters, and methodological differences between an optical, humidity-sensitive sensor (after correction) and a reference-grade instrument. Because this analysis is based on a single qualifying pair and does not stratify by wind sector or event conditions, it should be viewed as an illustrative demonstration of hyper-local variability.
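A paired comparison of this kind reduces to aligning the two daily series by date and computing Pearson's r on the overlapping days. The sketch below shows that calculation in plain Python. The input dictionaries and their values are hypothetical and do not reproduce the study's data or its r = 0.231.

```python
from math import sqrt


def paired_daily_means(sensor: dict, monitor: dict):
    """Keep only days present in both series, in date order."""
    shared_days = sorted(set(sensor) & set(monitor))
    return ([sensor[d] for d in shared_days],
            [monitor[d] for d in shared_days])


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical daily means (ug/m3) keyed by ISO date string
sensor_daily = {"2023-01-01": 6.0, "2023-01-02": 9.0,
                "2023-01-03": 7.5, "2023-01-05": 12.0}
monitor_daily = {"2023-01-01": 5.0, "2023-01-02": 8.0,
                 "2023-01-03": 8.5, "2023-01-04": 6.0}

sensor_values, monitor_values = paired_daily_means(sensor_daily, monitor_daily)
example_r = pearson_r(sensor_values, monitor_values)
```

Days present in only one series are dropped before the correlation is computed, so the effective sample size can be much smaller than either record suggests.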
4. Discussion
To guide interpretation and avoid overstating scope, we distinguish descriptive findings—network deployment patterns, spatial coverage, and sensor longevity—from hypothesis-generating implications for exposure assessment or policy. Descriptive results indicate where and how long sensors operate; they are not causal or regulatory claims. Any policy-relevant implication would require additional validation (collocation, completeness thresholds, event stratification, and standardized calibration workflows).
4.1. Key Insights
Our analysis answers a core question: Do publicly shared PurpleAir sensors remain deployed and discoverable long enough—and where—to support multi-year use cases? Nationally, we find a large, persistent community network: many devices remain online for more than three years, suggesting that multi-year applications are feasible in substantial parts of the country. At the same time, coverage is uneven. The West—especially California, Washington, and Oregon—dominates long-duration deployments, whereas many states in the Northeast, Midwest, and South have sparse networks. These patterns underscore persistent gaps in low-cost sensor availability, which may limit air quality monitoring in rural or underserved areas and constrain the representativeness of exposure data that can be used for epidemiological research [11,18]. Duration-stratified summaries in four exemplar states (CA, MN, PA, TX) also show that short-duration sensors often capture higher means or extremes, consistent with event-driven or opportunistic installations; thus, interpreting concentration summaries likely would require attention to deployment timing and context. Urban–rural contrasts are state-specific: in some states, rural areas exhibit elevated PM2.5 relative to urban areas, suggesting the value of non-metro coverage.
Elevated rural PM_2.5_ levels in Texas and Pennsylvania underscore the need for expanded sensor coverage beyond metropolitan areas. However, these patterns should be interpreted cautiously: the present analysis does not disentangle true environmental signals from deployment bias (e.g., local event-driven or seasonal placements) or sensor-related artifacts (including correction performance under smoke/high humidity). Accordingly, we do not attempt source attribution here; careful interpretation that acknowledges deployment context, siting, and methodological nuances is essential before drawing strong conclusions based on our exploratory analyses.
What this study does is establish a national baseline of where publicly shared sensors exist and how long they persist on the public network, thereby identifying regions and states that already support multi-year analyses and those where persistence or coverage remains a limiting factor. What it does not do is assert regulatory comparability, causal attribution, or comprehensive exposure modeling. Those aims need collocated reference data, completeness thresholds, event stratification, and standardized calibration workflows that are beyond the remit of a deployment-focused study.
An exploratory comparison between a single PurpleAir sensor and a nearby EPA monitor (~100 m) yielded a positive but weak association (r = 0.231). This example illustrates a central trade-off: hyper-local sensors can detect fine-scale variability that a single regulatory monitor cannot resolve, yet that same granularity complicates one-to-one alignment without careful siting, wind-sector/event stratification, and method harmonization. The example should therefore be viewed as a demonstration of hyper-local variability, not as evidence for or against regulatory comparability.
4.2. Limitations
Several limitations should be acknowledged. First, our deployment metric (first-seen to last-seen) captures network tenure, not verified continuous operation; intermittent downtime can overstate availability. Second, even with widely used corrections, extreme smoke or dust conditions can introduce nonlinear response, so our QC emphasizes baseline ranges rather than extremes. Measurement biases related to environmental conditions may persist despite applying correction algorithms, as these models often fail to account for extreme pollution events or unique atmospheric conditions. For example, the EPA correction improves agreement with regulatory monitors for urban and smoke aerosols but underestimates PM2.5 during dust events and at very high concentrations (>600 µg/m^3^) [15]. Recent studies propose localized or event-specific calibration approaches to address these gaps [17,19], yet a systematic framework for applying such methods across diverse regions remains lacking. This underscores the need for geographically tailored correction models to ensure reliable data for health research and policy applications. Third, the four state exemplars are high-coverage cases used for illustration, not region-wide generalizations.
4.3. Future Opportunities
Although developing a nationwide framework to integrate sensor data into exposure models remains challenging, emerging evidence shows that low-cost networks such as PurpleAir can enhance epidemiological exposure assessment when properly calibrated and incorporated into spatiotemporal models. Bi et al. [20] used publicly available PurpleAir PM2.5 measurements alongside regulatory “gold-standard” monitors to improve exposure predictions for participants in the Adult Changes in Thought–Air Pollution (ACT-AP) cohort. Incorporating calibrated PurpleAir data into exposure models increased external validation performance (higher R^2^ and lower RMSE) and revealed sharper spatial gradients, enhancing representation of fine-scale variability in PM2.5 relevant to health studies. The authors also proposed metrics to assess the representativeness of sensor locations relative to cohort residences, underscoring the importance of thoughtful deployment to avoid spatial bias [20]. Similarly, Coker et al. [21] employed a PurpleAir sensor in Rio Branco, Brazil, to examine associations between PM2.5 and daily respiratory hospitalizations. Corrected sensor data produced stronger, more accurate effect estimates than uncorrected readings, demonstrating that calibrated low-cost sensors can support epidemiological analyses in regions lacking regulatory monitors [21]. Collectively, these findings highlight the potential of PurpleAir data to enhance high-resolution exposure models, provided robust calibration and validation strategies are applied.
Several next steps merit consideration. First, prioritize targeted augmentation and collocation in low-coverage states to improve both spatial equity and calibration. Second, adopt standardized platform-side curation (plausibility checks, completeness/persistence metrics, and event tagging) with published confidence labels to facilitate downstream use. Third, apply machine learning to extract more value from low-cost sensor data. For example, Lu and colleagues developed a high-resolution (500 × 500 m) model of hourly PM2.5 concentrations in Los Angeles County by integrating quality-controlled PurpleAir low-cost sensor data with spatial predictors in a machine-learning framework. The model showed strong predictive performance (cross-validated R^2^ up to 0.93) and captured fine-scale spatial and temporal variability, including wildfire episodes, demonstrating the potential value of calibrated low-cost sensors for exposure assessment in environmental health studies [22]. Additionally, emerging ML techniques offer promise for improving calibration by capturing nonlinear relationships between environmental variables and pollutant concentrations, but these approaches remain non-standardized and their transferability across regions is uncertain [23].
Finally, clarifying roles helps improve utility. As citizen-science use grows, guidance for non-experts (e.g., outdoor placement, power/connectivity, weather protection) is helpful, but non-professional users should not be expected to act as the primary stewards of research- or policy-relevant deployments. Their reasonable expectation may be to use convenient, accessible devices; accordingly, responsibility for producing decision-useful data should rest chiefly with device developers, data platforms, and sponsoring programs. In practice, low-cost sensors deliver the most value as a supplement to the regulatory network when curated by researchers or agencies under pre-specified QA/QC: automated plausibility checks, minimum completeness/persistence thresholds, event tagging (e.g., smoke/dust), vetted corrections, and clear uncertainty labels. This system-focused approach leverages high spatial density for hyper-local screening, siting reconnaissance, and short-duration decision support, while FRM/FEM monitors remain the foundation for compliance and long-term trend assessment, thereby improving utility and reliability without shifting obligations onto casual users.
In conclusion, publicly shared PurpleAir sensors are numerous and often persistent, yet coverage remains uneven, creating predictable gaps in representativeness. By answering where and how long community sensors operate, this study provides the baseline evidence needed to plan multi-year analyses and targeted expansions. When paired with standardized QA/QC and collocation, curated low-cost data can supplement regulatory networks by adding hyper-local detail, while FRM/FEM monitors continue to anchor regulatory compliance and long-term trend assessment.
References
- 1. Giordano M.R., Malings C., Pandis S.N., Presto A.A., McNeill V.F., Westervelt D.M., Beekmann M., Subramanian R. From low-cost sensors to high-quality data: A summary of challenges and best practices for effectively calibrating low-cost particulate matter mass sensors. J. Aerosol Sci. 2021, 158, 105833. doi:10.1016/j.jaerosci.2021.105833
- 2. Malings C., Tanzer R., Hauryliuk A., Saha P.K., Robinson A.L., Presto A.A., Subramanian R. Fine particle mass monitoring with low-cost sensors: Corrections and long-term performance evaluation. Aerosol Sci. Technol. 2020, 54, 160–174. doi:10.1080/02786826.2019.1623863
- 3. US EPA. EPA Research Improves Air Quality Information for the Public on the AirNow Fire and Smoke Map [Overviews and Factsheets]. 2022. Available online: https://www.epa.gov/sciencematters/epa-research-improves-air-quality-information-public-airnow-fire-and-smoke-map (accessed on 31 December 2025).
- 4. Keyes T., Domingo R., Dynowski S., Graves R., Klein M., Leonard M., Pilgrim J., Sanchirico A., Trinkaus K. Low-cost PM2.5 sensors can help identify driving factors of poor air quality and benefit communities. Heliyon 2023, 9, e19876. doi:10.1016/j.heliyon.2023.e19876. PMID: 37809584. PMCID: PMC10559280
- 5. AirNow Fire and Smoke Map. Available online: https://fire.airnow.gov/ (accessed on 31 December 2025).
- 6. Barkjohn K.K., Gantt B., Clements A.L. Development and application of a United States-wide correction for PM2.5 data collected with the PurpleAir sensor. Atmos. Meas. Tech. 2021, 14, 4617–4637. doi:10.5194/amt-14-4617-2021. PMID: 34504625. PMCID: PMC8422884
- 7. Bagkis E., Hassani A., Schneider P., De Souza P., Shetty S., Kassandros T., Salamalikis V., Castell N., Karatzas K., Ahlawat A. Evolving trends in application of low-cost air quality sensor networks: Challenges and future directions. npj Clim. Atmos. Sci. 2025, 8, 335. doi:10.1038/s41612-025-01216-4
- 8. US Government Accountability Office. Air Quality Sensors: Policy Options to Help Address Implementation Challenges. 19 March 2024. Available online: https://www.gao.gov/products/gao-24-106393 (accessed on 31 December 2025).
