# Spatial Clustering for Carolina Breast Cancer Study

**Authors:** Hongqian Niu, Melissa Troester, Didong Li

PMC · DOI: 10.1142/9789819807024_0025 · Pacific Symposium on Biocomputing · 2026-01-03

## TL;DR

This paper introduces a spatial clustering method and applies it to North Carolina census tracts to study how location, demographics, and socioeconomic status relate to health outcomes and breast cancer risk.

## Contribution

The paper introduces Gaussian Process Spatial Clustering (GPSC), a spatial clustering algorithm that uses Gaussian Processes to analyze geospatial health data.

## Key findings

- The authors give theoretical guarantees for GPSC's performance, and the method recovers the true clusters in several empirical studies.
- The method identifies census tract clusters in North Carolina based on socioeconomic and environmental factors linked to cancer risk.

## Abstract

In the Carolina Breast Cancer Study (CBCS), clustering census tracts based on spatial location, demographic variables, and socioeconomic status is crucial for understanding how these factors influence health outcomes and cancer risk. This task, known as spatial clustering, involves identifying clusters of similar locations by considering both geographic and characteristic patterns. While standard clustering methods such as K-means, spectral clustering, and hierarchical clustering are well-studied, spatial clustering is less explored due to the inherent differences between spatial domains and their corresponding covariates. In this paper, we introduce a spatial clustering algorithm called Gaussian Process Spatial Clustering (GPSC). GPSC leverages the flexibility of Gaussian Processes to cluster unobserved functions between different domains, extending traditional clustering techniques to effectively handle geospatial data. We provide theoretical guarantees for GPSC’s performance and demonstrate its capability to recover true clusters through several empirical studies. Specifically, we identify clusters of census tracts in North Carolina based on socioeconomic and environmental indicators associated with health and cancer risk.
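To make the idea of clustering on both geographic and characteristic patterns concrete, here is a minimal sketch. It is not the GPSC algorithm, whose details are not given in this summary view. It only illustrates one simple way a Gaussian Process can enter a spatial clustering pipeline: a noisy covariate is smoothed over space with a GP posterior mean, then tract locations and the smoothed values are clustered with K-means. The function names, the RBF kernel, the hyperparameters, and the synthetic "tracts" are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of 2-D locations."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gp_smooth(coords, values, length_scale=1.0, noise=0.1):
    """GP posterior mean of a noisy covariate, evaluated at the observed locations."""
    K = rbf_kernel(coords, coords, length_scale)
    mean = values.mean()
    alpha = np.linalg.solve(K + noise * np.eye(len(coords)), values - mean)
    return K @ alpha + mean


def kmeans(features, k, n_iter=50):
    """Plain K-means with farthest-first initialisation (deterministic)."""
    centers = [features[0]]
    for _ in range(1, k):
        d = np.min([((features - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(features[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_tracts(coords, covariate, k, length_scale=1.0, noise=0.1):
    """Cluster locations on position plus a GP-smoothed covariate."""
    smoothed = gp_smooth(coords, covariate, length_scale, noise)
    # One shared scale for both coordinates keeps the geography undistorted.
    xy = (coords - coords.mean(axis=0)) / coords.std()
    z = (smoothed - smoothed.mean()) / smoothed.std()
    return kmeans(np.column_stack([xy, z]), k)


# Synthetic stand-in for census tracts: two regions with different covariate levels.
rng = np.random.default_rng(0)
west = rng.uniform([0, 0], [1, 1], size=(20, 2))
east = rng.uniform([4, 0], [5, 1], size=(20, 2))
coords = np.vstack([west, east])
covariate = np.concatenate([rng.normal(0, 0.3, 20), rng.normal(5, 0.3, 20)])

labels = cluster_tracts(coords, covariate, k=2)
print(labels)
```

In this toy setup the two regions are far apart and differ sharply in the covariate, so any sensible method separates them. The harder cases that motivate a dedicated spatial clustering method are those where geography and covariates disagree.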

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** cancer (MESH:D009369), Breast Cancer (MESH:D001943)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12764386/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12764386/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC12764386/full.md

---
Source: https://tomesphere.com/paper/PMC12764386