# ConcreteCARB: A comprehensive image dataset of concrete carbonation for computer vision tasks

**Authors:** José A. Guzmán-Torres, Sandra del C. Arguello-Hernández, Francisco J. Domínguez-Mota, Gerardo Tinoco-Guerrero, Elia M. Alonso-Guzmán

PMC · DOI: 10.1016/j.dib.2026.112493 · Data in Brief · 2026-01-20

## TL;DR

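ConcreteCARB is an openly accessible dataset of 903 images of phenolphthalein-tested concrete surfaces, each labeled as carbonated or non-carbonated, intended for training and benchmarking computer vision models in civil engineering diagnostics.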

## Contribution

The paper introduces a manually labeled image collection, photographed with consistent predefined settings, for analyzing concrete carbonation with computer vision techniques.

## Key findings

- The dataset contains 903 high-resolution images of concrete surfaces labeled as carbonated or non-carbonated.
- All images were taken with a Samsung SM-S901U1 smartphone using predefined settings, under controlled laboratory conditions.
- The dataset is intended as training and benchmarking data for automated detection, classification, and segmentation of carbonation damage.

## Abstract

The ConcreteCARB dataset provides a comprehensive repository of 903 high-resolution images of concrete surfaces evaluated using the phenolphthalein test for carbonation detection. This data was collected under controlled laboratory conditions and aims to support artificial intelligence applications in civil engineering, especially in structural health monitoring tasks. The images are systematically organized into two distinct classes: “Carbonated Samples” and “No Carbonation Presence,” enabling binary classification approaches. All samples were manually tested, split, and visually labelled by expert engineers to ensure reliable ground-truth classification, in accordance with standardized procedures. The dataset includes images of concrete prism elements fabricated with varying mix designs, incorporating different water-cement ratios and additives, such as industrial silica waste and natural admixtures derived from Opuntia ficus-indica. The specimens were subjected to natural atmospheric carbonation conditions for 180 days, and their carbonation fronts were revealed by phenolphthalein staining. The samples were then split manually with a chisel and hammer, and photographic documentation was performed with a Samsung SM-S901U1 smartphone using predefined settings to ensure consistency and quality across the dataset. ConcreteCARB is intended for researchers, engineers, and data scientists working on machine learning, deep learning, and computer vision solutions for concrete diagnostics. It provides valuable training and benchmarking data for the development of automated detection, classification, and segmentation models for carbonation damage assessment. Furthermore, the dataset can serve as a foundational tool for cross-comparative studies on the efficacy of AI techniques in materials degradation analysis. 
The openly accessible nature of the dataset through a public repository supports reproducibility and encourages the extension of AI applications in concrete durability and sustainability studies.
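
The two-class organization described in the abstract maps naturally onto a folder-per-class layout. The following is a minimal sketch, not the authors' tooling: it assumes the downloaded images sit in two folders named after the classes ("Carbonated Samples" and "No Carbonation Presence"), which may not match the actual repository layout. The function names, the file extensions, and the label convention (1 for carbonated, 0 for non-carbonated) are likewise illustrative assumptions.

```python
import random
from pathlib import Path

# Assumed folder names, taken from the two class names in the abstract;
# the real repository layout may differ. Label values are an arbitrary choice.
CLASS_DIRS = {"Carbonated Samples": 1, "No Carbonation Presence": 0}
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image formats


def index_dataset(root):
    """Return a list of (path, label) pairs for the images under root."""
    samples = []
    for class_name, label in CLASS_DIRS.items():
        class_dir = Path(root) / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing class folder: {class_dir}")
        # Sort so the index is reproducible across file systems.
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_EXTS:
                samples.append((path, label))
    return samples


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Split into train/val lists, keeping each class's proportion."""
    rng = random.Random(seed)
    train, val = [], []
    for label in sorted({lbl for _, lbl in samples}):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val
```

With a hypothetical local copy at `ConcreteCARB/`, a call such as `train, val = stratified_split(index_dataset("ConcreteCARB"))` would yield two disjoint lists for a binary classifier. Splitting per class keeps the carbonated to non-carbonated ratio similar in both subsets, which matters if the two classes turn out to be unevenly sized.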

## Linked entities

- **Chemicals:** phenolphthalein (PubChem CID 4764)
- **Species:** Opuntia ficus-indica (taxon 371859)

## Full-text entities

- **Diseases:** carbonation damage (MESH:D002249)
- **Chemicals:** phenolphthalein (MESH:D020113), silica (MESH:D012822)
- **Species:** Opuntia ficus-indica (Indian-fig, species) [taxon 371859]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12877836/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12877836/full.md

## References

6 references — full list in the complete paper: https://tomesphere.com/paper/PMC12877836/full.md

---
Source: https://tomesphere.com/paper/PMC12877836