Manually classified dataset of leaning and standing personnel images for construction site monitoring and neural network training
Alexandre Almeida Del Savio, Ana Luna Torres, Daniel Cárdenas-Salas, Mónica Vergara Olivera, Gianella Urday Ibarra

TL;DR
This paper introduces a labeled dataset of construction site images showing workers either standing or leaning, to improve AI models for monitoring safety and activities.
Contribution
The novel contribution is a manually labeled dataset with bounding boxes for training and validating AI models in construction site monitoring.
Findings
The dataset includes 1,214 images with two classes: standing and leaning personnel.
Bounding box coordinates are provided for each image to support precise neural network training.
The dataset is publicly available for academic and industrial use in computer vision and safety monitoring.
Abstract
This data paper presents a manually labeled dataset of 1,214 images of personnel captured from a construction site using four static cameras. There are two classes, standing and people leaning. The classification is stored in accompanying text files and bounding box coordinates for every image. The compilation was done to support the developing and validation computer vision and AI models for construction site monitoring. This dataset addresses the challenges of finding personnel in different poses within complex construction environments. The resource will enhance construction site safety monitoring and personnel activity analysis by allowing more precise neural network training. The dataset is stored in a public repository, making it openly accessible for academic and industrial purposes regarding computer vision, civil engineering, and workplace safety.
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsInfrastructure Maintenance and Monitoring · Tunneling and Rock Mechanics · Occupational Health and Safety Research
Specifications TableSubjectComputer Science, EngineeringSpecific subject areaArtificial Intelligence, Computer Science Applications, Computer Vision and Pattern Recognition, Civil EngineeringType of dataRaw.Images .jpgArchives .txtData collectionThe data were acquired from four static cameras around a construction site: one bullet-type camera, model IPC-HFW2831T-ZS-S2 [1], and three motorized IP PTZ cameras, model SD10A848WA-HNF [2]. The cameras were programmed using the DSSExpress-Base-License software [3]. The LabelImg v.1.8.1 tool [4] was used to classify images.Data source locationAll images were acquired from a building under construction at the Universidad de Lima, Lima, Peru.Lat. -12.084307°, Long. -76.971031°Data accessibilityThe data is hosted on a public and trusted repository.Repository name: Repositorio Institucional – Universidad de LimaData identification number: Doesn't haveDirect URL to data: https://doi.org/10.26439/ulima.datasets.19853Instructions for accessing these data: The data is open access, anonymity is not compromised. The link to download the data set is presented in the section “Recurso(s) relacionado(s)”.Related research articleAlmeida Del Savio, A., Luna Torres, A., Cárdenas-Salas, D., Vergara Olivera, M.A., Urday Ibarra, G.T. (2023). Artificial Intelligence Applied to the Control and Monitoring of Construction Site Personnel. In: dell'Isola, F., Barchiesi, E., León Trujillo, F.J. (eds) Advances in Mechanics of Materials for Environmental and Civil Engineering. Advanced Structured Materials, vol 197. Springer, Cham. https://doi.org/10.1007/978-3-031-37101-1_2 [6]
Value of the Data
1
- •Unique Dataset: This dataset consists of 1,214 high-resolution, manually classified images of construction personnel standing and leaning and addresses challenges in complex environments, such as variable lighting and occlusions. It can be challenging to obtain information from different activities at construction sites. These images can be of help to future researchers as a database of construction personnel and their activities to train their models or use along their databases.
- •Improves Neural Network Training: Future research can use this data as a way to facilitate the development of accurate computer vision models for safety monitoring and activity analysis in construction sites.
- •Interdisciplinary Applications: This is useful for research in civil engineering, AI, computer vision, and workplace safety, bridging technology and practical safety solutions.
- •Educational and Industrial Relevance: Open access for academic learning and industrial innovation in safety compliance and ergonomic design. This information is accessible for other researchers for future projects regarding computer vision and pattern recognition.
Background
2
The classification of personnel within construction environments is a cardinal activity in the development of safety monitoring applications using computer vision. Del Savio et al. [7,8] targeted identifying “person” class within a construction site. However, most of these early methodologies were limited to precision, especially when individuals were captured running in diverse postures or different distances from cameras.
Building on these findings, Almeida Del Savio et al. [6] developed an improved methodology that categorized personnel into two classes: standing and leaning. The motivation for this is because construction sites are highly dynamic, with workers continuously changing poses depending on what they are doing. Given this subtle classification, the enhanced methodology greatly improved object detection and activity recognition precision.
This paper presents a dataset stemming from this extended methodology, tackling the challenges in the real construction environment. Issues such as cluttered backgrounds, changing light conditions, and high-resolution image analysis are resolved with care, guaranteeing the usability of this dataset in training neural networks. Integrating bounding box annotation also facilitates accurate personnel localization within the images for robust training and validation of computer vision models.
Data Description
3
The images were captured at the construction site of the Wellness Center building at the Universidad de Lima. 1,214 high-resolution images were extracted from footage recorded by four strategically placed static cameras (Fig. 1), ensuring comprehensive site coverage. This dataset was utilized by Almeida Del Savio et al. (2023) [6] to train artificial intelligence models for object detection in construction environments. The classification focused on two key categories of personnel: standing individuals (“person” [6,7]) and leaning individuals (“leaning_person”), as outlined in Table 1.Fig. 1. Construction site location [8].Fig 1. Table 1Classes used for manual classification.Table 1IDClassObject0“leaning_person”Image, table 11“person”Image, table 1
Table 1 shows the classes used for manual classification and their respective images. These elements are found in the classes.txt file.
In the first column of Table 2, the original image displayed is the output of a frame extraction algorithm presented in [5], applied to video footage from the surveillance cameras. These images are in .jpg format with a 3840 × 2160 pixels resolution. The second column illustrates the manual classification process using the LabelImg v.1.8.1 software [4]. Objects in the images were annotated and categorized into their respective classes, with bounding boxes marked in green quadrilaterals. The results are in the third column, represented as .txt files containing structured metadata. This metadata includes the class ID (refer to Table 1) in the first column, the X and Y coordinates of the bounding box's top-left corner in the second and third columns, and the width and height of the bounding box in the fourth and fifth columns, respectively.Table 2. Examples of selected images before and after manual classification, including the resulting .txt archive.Table 2. Image, table 2
Fig. 2 shows an extracted area from IMG-52.jpg (Table 2), where the objects are linked to their respective IDs according to Table 1.Fig. 2. Extract of IMG-52.jpg's objects with their respective IDs.Fig 2
Experimental Design, Materials and Methods
4
Experimental design
4.1
The research aims to develop a high-resolution image dataset for construction site monitoring and neural network training. The methodological design incorporates the following elements:
- •Approach: A quantitative approach was adopted, leveraging numerical data and systematic classification methods to create a structured dataset [9].
- •Purpose: This study fulfills a descriptive purpose [9] to document and classify personnel activities (standing and leaning poses) within a dynamic construction site environment. This serves as foundational data for neural network training.
- •Design: A non-experimental design [9] was implemented. Images were collected passively without manipulating variables, ensuring authenticity in capturing real-world construction activities.
Materials
4.2
Cameras and Software:
Hardware: Four static cameras were employed:
- •One bullet-type camera, model IPC-HFW2831T-ZS-S2 [1].
- •Three motorized IP PTZ cameras, model SD10A848WA-HNF [2].
Software:
- •Video footage was managed using DSS Express software [3].
- •Frame extraction algorithm [5].
- •Images were manually annotated with LabelImg v.1.8.1 software [4].
Data Specifications:
- •Image format: .jpg
- •Resolution: 3840 × 2160 pixels
- •Metadata: Bounding box annotations stored in .txt files.
Methods
4.3
-
- Data Collection
Images were captured from November 2020 to March 2021 at the construction site of the Wellness Center building at the Universidad de Lima (Lat. -12.084307°, Long. -76.971031°). Cameras were strategically positioned to ensure comprehensive coverage. Recording schedules avoided midday to minimize glare and ensure image clarity.
-
•Video Footage: Footage from each camera was recorded in 10-minute segments, with a frame extraction interval of 200 frames. Table 3 summarizes the video durations and extracted frames.Table 3. Video footage duration and number of extracted frames per segment.Table 3. Video footage IDDuration (mm: ss)Extracted framesVIDEO1.mp410:0045VIDEO2.mp410:0045VIDEO3.mp410:0091VIDEO4.mp410:0090VIDEO5.mp410:0090VIDEO6.mp410:0090VIDEO7.mp410:0089VIDEO8.mp410:0090VIDEO9.mp410:0044VIDEO10.mp410:0090VIDEO11.mp410:0090VIDEO12.mp410:0090VIDEO13.mp410:0090VIDEO14.mp410:0090VIDEO15.mp410:0045VIDEO16.mp410:0045
-
- Manual Annotation
Using the LabelImg v.1.8.1 software [4], personnel were classified into two categories:
Each object was annotated with a bounding box, and the metadata was saved in corresponding .txt files, which include:
-
•Class ID
-
•Bounding box coordinates (X and Y for the upper-left corner, width, and height).
-
- Environmental Conditions
Table 4 shows the date and time according to the ID of the image. Data collection accounted for varying lighting and temperature conditions, as shown in Table 5. Illuminance levels ranged from 37,000 lx to 90,000 lx, and temperatures ranged from 18°C to 26°C.
- 4. Validation Table 4. Date and time allocation of image frames.Table 4. Image IDDate (MM, DD, YYYY)Time (24 hrs.)IMG-1 – IMG-45November 30, 202011:10 – 11:19IMG-46 – IMG-135November 18, 202013:29 – 13:39IMG-136 – IMG-225January 06, 202111:39 – 11:49IMG-226 – IMG-315January 06, 202111:39 – 11:49IMG-316 – IMG-405December 07, 202011:09 – 11:19IMG-406 – IMG-495November 20, 202011:29 – 11:39IMG-496 – IMG-540January 04, 202115:30 – 15:39IMG-541 – IMG-585January 06, 202115:39 – 15:49IMG-586 – IMG-630November 18, 202013:29 – 13:39IMG-631 – IMG-721December 07, 202011:09 – 11:20IMG-722 – IMG-811December 15, 202010:10 – 10:20IMG-812 – IMG-901January 04, 202115:29 – 15:39IMG-902 – IMG-991March 18, 202111:30 – 11:39IMG-992 – IMG-1080January 20, 202110:59 – 11:09IMG-1081 – IMG-1170March 08, 202111:30 – 11:39IMG-1170 – IMG-1214November 30, 202011:20 – 11:29Table 5Environmental conditions during image collection (Illuminance and temperature).Table 5. Date (MM, DD, YYYY)Time (24 hrs.)Illuminance (lx)Air temperature (°C)November 18, 202013:29 – 13:396000018November 20, 202011:29 – 11:396200024November 30, 202011:10 – 11:307770021December 07, 202011:09 – 11:203700021December 15, 202010:10 – 10:208900023January 04, 202115:29 – 15:396120022January 06, 202111:39 – 11:497800024March 08, 202111:30 – 11:399000025March 18, 202111:30 – 11:398500026
The dataset was validated for completeness and consistency. Images were reviewed to ensure proper annotation and alignment with the defined classes.
Limitations
This study acknowledges the following limitations that may influence the applicability and generalizability of the dataset:
- •Fixed Camera Positions: All data was captured using fixed cameras only. Thus, the dataset is restricted to dynamic or mobile viewpoint scenarios. Further work on datasets may include mobile or drone-based captures for added versatility.
- •Restricted Classification Categories: The dataset covers two critical personnel classifications, “standing” and “leaning.” While adequate for this study's objectives, it does not consider other broad classes of posture or related activities, such as walking, sitting, or operating equipment.
- •Environmental Conditions: Data was collected at one construction site under specific environmental conditions, such as temperature and illuminance. These conditions may not represent diverse geographic or seasonal variations.
- •Manual Annotation: While the annotations were rigorously performed, manual processes are prone to human error. Automating such tasks in the future could reduce potential inconsistencies and greatly improve scalability.
- •Scope of Dataset: The dataset represents conditions and activities in a single construction site. If there are several sites, the variation in layout, equipment, and worker behavior will provide greater representation and applicability for this dataset.
These limitations provide opportunities for future work to build upon this dataset, expanding its applicability and addressing the challenges identified.
Ethics Statement
This research did not involve human subjects, animal experimentation or social media platforms.
CRediT authorship contribution statement
Alexandre Almeida Del Savio: Conceptualization, Validation, Resources, Writing – review & editing, Supervision, Project administration, Funding acquisition. Ana Luna Torres: Validation, Formal analysis, Visualization, Resources. Daniel Cárdenas-Salas: Methodology, Software, Validation, Formal analysis, Investigation. Mónica Vergara Olivera: Investigation, Data curation, Writing – original draft, Visualization. Gianella Urday Ibarra: Software, Investigation, Data curation, Writing – original draft.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Dahua Technology, 8MP Lite IR Vari-focal Bullet Network Camera. (2024) https://www.dahuasecurity.com/Products/All-Products/Network-Cameras.
- 2Dahua Technology, 4K 48x Starlight+ IR Wiz Mind Network PTZ Camera. (2024) https://www.dahuasecurity.com/Products/All-Products/PTZ-Cameras.
- 3Dahua Technology, DSS express. (2024) https://software.dahuasecurity.com/es.
- 4Label Img v.1.8.1. (2024) https://pypi.org/project/label Img/1.8.1/.
- 5Del Savio A.A.Luna Torres A.Cárdenas Salas D.Vergara Olivera M.A.Urday Ibarra G.T.Detection and evaluation of construction cracks through image analysis using computer vision Appl. Sci.13172023966210.3390/app 13179662 · doi ↗
- 6Almeida Del Savio A.Luna Torres A.Cárdenas-Salas D.Vergara Olivera M.A.Urday Ibarra G.T.Artificial intelligence applied to the control and monitoring of construction site personneldell'Isola F.Barchiesi E.León Trujillo F.J.Advances in Mechanics of Materials for Environmental and Civil Engineering. Advanced Structured Materials Advances in Mechanics of Materials for Environmental and Civil Engineering. Advanced Structured Materials 1972023 Springer Cham 10.1007/978-3-031-37101-1_2 · doi ↗
- 7Del Savio A.A.Luna Torres A.Cárdenas-Salas D.Vergara Olivera M.A.Urday Ibarra G.T.The use of artificial intelligence to identify objects in a construction site Proceedings of the International Conference on Artificial Intelligence and Energy System (ICAIES) in Virtual Mode Jaipur, India 202110.26439/ulima.prep.149332021 · doi ↗
- 8Del Savio A.Luna A.Cárdenas-Salas D.Vergara M.Urday G.Dataset of manually classified images obtained from a construction site Data Brief 42202210804210.1016/j.dib.2022.108042 PMC 893358035313499 · doi ↗ · pubmed ↗
