# Real-world chemistry lab image dataset for equipment recognition across 25 apparatus categories

**Authors:** Md Sakhawat Hossain, Md Sadman Haque, Md Mostafizur Rahman, Md Mosaddik Mashrafi Mousum, Zobaer Ibn Razzaque, Robiul Awoul Robin, Raiyan Rahman, Jannatun Noor

PMC · DOI: 10.1038/s41597-025-05952-3 · Scientific Data · 2025-11-05

## TL;DR

This paper introduces a real-world dataset of 4,599 images covering 25 categories of chemistry lab equipment, intended to support laboratory automation and safety applications.

## Contribution

The paper presents what the authors describe, to the best of their knowledge, as the most extensive publicly available dataset for recognizing 25 categories of chemistry lab apparatuses.

## Key findings

- The dataset contains 4,599 images captured under diverse real-world conditions.
- Seven state-of-the-art object detection models achieved mAP@50 scores above 0.9 on the dataset, ranging from 0.92 (YOLOv12) to 0.992 (RF-DETR).
- The dataset is split into training (70%), validation (20%), and testing (10%) subsets.
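
For readers unfamiliar with the metric, mAP@50 counts a predicted box as correct only when its class matches the ground truth and its intersection over union (IoU) with the ground-truth box is at least 0.5. The sketch below illustrates only that overlap criterion. It is not the authors' evaluation code, the function names `iou` and `matches_at_50` are hypothetical, and boxes are assumed to be in `(x_min, y_min, x_max, y_max)` form. A full mAP computation additionally ranks predictions by confidence and averages precision over all classes.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True when the predicted box overlaps the ground truth enough for mAP@50."""
    return iou(pred, truth) >= threshold
```

Two 10×10 boxes offset horizontally by 5 pixels share half their width, which gives an IoU of one third, so the prediction would not count as a match at the 0.5 threshold.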

## Abstract

In modern laboratories, automation and safety rely heavily on accurately detecting and identifying laboratory equipment. To address this need, we introduce a comprehensive and well-curated dataset designed to detect 25 commonly used chemistry lab apparatuses. The dataset comprises 4,599 JPG-format images captured under diverse real-world conditions, including varying lighting, backgrounds, angles, overlaps, and distances - factors that enhance the robustness and generalizability of model training. It is split into training (70%), validation (20%), and testing (10%) subsets. This resource is particularly valuable for developing laboratory automation systems, with potential applications in safety monitoring, inventory management, and real-time tracking of lab tools. We evaluated the dataset using seven state-of-the-art object detection models, all achieving impressive performance with mAP@50 scores exceeding 0.9: RF-DETR (0.992), YOLOv11 (0.987), YOLOv9 (0.986), YOLOv5 (0.985), YOLOv8 (0.983), YOLOv7 (0.947), and YOLOv12 (0.92). To the best of our knowledge, this is the most extensive publicly available dataset of its kind, covering 25 categories of chemistry laboratory apparatuses and establishing a strong foundation for future research in laboratory automation.
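
The 70/20/10 split described in the abstract can be sketched as a seeded shuffle followed by slicing. This is a minimal illustration, not the authors' procedure: the summary does not state how images were assigned to subsets or the exact per-subset counts, so the rounding rule (floor for training and validation, remainder to testing), the function name `split_70_20_10`, and the file names are all assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_70_20_10(
    items: Sequence[str], seed: int = 0
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle reproducibly, then slice into 70% / 20% / 10% subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n * 70 // 100
    n_val = n * 20 // 100

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test subset
    return train, val, test


# Hypothetical file names standing in for the 4,599 JPG images.
image_names = [f"img_{i:04d}.jpg" for i in range(4599)]
train, val, test = split_70_20_10(image_names)
print(len(train), len(val), len(test))
```

Because 4,599 is not divisible by 10, the subset sizes depend on the rounding rule chosen, so the counts printed here may differ slightly from those in the published dataset.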

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12589555/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12589555/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/PMC12589555/full.md

---
Source: https://tomesphere.com/paper/PMC12589555