Robust Component Detection for Flexible Manufacturing: A Deep Learning Approach to Tray-Free Object Recognition under Variable Lighting

Fatemeh Sadat Daneshmand

arXiv:2507.00852 · cs.CV · July 2, 2025

Open Access · 5 Reviews

TL;DR

This paper introduces a deep learning-based vision system enabling industrial robots to detect and grasp components in unstructured environments with variable lighting, eliminating the need for trays and improving manufacturing flexibility.

Contribution

It presents a Mask R-CNN-based approach to tray-free detection of pen components that is robust to lighting variations, improving flexibility in industrial manufacturing.

Findings

1. 95% detection accuracy under diverse lighting
2. 30% reduction in setup time
3. Effective in real-world manufacturing scenarios

Abstract

Flexible manufacturing systems in Industry 4.0 require robots capable of handling objects in unstructured environments without rigid positioning constraints. This paper presents a computer vision system that enables industrial robots to detect and grasp pen components in arbitrary orientations without requiring structured trays, while maintaining robust performance under varying lighting conditions. We implement and evaluate a Mask R-CNN-based approach on a complete pen manufacturing line at ZHAW, addressing three critical challenges: object detection without positional constraints, robustness to extreme lighting variations, and reliable performance with cost-effective cameras. Our system achieves 95% detection accuracy across diverse lighting conditions while eliminating the need for structured component placement, demonstrating a 30% reduction in setup time and significant improvement…
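
The excerpt above does not say how the 95% detection accuracy is computed. As a minimal sketch, assuming one common convention (a ground-truth object counts as detected when an unused predicted box overlaps it with intersection-over-union of at least 0.5), the figure could be read as follows. The function names `iou` and `detection_rate`, the box format, and the threshold are illustrative assumptions, not the paper's evaluation code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_rate(ground_truth, predictions, threshold=0.5):
    """Fraction of ground-truth boxes matched by a distinct prediction.

    Assumed metric for illustration only: greedy one-to-one matching,
    each prediction may be used for at most one ground-truth box.
    """
    unused = list(predictions)
    matched = 0
    for gt in ground_truth:
        # Pick the remaining prediction that overlaps this object most.
        best = max(unused, key=lambda p: iou(gt, p), default=None)
        if best is not None and iou(gt, best) >= threshold:
            matched += 1
            unused.remove(best)
    return matched / len(ground_truth) if ground_truth else 0.0


# Hypothetical boxes: one component is found, the other is missed.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(1, 1, 10, 10), (50, 50, 60, 60)]
rate = detection_rate(ground_truth, predictions)
```

Under this reading, reporting the rate separately per lighting condition would show whether the overall figure hides weaker performance in dark or overexposed scenes.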

Peer Reviews

Decision · ICLR 2026 Conference Desk Rejected Submission

Reviewer 01 · Rating 0 · Confidence 5

Strengths

1. The technical details are sound and can be reproduced.
2. The introduced vision system can work well in variable lighting conditions and bring significant engineering value.

Weaknesses

1. The author should identify an influential academic point around which to write this paper. I do not think a tray-free object detection and pick-up algorithm is still a challenging problem in industrial environments. However, limited generalization ability of object detection, or detection of highly reflective industrial parts, etc., are well-known challenging problems.
2. The novelty of the Mask R-CNN-based model should be identified and improved.
3. There are too many object detection methods, such

Reviewer 02 · Rating 0 · Confidence 4

Strengths

1. The paper is generally well written and organized.
2. The improvements over the classical method (Bagheri et al. 2020) are solid, particularly for dark conditions (Table 3).
3. The real-world implementation is commendable, and not something that you often see in ICLR submissions.

Weaknesses

My biggest concern for this paper is that it doesn't seem to be in scope for ICLR. There are no technical developments in deep learning, machine learning, etc., as the authors are just applying an existing method, Mask R-CNN, unchanged, to a new and very niche application domain. On that note, the evaluation is quite limited, on a single small dataset, with no comparison to other learning-based detection methods (old or modern). Overall, because of the lack of any technical developments in deep

Reviewer 03 · Rating 0 · Confidence 5

Strengths

An experiment on industrial data.

Weaknesses

No deep study, no novelty.

Reviewer 04 · Rating 2 · Confidence 4

Strengths

- The paper tackles a practically relevant problem
- Evaluation of the approach under various lighting conditions
- The problem setting is described clearly

Weaknesses

- I am surprised to see this paper at ICLR and I am wondering whether there wouldn't be a more suitable venue
- The approach seems very practical, but the scientific contribution does not become sufficiently clear
- (Strong) baselines are missing (e.g., other models apart from Mask R-CNN)
- The dataset seems rather small (Table 1) and its description seems inconsistent: Why are there 8 images of 87 objects? Were the column headings mixed up? In Section 3.1, it is mentioned that 22 images were us

Reviewer 05 · Rating 0 · Confidence 5

Strengths

This paper targets a concrete, real-world object detection problem, yielding promising results within the considered evaluation pipeline.

Weaknesses

1. The writing of the paper needs to be revised significantly. The paper reads like a barely finished report, not even a technical one.
2. The novelty and the contributions of the work are very limited.
3. This work could potentially contribute some insightful findings after re-designing the whole experiment and rewriting the whole paper. However, I couldn't find any useful insights from this work in its current form.

Taxonomy

Topics: Robot Manipulation and Learning · Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection