# Efficient Two-Stage Autofocus for Micro-Assembly Based on Joint Spatial-Frequency Image Quality Assessment

**Authors:** Jianpeng Zhang, Tianbo Kang, Xin Zhao, Mingzhu Sun, Yi Yang

PMC · DOI: 10.3390/jimaging12030137 · 2026-03-19

## TL;DR

This paper introduces a two-stage, coarse-to-fine autofocus method for dual-camera micro-assembly systems, driven by a spatial-frequency image quality assessment model (WaveMamba-IQA), to make focusing more robust and more automated.

## Contribution

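WaveMamba-IQA estimates image sharpness by combining the Discrete Wavelet Transform with Vision Transformers, enhanced by Multi-Linear Transposed Attention and Vision Mamba for global context modeling. It drives a coarse-to-fine autofocus workflow for a dual-camera micro-vision system.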

## Key findings

- WaveMamba-IQA achieves a Spearman correlation coefficient of 0.9786 on a custom microsphere dataset.
- The system achieves a 98.33% autofocus success rate under varying lighting conditions.
- The authors report improved robustness and automation in micro-assembly systems compared with manual and traditional focusing techniques.
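
For readers unfamiliar with the metric in the first finding, the Spearman correlation coefficient measures how well the ranking of predicted sharpness scores matches the ranking of reference scores. The following is a minimal sketch of the standard definition (Pearson correlation of average ranks), not the authors' evaluation code; the function names and the sample scores are hypothetical.

```python
import math
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(predicted, reference):
    """Spearman rank correlation between two equal-length score lists."""
    return pearson(average_ranks(predicted), average_ranks(reference))


# Hypothetical scores for five images (illustrative only, not data from the paper).
predicted = [0.12, 0.35, 0.30, 0.80, 0.95]
reference = [1, 2, 3, 4, 5]
print(spearman(predicted, reference))  # prints 0.9
```

Because only ranks matter, any monotonic rescaling of the predicted scores leaves the coefficient unchanged, which is why it suits sharpness models whose outputs are not calibrated to a fixed scale.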

## Abstract

Reliable autofocus is a fundamental prerequisite for precise positioning in micro-assembly systems, where complex reflections, scale variations, and narrow depth-of-field often degrade the robustness of traditional sharpness metrics. To address these challenges, we propose an efficient two-stage autofocus method for a dual-camera micro-vision system based on a spatial-frequency image quality assessment (IQA) model. First, we design WaveMamba-IQA for image sharpness estimation, synergistically combining the Discrete Wavelet Transform with Vision Transformers to capture high-frequency details and semantic features, further enhanced by Multi-Linear Transposed Attention and Vision Mamba for global context modeling. Moreover, we implement a coarse-to-fine autofocus workflow, employing the Covariance Matrix Adaptation Evolution Strategy for global optimization on the horizontal camera, followed by geometric prior-based precise adjustment for the oblique camera. Experimental results on a custom microsphere dataset demonstrate that WaveMamba-IQA achieves a Spearman correlation coefficient of 0.9786. Furthermore, the integrated system achieves a 98.33% autofocus success rate across varying lighting conditions. This method significantly improves the robustness and automation level of micro-assembly systems, effectively overcoming the limitations of manual and traditional focusing techniques.
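
The coarse-to-fine idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' method: it substitutes a hand-crafted single-level Haar detail energy for the learned WaveMamba-IQA score, replaces the Covariance Matrix Adaptation Evolution Strategy with a plain two-pass grid scan, and omits the geometric prior step for the oblique camera. The capture function, focus position, and step sizes are all hypothetical.

```python
import math


def haar_detail_energy(image):
    """Mean energy of the single-level 2D Haar detail coefficients.

    `image` is a list of rows of floats, at least 2x2. Higher values mean
    more high-frequency content, used here as a stand-in sharpness score.
    """
    rows, cols = len(image) // 2 * 2, len(image[0]) // 2 * 2
    energy = 0.0
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_diff = (a + b - d - e) / 2  # top row minus bottom row
            col_diff = (a - b + d - e) / 2  # left column minus right column
            diag = (a - b - d + e) / 2      # diagonal detail
            energy += row_diff ** 2 + col_diff ** 2 + diag ** 2
    return energy / (rows * cols / 4)


def simulate_capture(z, z_focus=3.37, width=2.0, size=8):
    """Hypothetical camera: a checkerboard whose contrast decays with defocus."""
    contrast = math.exp(-((z - z_focus) / width) ** 2)
    return [
        [0.5 + 0.5 * contrast * (1 if (r + c) % 2 == 0 else -1)
         for c in range(size)]
        for r in range(size)
    ]


def coarse_to_fine_focus(capture, z_min, z_max, coarse_step, fine_step):
    """Scan the full range coarsely, then rescan finely around the best hit."""
    def best_in(lo, hi, step):
        n = int(round((hi - lo) / step))
        positions = [lo + i * step for i in range(n + 1)]
        return max(positions, key=lambda z: haar_detail_energy(capture(z)))

    z_coarse = best_in(z_min, z_max, coarse_step)
    lo = max(z_min, z_coarse - coarse_step)
    hi = min(z_max, z_coarse + coarse_step)
    return best_in(lo, hi, fine_step)


# Usage: the simulated focal plane sits at z = 3.37 (arbitrary units).
z_best = coarse_to_fine_focus(simulate_capture, 0.0, 10.0, 1.0, 0.05)
print(round(z_best, 2))
```

A grid scan like this only works when the sharpness curve has a single dominant peak. The abstract's choice of a global optimizer and a learned score suggests that this assumption often fails under the reflections and scale variations it describes.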

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13027907/full.md

---
Source: https://tomesphere.com/paper/PMC13027907