# Soft Optical Sensor for Embryo Quality Evaluation Based on Multi-Focal Image Fusion and RAG-Enhanced Vision Transformers

**Authors:** Domas Jonaitis, Vidas Raudonis, Egle Drejeriene, Agne Kozlovskaja-Gumbriene, Andres Salumets

PMC · DOI: 10.3390/s26051441 · Sensors (Basel, Switzerland) · 2026-02-25

## TL;DR

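A soft optical sensor combines multi-focal image fusion with a RAG-enhanced Swin Transformer to grade embryo quality. It reaches 94.11% accuracy on 102,308 clinical images, 9.43% above single-plane analysis, and provides text rationales grounded in ESHRE guidelines.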

## Contribution

A novel soft optical sensor using multi-focal image fusion and RAG-enhanced vision transformers for embryo classification.

## Key findings

- Multi-focal image fusion improves embryo classification accuracy by 9.43% compared to single-plane microscopy.
- The Swin-Transformer Soft Sensor achieves 94.11% diagnostic accuracy on a dataset of 102,308 clinical images.
- RAG integration provides explainable AI rationales based on ESHRE guidelines, improving clinician trust.

## Abstract

What are the main findings?

- Multi-focal image fusion increases embryo classification accuracy by 9.43% compared to standard single-plane microscopy, effectively recovering lost morphological details.
- The proposed Swin-Transformer Soft Sensor achieves 94.11% diagnostic accuracy on a large-scale clinical dataset (N = 102,308).

What are the implications of the main findings?

- Automated Z-stack fusion eliminates the need for subjective manual focusing, significantly reducing inter-observer variability in IVF laboratories.
- Integration of Retrieval-Augmented Generation (RAG) bridges the “trust gap” in medical AI by providing clinicians with verifiable, text-based rationales grounded in ESHRE consensus guidelines.

Assessing human embryo quality is a critical step in in vitro fertilization (IVF), yet traditional manual grading remains subjective and physically limited by the shallow depth-of-field in conventional microscopy. This study develops a novel “soft optical sensor” architecture that transforms standard optical microscopy into an automated, high-precision instrument for embryo quality assessment. The proposed system integrates two key computational innovations: (1) a multi-focal image fusion module that reconstructs lost morphological details from Z-stack focal planes, effectively creating a 3D-aware representation from 2D inputs; and (2) a retrieval-augmented generation (RAG) framework coupled with a Swin Transformer to provide both high-accuracy classification and explainable clinical rationales. Validated on a large-scale clinical dataset of 102,308 images (prior to augmentation), the system achieves a diagnostic accuracy of 94.11%. This performance surpasses standard single-plane analysis methods by 9.43%, demonstrating the critical importance of fusing multi-focal data. Furthermore, the RAG module successfully grounds model predictions in standard ESHRE consensus guidelines, generating natural language explanations. The results demonstrate that this soft sensor approach significantly reduces inter-observer variability and offers a robust tool for standardized morphological assessment, though prospective validation against live birth outcomes remains essential for clinical adoption.
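To make the idea of multi-focal fusion concrete, the following is a minimal sketch of one common focus-stacking approach: for each pixel, keep the value from the focal plane with the highest local sharpness. This summary does not describe the authors' fusion module, so this is not their method. The Laplacian-based focus measure, the box smoothing, and the function names `laplacian_energy`, `box_blur`, and `fuse_z_stack` are all assumptions chosen for illustration.

```python
import numpy as np


def laplacian_energy(img: np.ndarray) -> np.ndarray:
    """Absolute 4-neighbour Laplacian response; large where the image is sharp."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
           - 4.0 * p[1:-1, 1:-1])
    return np.abs(lap)


def box_blur(img: np.ndarray, radius: int = 2) -> np.ndarray:
    """Mean filter over a (2*radius+1)^2 window, used to denoise the focus measure."""
    k = 2 * radius + 1
    h, w = img.shape
    p = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + h, dx:dx + w]
    return out / (k * k)


def fuse_z_stack(stack: np.ndarray, radius: int = 2) -> np.ndarray:
    """Fuse a (planes, H, W) grayscale Z-stack into one (H, W) image.

    Illustrative assumption: per pixel, select the plane whose smoothed
    Laplacian energy is highest (a hard, winner-takes-all selection).
    """
    focus = np.stack([box_blur(laplacian_energy(plane), radius) for plane in stack])
    best = np.argmax(focus, axis=0)  # (H, W) index of the sharpest plane
    return np.take_along_axis(stack, best[None, ...], axis=0)[0]
```

A call such as `fuse_z_stack(stack)` expects the focal planes to be spatially aligned already. Hard per-pixel selection can leave seams where the chosen plane changes, which is one reason practical pipelines often blend planes with soft weights or use learned fusion instead.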

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12987002/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12987002/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12987002/full.md

---
Source: https://tomesphere.com/paper/PMC12987002