# Monocular suture needle pose detection using synthetic data augmented convolutional neural network

**Authors:** Yifan Wang, Saul Alexis Heredia Perez, Kanako Harada

PMC · DOI: 10.1007/s11548-025-03467-1 · 2025-06-24

## TL;DR

This paper introduces a CNN-based method using synthetic data to estimate the position and orientation of a suture needle in robotic microsurgery under monocular vision.

## Contribution

A novel CNN-based approach that combines synthetic and real data to estimate suture needle pose in monocular images.

## Key findings

- On synthetic data, the method achieved average translation errors of 0.107 mm, 0.118 mm, and 0.098 mm for the tip, middle, and end keypoints, respectively.
- On real data, the average 2D translation errors were 0.047 mm, 0.052 mm, and 0.049 mm for the respective keypoints.
- 93.85% of detected keypoints had errors below 4 pixels in real-world evaluations.

## Abstract

Robotic microsurgery enhances the surgeon's dexterity and stability when performing precise and delicate surgical procedures at the microscopic level. Accurate needle pose estimation is critical for robotic micro-suturing, enabling optimized insertion trajectories and facilitating autonomous control. However, accurately estimating the pose of a needle during manipulation, particularly under monocular vision, remains a challenge. This study proposes a convolutional neural network-based method to estimate the pose of a suture needle from monocular images.

The 3D pose of the needle is estimated using keypoint information from 2D images. A convolutional neural network was trained to estimate the positions of three keypoints on the needle: the tip, the middle point and the end point. A hybrid dataset comprising images from both real-world and synthetic simulated environments was developed to train the model. Subsequently, an algorithm was designed to estimate the 3D positions of these keypoints. The 2D keypoint detection and 3D orientation estimation were evaluated by translation and orientation error metrics, respectively.
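As a rough illustration of how three 3D keypoints can yield an orientation, the sketch below derives a normal vector and a direction vector from the tip, middle and end points. The definitions used here are assumptions, not taken from the paper: the direction vector is taken as the unit chord from end to tip, and the normal vector as the unit normal of the plane through the three points. The function name `needle_orientation` and the sample coordinates are hypothetical.

```python
import numpy as np

def needle_orientation(tip, middle, end):
    """Return (normal, direction) unit vectors from three 3D keypoints.

    Assumed definitions (not confirmed by the paper):
      direction = unit vector along the chord from end to tip
      normal    = unit normal of the plane through tip, middle and end
    """
    tip, middle, end = (np.asarray(p, dtype=float) for p in (tip, middle, end))

    chord = tip - end
    direction = chord / np.linalg.norm(chord)

    # The cross product of two in-plane vectors is perpendicular to the plane.
    normal = np.cross(tip - middle, end - middle)
    length = np.linalg.norm(normal)
    if length < 1e-12:
        raise ValueError("keypoints are collinear; the needle plane is undefined")
    return normal / length, direction

# Illustrative keypoints (mm) on a semicircular arc lying in the z = 0 plane.
tip = [1.0, 0.0, 0.0]
middle = [0.0, 1.0, 0.0]
end = [-1.0, 0.0, 0.0]
normal, direction = needle_orientation(tip, middle, end)
```

The sign of the normal depends on the order of the two in-plane vectors in the cross product, so any comparison against ground truth would need a consistent convention.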

Experiments conducted on synthetic data showed average translation errors of 0.107 mm, 0.118 mm and 0.098 mm for the tip, middle and end points, respectively, and average orientation angular errors of 12.75° for the normal vector and 15.55° for the direction vector. When evaluated on real data, the method demonstrated 2D translation errors averaging 0.047 mm, 0.052 mm and 0.049 mm for the respective keypoints, with 93.85% of detected keypoints having errors below 4 pixels.
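The three kinds of figures reported above (per-keypoint translation error, angular error between vectors, and the share of keypoints under a pixel threshold) can be computed along the following lines. This is a minimal sketch with hypothetical function names and made-up sample values; it assumes Euclidean distance for translation error and the angle between two vectors for orientation error, which the abstract does not spell out.

```python
import numpy as np

def translation_errors(pred, truth):
    """Euclidean distance per keypoint; both inputs have shape (N, D)."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return np.linalg.norm(diff, axis=1)

def angular_error_deg(u, v):
    """Angle in degrees between two vectors (need not be unit length)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def fraction_below(errors, threshold):
    """Share of errors strictly below the threshold."""
    return float(np.mean(np.asarray(errors) < threshold))

# Made-up 2D detections and ground truth, in pixels.
pred_px = np.array([[100.0, 100.0], [203.0, 204.0], [300.0, 310.0]])
true_px = np.array([[100.0, 102.0], [200.0, 200.0], [300.0, 300.0]])

errors_px = translation_errors(pred_px, true_px)   # distances of 2, 5 and 10 px
share = fraction_below(errors_px, threshold=4.0)   # one of three is below 4 px
angle = angular_error_deg([1, 0, 0], [0, 1, 0])    # perpendicular vectors
```

Reporting pixel errors in millimetres would additionally require the image scale of the microscope setup, which is not stated in this summary.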

This study presents a CNN-based method, augmented with synthetic images, to estimate the pose of a suture needle in monocular vision. Experimental results indicate that the method effectively estimates the 2D positions and 3D orientations of the suture needle in synthetic images. The model also shows reasonable performance with real data, highlighting its promise for real-time application in robotic microsurgery.

The online version contains supplementary material available at 10.1007/s11548-025-03467-1.

## Full-text entities

- **Diseases:** hand tremors (MESH:D014202)
- **Chemicals:** silicone (MESH:D012828)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12518471/full.md

---
Source: https://tomesphere.com/paper/PMC12518471