Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation

Dilermando Almeida; Guilherme Lazzarini; Juliano Negri; Thiago H. Segreto; Ricardo V. Godoy; Marcelo Becker

arXiv:2508.17466 · cs.RO · May 6, 2026

TL;DR

This paper introduces a deep learning framework for arm-equipped quadruped robots that aims to improve grasping precision and adaptability through a sim-to-real approach, using synthetic data and multi-modal sensor inputs.

Contribution

The paper develops a pipeline that combines synthetic dataset generation, a multi-modal CNN model, and real-world validation to improve robot grasping.

Findings

01. Successful autonomous loco-manipulation on a quadruped robot

02. Effective sim-to-real transfer using synthetic grasp datasets

03. Model processes multi-modal sensor data to identify optimal grasp points
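As a rough illustration of the last finding, the sketch below picks a grasp pixel from a grasp-quality heatmap by taking its highest-scoring location, optionally restricted to an object mask. The function name `select_grasp_pixel`, the mask restriction, and the plain argmax rule are assumptions made for this example; the page does not say how the optimal point is extracted from the model output.

```python
import numpy as np


def select_grasp_pixel(heatmap, object_mask=None):
    """Return (row, col, score) of the best pixel in a 2-D grasp-quality heatmap.

    Hypothetical helper: a plain argmax, optionally limited to pixels where
    object_mask is True. The paper's actual selection rule is not given here.
    """
    heatmap = np.asarray(heatmap, dtype=np.float32)
    if heatmap.ndim != 2:
        raise ValueError("heatmap must have shape (H, W)")

    scores = heatmap
    if object_mask is not None:
        mask = np.asarray(object_mask, dtype=bool)
        if mask.shape != heatmap.shape:
            raise ValueError("object_mask must match the heatmap shape")
        if not mask.any():
            raise ValueError("object_mask selects no pixels")
        # Pixels outside the mask can never win the argmax.
        scores = np.where(mask, heatmap, -np.inf)

    # argmax works on the flattened array; convert back to (row, col).
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(row), int(col), float(heatmap[row, col])
```

The returned pixel would still need to be back-projected into 3-D using the depth map and camera intrinsics before a manipulator could act on it; that step is outside this sketch.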

Abstract

This paper presents a deep learning framework designed to enhance the grasping capabilities of quadrupeds equipped with arms, with a focus on improving precision and adaptability. Our approach centers on a sim-to-real methodology that minimizes reliance on physical data collection. We developed a pipeline within the Genesis simulation environment to generate a synthetic dataset of grasp attempts on common objects. By simulating thousands of interactions from various perspectives, we created pixel-wise annotated grasp-quality maps to serve as the ground truth for our model. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multi-modal input from onboard RGB and depth cameras, comprising RGB images, depth maps, segmentation masks, and surface normal maps. The trained model outputs a grasp-quality heatmap to identify the optimal grasp point. We…
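The multi-modal input described in the abstract can be pictured as a single channel-stacked array. The sketch below is one possible layout, not the authors' preprocessing: the function name `stack_modalities`, the channel order (3 RGB + 1 depth + 1 mask + 3 normals = 8 channels), the `max_depth_m` normalization constant, and the channels-first output are all assumptions chosen for illustration.

```python
import numpy as np


def stack_modalities(rgb, depth, mask, normals, max_depth_m=2.0):
    """Stack RGB, depth, segmentation mask, and surface normals into (8, H, W).

    Hypothetical layout: channels 0-2 RGB in [0, 1], channel 3 depth scaled by
    max_depth_m and clipped to [0, 1], channel 4 binary mask, channels 5-7
    surface normals (left unchanged).
    """
    rgb = np.asarray(rgb, dtype=np.float32) / 255.0       # (H, W, 3)
    depth = np.asarray(depth, dtype=np.float32)           # (H, W), metres
    mask = np.asarray(mask, dtype=np.float32)             # (H, W), 0 or 1
    normals = np.asarray(normals, dtype=np.float32)       # (H, W, 3)

    h, w = depth.shape
    if rgb.shape != (h, w, 3) or normals.shape != (h, w, 3) or mask.shape != (h, w):
        raise ValueError("all modalities must share the same height and width")

    depth = np.clip(depth / max_depth_m, 0.0, 1.0)

    # Concatenate along the channel axis, then move channels first.
    stacked = np.concatenate(
        [rgb, depth[..., None], mask[..., None], normals], axis=-1
    )
    return np.transpose(stacked, (2, 0, 1))
```

A channels-first array of this kind is the shape a typical convolutional encoder-decoder would take as input, with the network's single-channel output read as the grasp-quality heatmap.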
