ViserDex: Visual Sim-to-Real for Robust Dexterous In-hand Reorientation

Arjun Bhardwaj; Maximum Wilder-Smith; Mayank Mittal; Vaishakh Patil; Marco Hutter

arXiv:2604.11138 · cs.RO · April 14, 2026

TL;DR

ViserDex introduces a sim-to-real framework using 3D Gaussian Splatting and domain randomization for robust monocular RGB in-hand object reorientation, enabling effective policy training on consumer hardware.

Contribution

The paper presents a novel sim-to-real approach with Gaussian Splatting and curriculum reinforcement learning for dexterous manipulation, reducing hardware and computational requirements.

Findings

01. Outperforms conventional rendering-based pose estimators in challenging environments.

02. Successfully reorients diverse objects with a multi-fingered hand under difficult lighting.

03. Perception and control models can be trained independently on consumer-grade hardware.

Abstract

In-hand object reorientation requires precise estimation of the object pose to handle complex task dynamics. While RGB sensing offers rich semantic cues for pose tracking, existing solutions rely on multi-camera setups or costly ray tracing. We present a sim-to-real framework for monocular RGB in-hand reorientation that integrates 3D Gaussian Splatting (3DGS) to bridge the visual sim-to-real gap. Our key insight is performing domain randomization in the Gaussian representation space: by applying physically consistent, pre-rendering augmentations to 3D Gaussians, we generate photorealistic, randomized visual data for object pose estimation. The manipulation policy is trained using curriculum-based reinforcement learning with teacher-student distillation, enabling efficient learning of complex behaviors. Importantly, both perception and control models can be trained independently on…
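To make the idea of randomizing in the Gaussian representation space more concrete, the following is a minimal sketch of a pre-rendering augmentation. It is not the authors' implementation: the `Gaussian` record, the `randomize_scene` function, and the specific perturbations (a shared per-channel color gain and a small per-Gaussian opacity jitter) are hypothetical choices made for illustration, and this page does not say which augmentations the paper actually applies.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gaussian:
    """Hypothetical, simplified record for one 3D Gaussian."""
    mean: tuple      # (x, y, z) centre
    color: tuple     # (r, g, b), each in [0, 1]
    opacity: float   # in [0, 1]


def clamp01(x):
    """Clamp a value to the closed interval [0, 1]."""
    return min(1.0, max(0.0, x))


def randomize_scene(gaussians, rng, gain_range=(0.7, 1.3), opacity_jitter=0.05):
    """Return a randomized copy of the scene; the input is left untouched.

    Assumption: one color gain per RGB channel is shared by every Gaussian,
    so the whole scene shifts consistently (a crude stand-in for a lighting
    change). Opacity receives a small independent perturbation per Gaussian.
    """
    gain = [rng.uniform(*gain_range) for _ in range(3)]
    randomized = []
    for g in gaussians:
        color = tuple(clamp01(c * k) for c, k in zip(g.color, gain))
        opacity = clamp01(g.opacity + rng.uniform(-opacity_jitter, opacity_jitter))
        randomized.append(replace(g, color=color, opacity=opacity))
    return randomized


# Build a small synthetic scene and randomize it before any rendering step.
rng = random.Random(0)
scene = [
    Gaussian(
        mean=(rng.random(), rng.random(), rng.random()),
        color=(rng.random(), rng.random(), rng.random()),
        opacity=rng.random(),
    )
    for _ in range(100)
]
augmented = randomize_scene(scene, rng)
```

Because the perturbation acts on the Gaussian parameters rather than on rendered pixels, each call yields a new scene variant that a splatting renderer could then turn into a training image. The renderer itself is outside the scope of this sketch.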

Code & Models

Repositories

https://rffr.leggedrobotics.com/works/viserdex
