Which Reconstruction Model Should a Robot Use? Routing Image-to-3D Models for Cost-Aware Robotic Manipulation

Akash Anand; Aditya Agarwal; Leslie Pack Kaelbling

arXiv:2603.27797·cs.RO·March 31, 2026

TL;DR

This paper introduces SCOUT, a routing framework that selects, for each input image, which 3D reconstruction model a robot should query for manipulation, trading reconstruction quality against computational cost.

Contribution

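SCOUT decouples each reconstruction score into the relative performance of the viewpoint-dependent models, captured by a learned probability distribution, and the overall image difficulty, captured by a scalar partition function estimate. This enables flexible, cost-aware model selection without retraining.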

Findings

01 SCOUT outperforms routing baselines across multiple datasets and metrics.

02 The framework supports arbitrary cost constraints at inference time.

03 SCOUT is validated through robotic grasping and manipulation experiments.

Abstract

Robotic manipulation tasks require 3D mesh reconstructions of varying quality: dexterous manipulation demands fine-grained surface detail, while collision-free planning tolerates coarser representations. Multiple reconstruction methods offer different cost-quality tradeoffs, from Image-to-3D models - whose output quality depends heavily on the input viewpoint - to view-invariant methods such as structured light scanning. Querying all models is computationally prohibitive, motivating per-input model selection. We propose SCOUT, a novel routing framework that decouples reconstruction scores into two components: (1) the relative performance of viewpoint-dependent models, captured by a learned probability distribution, and (2) the overall image difficulty, captured by a scalar partition function estimate. As the learned network operates only over the viewpoint-dependent models,…
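The two-component decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that a viewpoint-dependent model's score is the product of its learned probability and the partition function estimate, that view-invariant methods have fixed, known scores, and that the cost constraint is a simple budget. The names `route`, `logits`, `log_z`, `fixed_models`, and `budget`, and all numbers, are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(logits, log_z, view_dep_costs, fixed_models, budget):
    """Return the name of the best-scoring model that fits the budget.

    logits         : hypothetical per-image network outputs, one per
                     viewpoint-dependent model (relative performance)
    log_z          : hypothetical log of the partition function estimate
                     (overall image difficulty)
    view_dep_costs : cost of each viewpoint-dependent model
    fixed_models   : {name: (score, cost)} for view-invariant methods
    budget         : maximum allowed cost, chosen at inference time
    """
    probs = softmax(logits)
    z = math.exp(log_z)
    # ASSUMPTION: estimated score = partition function * probability.
    candidates = {
        f"view_dep_{i}": (z * p, c)
        for i, (p, c) in enumerate(zip(probs, view_dep_costs))
    }
    # View-invariant methods are added without touching the network.
    candidates.update(fixed_models)
    feasible = {k: v for k, v in candidates.items() if v[1] <= budget}
    if not feasible:
        return None  # no model fits the budget
    return max(feasible, key=lambda k: feasible[k][0])


# Illustrative values only; none come from the paper.
logits = [2.0, 0.0]
costs = [5.0, 1.0]
fixed = {"structured_light": (0.95, 20.0)}
for budget in (1.0, 10.0, 30.0):
    print(budget, route(logits, 0.0, costs, fixed, budget))
```

In this sketch the budget is an argument of `route` rather than something the network is trained on, so changing it or adding an entry to `fixed_models` requires no retraining, which is the property the summary above highlights.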
