FalconApp: Rapid iPhone Deployment of End-to-End Perception via Automatically Labeled Synthetic Data
Yan Miao, Will Shen, Sayan Mitra

TL;DR
FalconApp turns a short handheld capture of a rigid object into automatically labeled synthetic training data, then trains a perception model for robotics and deploys it on an iPhone.
Contribution
FalconApp introduces a rapid mobile deployment pipeline paired with a photorealistic auto-labeling workflow, which replaces manual annotation for mask detection and 6-DoF pose estimation.
Findings
Perception models trained with FalconApp achieve around 30 ms latency on iPhone.
The models outperform the Perspective-n-Point (PnP) baseline in pose accuracy on most tested objects.
Synthetic data generation and training take approximately 20 minutes per object.
Abstract
Reliable perception for robotics depends on large-scale labeled data, yet real-world datasets rely on heavy manual annotation and are time-consuming to produce. We present FalconApp, an iPhone app with an end-to-end frontend-backend pipeline that turns a short handheld capture of a rigid object into a perception module for mask detection and 6-DoF pose estimation. Our core contribution is a rapid mobile deployment pipeline paired with a photorealistic auto-labeling workflow: from a user-captured video of an object, FalconApp reconstructs an editable GSplat asset, composites it with diverse photorealistic backgrounds, renders synthetic images with ground-truth masks and poses, trains the perception module, and deploys it back to the iPhone frontend. Experiments across five rigid objects with diverse geometry and appearance show that FalconApp produces usable perception models with about…
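The auto-labeling step described in the abstract can be illustrated with a minimal sketch. It assumes the renderer returns an RGB image and a per-pixel alpha channel for the object at a known pose; the names `composite_and_label` and `LabeledSample`, the alpha threshold, and the pose layout are hypothetical and are not taken from FalconApp. Because the object is rendered rather than photographed, the mask and pose labels come for free:

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB values in [0, 1]


@dataclass
class LabeledSample:
    image: List[List[Pixel]]   # composited training image
    mask: List[List[int]]      # 1 where the object is visible, else 0
    pose: Tuple[float, ...]    # pose used for rendering (assumed layout: x, y, z, roll, pitch, yaw)


def composite_and_label(fg_rgb, fg_alpha, bg_rgb, pose, threshold=0.5):
    """Alpha-composite a rendered object over a background and emit labels.

    fg_rgb, bg_rgb: rows of RGB pixels; fg_alpha: rows of alpha values in [0, 1].
    The threshold for the binary mask is an illustrative choice.
    """
    image, mask = [], []
    for fg_row, alpha_row, bg_row in zip(fg_rgb, fg_alpha, bg_rgb):
        image_row, mask_row = [], []
        for fg, a, bg in zip(fg_row, alpha_row, bg_row):
            # Standard "over" compositing: a * foreground + (1 - a) * background.
            image_row.append(tuple(a * f + (1.0 - a) * b for f, b in zip(fg, bg)))
            # The ground-truth mask is derived directly from the rendered alpha.
            mask_row.append(1 if a >= threshold else 0)
        image.append(image_row)
        mask.append(mask_row)
    # The pose label is simply the pose the object was rendered at.
    return LabeledSample(image=image, mask=mask, pose=tuple(pose))
```

Calling this once per (background, pose) pair would yield a labeled dataset without any manual annotation; a real pipeline would operate on image arrays rather than nested lists.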
