IRIS: Learning-Driven Task-Specific Cinema Robot Arm for Visuomotor Motion Control
Qilong Cheng, Matthew Mackay, Ali Bereyhi

TL;DR
IRIS is an affordable, learning-driven robotic camera system that autonomously generates smooth, object-aware cinematic motions from human demonstrations, reducing complexity and cost in robotic cinematography.
Contribution
The paper introduces IRIS, a low-cost, task-specific robotic camera arm driven by a goal-conditioned visuomotor imitation learning framework based on Action Chunking with Transformers (ACT).
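To illustrate what action chunking means in practice, here is a minimal sketch, not the authors' implementation. It assumes a policy that predicts a chunk of k future joint-space actions at every timestep, and blends the overlapping predictions with the exponential weighting described in the original ACT work (w_i = exp(-m * i), with i = 0 for the oldest prediction). Whether IRIS uses this temporal ensembling step is an assumption; `dummy_policy`, `temporal_ensemble`, `rollout`, and the values of `k`, `dof`, and `m` are hypothetical names and settings chosen for illustration.

```python
import math

def dummy_policy(t, k=4, dof=6):
    # Hypothetical stand-in for a trained policy: returns a chunk of k
    # actions, one per future timestep t, t+1, ..., t+k-1.
    return [[0.01 * (t + i)] * dof for i in range(k)]

def temporal_ensemble(chunks, t, m=0.1):
    # chunks[s] is the chunk predicted at timestep s, so chunks[s][t - s]
    # is the action that chunk proposed for timestep t.
    k = len(chunks[0])
    first = max(0, t - k + 1)
    proposals = [chunks[s][t - s] for s in range(first, t + 1)]
    # Oldest proposal gets index 0 and therefore the largest weight.
    weights = [math.exp(-m * i) for i in range(len(proposals))]
    total = sum(weights)
    dof = len(proposals[0])
    return [
        sum(w * p[j] for w, p in zip(weights, proposals)) / total
        for j in range(dof)
    ]

def rollout(n_steps=10, k=4, dof=6, m=0.1):
    # Query the policy every step and execute the blended action.
    chunks, actions = [], []
    for t in range(n_steps):
        chunks.append(dummy_policy(t, k, dof))
        actions.append(temporal_ensemble(chunks, t, m))
    return actions
```

Because every chunk contributes to several consecutive timesteps, the executed action changes gradually even when individual predictions disagree, which is one plausible route to the perceptually smooth camera motion the paper targets.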
Findings
Achieves approximately 1 mm repeatability in motion control.
Supports autonomous, object-aware cinematic camera trajectories.
Costs under $1,000 USD and supports a 1.5 kg payload.
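For context on the repeatability figure, the following is a minimal sketch of one common way to compute positional repeatability, in the style of ISO 9283: the mean distance of repeated end-effector positions from their centroid, plus three sample standard deviations of that distance. The summary does not say which procedure the authors used, so this metric and the function name `positional_repeatability` are assumptions for illustration only.

```python
import math
import statistics

def positional_repeatability(points):
    # points: list of (x, y, z) positions, in mm, measured on repeated
    # visits to the same commanded pose. Needs at least two points.
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    # Distance of each attained position from the centroid.
    dists = [math.dist(p, centroid) for p in points]
    # ISO 9283 style: mean distance plus three sample standard deviations.
    return statistics.mean(dists) + 3 * statistics.stdev(dists)
```

For example, four measurements that each sit exactly 1 mm from their common centroid have zero spread in distance, so the function returns 1.0 mm.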
Abstract
Robotic camera systems enable dynamic, repeatable motion beyond human capabilities, yet their adoption remains limited by the high cost and operational complexity of industrial-grade platforms. We present the Intelligent Robotic Imaging System (IRIS), a task-specific 6-DOF manipulator designed for autonomous, learning-driven cinematic motion control. IRIS integrates a lightweight, fully 3D-printed hardware design with a goal-conditioned visuomotor imitation learning framework based on Action Chunking with Transformers (ACT). The system learns object-aware and perceptually smooth camera trajectories directly from human demonstrations, eliminating the need for explicit geometric programming. The complete platform costs under $1,000 USD, supports a 1.5 kg payload, and achieves approximately 1 mm repeatability. Real-world experiments demonstrate accurate trajectory tracking, reliable…
Taxonomy
Topics: Advanced Vision and Imaging · Robot Manipulation and Learning · Teleoperation and Haptic Systems
