Batteries, camera, action! Learning a semantic control space for expressive robot cinematography
Rogerio Bonatti, Arthur Bucker, Sebastian Scherer, Mustafa Mukadam, and Jessica Hodgins

TL;DR
This paper introduces a data-driven framework that maps semantic descriptors like 'calm' or 'establishing' to camera control parameters for autonomous aerial cinematography, making expressive shot generation accessible to non-technical users.
Contribution
It develops a semantic control space for camera parameters, based on human perception and cinematography principles, enabling intuitive editing of complex camera behaviors.
Findings
The model successfully generates shots with the expected expressive qualities.
The system generalizes across different scenes, both in simulation and in the real world.
Participants rated the generated shots as matching the desired semantic descriptors.
Abstract
Aerial vehicles are revolutionizing the way film-makers can capture shots of actors by composing novel aerial and dynamic viewpoints. However, despite great advancements in autonomous flight technology, generating expressive camera behaviors is still a challenge and requires non-technical users to edit a large number of unintuitive control parameters. In this work, we develop a data-driven framework that enables editing of these complex camera positioning parameters in a semantic space (e.g. calm, enjoyable, establishing). First, we generate a database of video clips with a diverse range of shots in a photo-realistic simulator, and use hundreds of participants in a crowd-sourcing framework to obtain scores for a set of semantic descriptors for each clip. Next, we analyze correlations between descriptors and build a semantic control space based on cinematography guidelines and human…
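The correlation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the score table, its values, and the helper names `pearson` and `correlation_matrix` are all hypothetical and chosen only for illustration. The sketch assumes each clip already has one averaged crowd-sourced score per descriptor, and computes the Pearson correlation between every pair of descriptors.

```python
from math import sqrt

# Hypothetical per-clip mean scores for three descriptors.
# Illustrative values only; these are NOT data from the paper.
scores = {
    "calm":         [4.5, 1.2, 3.8, 2.0, 4.9],
    "enjoyable":    [4.0, 2.1, 3.5, 2.4, 4.4],
    "establishing": [1.5, 4.2, 2.0, 3.9, 1.1],
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(table):
    """Map every ordered pair of descriptor names to their correlation."""
    names = list(table)
    return {(a, b): pearson(table[a], table[b]) for a in names for b in names}


if __name__ == "__main__":
    for (a, b), r in correlation_matrix(scores).items():
        if a < b:  # print each unordered pair once
            print(f"{a:>12} vs {b:<12} r = {r:+.2f}")
```

Descriptors that correlate strongly in such a table are candidates for being grouped or treated as redundant when choosing the axes of a control space. How the paper actually selects its axes is not stated in the excerpt above.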
