Animation Synthesis Triggered by Vocal Mimics
Adrien Nivaggioli (LIX), Damien Rohmer (LIX)

TL;DR
This paper introduces a novel method for controlling animations through vocal mimicry of onomatopoeia sounds, enabling natural and flexible synchronization of animation events with user-recorded soundtracks.
Contribution
It presents an automatic analysis and synthesis approach that links voice mimics to specific animation events, allowing multiple stories and characters to be controlled with recorded sound sequences.
Findings
Effective automatic sound analysis for event detection
Flexible animation control using voice mimics
Multiple story and character generation demonstrated
Abstract
We propose a method leveraging the naturally time-related expressivity of our voice to control an animation composed of a set of short events. The user records themselves mimicking onomatopoeic sounds such as "Tick", "Pop", or "Chhh", each of which is associated with a specific animation event. The recorded soundtrack is automatically analyzed to extract the instant and type of every sound. We finally synthesize an animation in which the type and timing of each event correspond to the soundtrack. In addition to being a natural way to control animation timing, we demonstrate that multiple stories can be efficiently generated by recording different voice sequences. Also, the use of more than one soundtrack allows us to control different characters with overlapping actions.
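To make the synthesis step concrete, here is a minimal Python sketch of how detected sounds could be turned into a timed animation sequence, with one soundtrack per character. It is an illustration under stated assumptions, not the authors' implementation: the names `SoundEvent`, `AnimationEvent`, `CLIP_FOR_SOUND`, and `synthesize_timeline`, as well as the clip names, are hypothetical, and the sound analysis itself is assumed to have already produced a list of (time, label) pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundEvent:
    time: float  # seconds from the start of the soundtrack
    label: str   # recognized sound type, e.g. "tick" (assumed analysis output)


@dataclass(frozen=True)
class AnimationEvent:
    time: float     # when the clip should start, in seconds
    character: str  # which character plays the clip
    clip: str       # name of the short animation to trigger


# Hypothetical association between sound types and animation clips.
CLIP_FOR_SOUND = {"tick": "step", "pop": "jump", "chhh": "slide"}


def synthesize_timeline(tracks):
    """Merge per-character soundtracks into one time-ordered event list.

    `tracks` maps a character name to the list of SoundEvent detected in
    that character's soundtrack. Sounds with no associated clip are skipped.
    """
    timeline = []
    for character, events in tracks.items():
        for event in events:
            clip = CLIP_FOR_SOUND.get(event.label)
            if clip is None:
                continue  # unrecognized sound type: no animation to trigger
            timeline.append(AnimationEvent(event.time, character, clip))
    # Events from different characters may interleave or overlap in time.
    timeline.sort(key=lambda e: (e.time, e.character))
    return timeline


tracks = {
    "hero": [SoundEvent(0.5, "tick"), SoundEvent(1.2, "pop")],
    "sidekick": [SoundEvent(1.0, "chhh"), SoundEvent(2.0, "unknown")],
}
timeline = synthesize_timeline(tracks)
for event in timeline:
    print(f"{event.time:.1f}s  {event.character}: {event.clip}")
```

Recording a different voice sequence changes only the `tracks` input, which is one way to read the claim that multiple stories can be generated from the same set of animation events.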
