Motion-to-Response Content Generation via Multi-Agent AI System with Real-Time Safety Verification
HyeYoung Lee

TL;DR
This paper presents a multi-agent AI system that generates response media content in real time from audio-derived emotional cues and checks that content against age-appropriateness and stimulation constraints in a dedicated safety verification stage.
Contribution
It introduces a novel multi-agent architecture with explicit safety verification for real-time, response-oriented media content generation from emotional signals.
Findings
Achieved 73.2% emotion recognition accuracy
Reached 89.4% response mode consistency
Ensured 100% safety compliance
Abstract
This paper proposes a multi-agent artificial intelligence system that generates response-oriented media content in real time based on audio-derived emotional signals. Unlike conventional speech emotion recognition studies that focus primarily on classification accuracy, our approach emphasizes the transformation of inferred emotional states into safe, age-appropriate, and controllable response content through a structured pipeline of specialized AI agents. The proposed system comprises four cooperative agents: (1) an Emotion Recognition Agent with CNN-based acoustic feature extraction, (2) a Response Policy Decision Agent for mapping emotions to response modes, (3) a Content Parameter Generation Agent for producing media control parameters, and (4) a Safety Verification Agent enforcing age-appropriateness and stimulation constraints. We introduce an explicit safety verification loop…
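To make the four-agent flow concrete, here is a minimal sketch of how such a pipeline could be wired together. It is an illustration under stated assumptions, not the authors' implementation: the emotion labels, response modes, parameter names (`brightness`, `tempo_bpm`, `volume`), safety limits, and all function names are hypothetical, and the CNN-based recognizer is replaced by a stand-in that simply picks the most probable label from precomputed class probabilities.

```python
from dataclasses import dataclass, replace

# All labels, modes, values, and limits below are hypothetical placeholders;
# the abstract does not specify them.
RESPONSE_POLICY = {"happy": "amplify", "sad": "soothe",
                   "angry": "calm", "neutral": "maintain"}

SAFETY_LIMITS = {"brightness": (0.1, 0.8),
                 "tempo_bpm": (50.0, 120.0),
                 "volume": (0.0, 0.7)}


@dataclass(frozen=True)
class ContentParams:
    brightness: float
    tempo_bpm: float
    volume: float


MODE_PARAMS = {
    "amplify": ContentParams(brightness=0.95, tempo_bpm=140.0, volume=0.9),
    "soothe": ContentParams(brightness=0.4, tempo_bpm=70.0, volume=0.4),
    "calm": ContentParams(brightness=0.3, tempo_bpm=60.0, volume=0.3),
    "maintain": ContentParams(brightness=0.5, tempo_bpm=90.0, volume=0.5),
}


def recognize_emotion(probs: dict) -> str:
    """Stand-in for the Emotion Recognition Agent: return the top label."""
    return max(probs, key=probs.get)


def decide_response_mode(emotion: str) -> str:
    """Response Policy Decision Agent: map an emotion to a response mode."""
    return RESPONSE_POLICY.get(emotion, "maintain")


def generate_params(mode: str) -> ContentParams:
    """Content Parameter Generation Agent: look up media control parameters."""
    return MODE_PARAMS[mode]


def find_violations(params: ContentParams) -> list:
    """Safety Verification Agent: list parameters outside their limits."""
    return [name for name, (low, high) in SAFETY_LIMITS.items()
            if not low <= getattr(params, name) <= high]


def clamp(params: ContentParams, names: list) -> ContentParams:
    """Pull each offending parameter back inside its allowed range."""
    fixed = {n: min(max(getattr(params, n), SAFETY_LIMITS[n][0]),
                    SAFETY_LIMITS[n][1]) for n in names}
    return replace(params, **fixed)


def run_pipeline(probs: dict, max_rounds: int = 3):
    """Run the four agents in order, repeating verification until it passes."""
    emotion = recognize_emotion(probs)
    mode = decide_response_mode(emotion)
    params = generate_params(mode)
    for _ in range(max_rounds):
        violations = find_violations(params)
        if not violations:
            return emotion, mode, params
        params = clamp(params, violations)
    raise RuntimeError("content failed safety verification")


emotion, mode, params = run_pipeline({"happy": 0.7, "sad": 0.2, "neutral": 0.1})
print(emotion, mode, params)
```

In this sketch the verification loop repairs unsafe parameters by clamping them, so content is only returned once every parameter is inside its limits. Whether the paper's loop repairs, regenerates, or rejects content is not stated in the visible part of the abstract.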
Taxonomy
Topics: Emotion and Mood Recognition · Speech Recognition and Synthesis · Face recognition and analysis
