An LLM-Assisted Toolkit for Inspectable Multimodal Emotion Data Annotation
Zheyuan Kuang, Weiwei Jiang, Nicholas Koemel, Matthew Ahmadi, Emmanuel Stamatakis, Benjamin Tag, Anusha Withana, Zhanna Sarsenbayeva

TL;DR
This paper introduces an LLM-assisted toolkit for annotating multimodal emotion data. It provides an inspectable, event-centered workflow that supports alignment, visualization, and structured annotation of complex multimodal recordings.
Contribution
The paper presents a novel toolkit that integrates LLMs with multimodal data processing to facilitate scalable, inspectable, and accurate emotion data annotation workflows.
Findings
Effective alignment of heterogeneous multimodal recordings.
Enhanced visualization of multimodal signals on shared timelines.
Successful demonstration on VR emotion recordings.
Abstract
Multimodal Emotion Recognition (MER) increasingly depends on fine-grained, evidence-grounded annotations, yet inspection and label construction are hard to scale when cues are dynamic and misaligned across modalities. We present an LLM-assisted toolkit that supports multimodal emotion data annotation through an inspectable, event-centered workflow. The toolkit preprocesses and aligns heterogeneous recordings, visualizes all modalities on an interactive shared timeline, and renders structured signals as video tracks for cross-modal consistency checks. It then detects candidate events and packages synchronized keyframes and time windows as event packets with traceable pointers to the source data. Finally, the toolkit integrates an LLM with modality-specific tools and prompt templates to draft structured annotations for analyst verification and editing. We demonstrate the workflow on…
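To make the "event packet" idea in the abstract concrete, the following is a minimal sketch and not the toolkit's actual implementation. It assumes a single one-dimensional signal on the shared timeline, a simple fixed threshold for candidate detection, and per-modality sample rates and start offsets. The names `EventPacket`, `detect_candidate_windows`, `build_packets`, and the file names used with them are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class EventPacket:
    """Hypothetical container for one candidate event (illustrative only)."""
    event_id: int
    start_s: float
    end_s: float
    keyframe_times_s: List[float]
    # Traceable pointers: modality -> (source file, first sample, last sample).
    sources: Dict[str, Tuple[str, int, int]] = field(default_factory=dict)


def detect_candidate_windows(times_s, values, threshold):
    """Return (start, end) windows where the signal stays above the threshold.

    Assumption: thresholding stands in for whatever detector the toolkit uses.
    """
    windows = []
    start = None
    prev_t = None
    for t, v in zip(times_s, values):
        if v > threshold and start is None:
            start = t  # signal rose above the threshold: open a window
        elif v <= threshold and start is not None:
            windows.append((start, prev_t))  # close at the last sample above it
            start = None
        prev_t = t
    if start is not None:
        windows.append((start, prev_t))  # signal ended while still above
    return windows


def build_packets(windows, modalities):
    """Package each window with keyframes and pointers into every source.

    `modalities` maps a name to (file path, sample rate in Hz, start offset in
    seconds on the shared timeline). All three are assumed inputs.
    """
    packets = []
    for event_id, (start_s, end_s) in enumerate(windows):
        keyframes = [start_s, (start_s + end_s) / 2.0, end_s]
        sources = {}
        for name, (path, rate_hz, offset_s) in modalities.items():
            # Convert shared-timeline seconds to sample indices in the source.
            first = max(0, int(round((start_s - offset_s) * rate_hz)))
            last = max(0, int(round((end_s - offset_s) * rate_hz)))
            sources[name] = (path, first, last)
        packets.append(EventPacket(event_id, start_s, end_s, keyframes, sources))
    return packets
```

Storing sample indices next to the source path is one way to keep each packet traceable: an analyst, or an LLM tool call, can reopen exactly the frames or samples that motivated a drafted annotation.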
Taxonomy
Topics: Emotion and Mood Recognition · Music and Audio Processing · Human Pose and Action Recognition
