Supporting Experts with a Multimodal Machine-Learning-Based Tool for Human Behavior Analysis of Conversational Videos
Riku Arakawa, Kiyosu Maeda, Hiromu Yakura

TL;DR
This paper introduces Providence, a user-friendly, multimodal machine learning tool designed to assist experts in analyzing conversational videos by streamlining scene search and capturing human behavioral cues.
Contribution
The paper presents Providence, a visual programming tool that allows experts to combine machine learning algorithms without coding, improving efficiency and objectivity in conversational analysis.
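One way to picture what "combining machine learning algorithms" for scene search might look like is as set operations over time intervals. The sketch below is not Providence's implementation. It assumes each detector outputs a list of non-overlapping (start, end) spans in seconds where a cue is active, and the names `intersect`, `union`, `smiling`, `speaking`, and `nodding` are hypothetical illustrations.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec)


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Spans where both cues are active (logical AND).

    Assumes the intervals within each input list do not overlap.
    """
    a, b = sorted(a), sorted(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def union(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Spans where at least one cue is active (logical OR), merged."""
    merged: List[Interval] = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical detector outputs for one conversational video.
smiling = [(2.0, 5.0), (10.0, 14.0)]
speaking = [(4.0, 11.0)]
nodding = [(20.0, 22.0)]

# Query: (smiling AND speaking) OR nodding
scenes = union(intersect(smiling, speaking), nodding)
print(scenes)
```

In a visual programming tool, each function here would correspond to a block that an expert wires together, so the query stays inspectable without writing code.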
Findings
High usability and satisfactory output in scene search tasks
Reduced cognitive load for users
Confirmed objectivity and reusability of the tool in real-world settings
Abstract
Multimodal scene search of conversations is essential for unlocking valuable insights into social dynamics and enhancing our communication. While experts in conversational analysis have their own knowledge and skills to find key scenes, a lack of comprehensive, user-friendly tools that streamline the processing of diverse multimodal queries impedes efficiency and objectivity. To address this, we developed Providence, a visual-programming-based tool whose design follows considerations derived from a formative study with experts. It enables experts to combine various machine learning algorithms to capture human behavioral cues without writing code. Our study showed that it offers favorable usability and satisfactory output while imposing less cognitive load in scene search tasks on conversations, verifying the importance of its customizability and transparency. Furthermore, through the…
Taxonomy
Topics: Speech and dialogue systems · Advanced Text Analysis Techniques · Team Dynamics and Performance
