Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge
Young-Jun Lee, Dokyong Lee, Junyoung Youn, Kyeongjin Oh, Byungsoo Ko, Jonghwan Hyeon, Ho-Jin Choi

TL;DR
Stark is a large-scale dataset and framework for long-term multi-modal social conversations incorporating images and personas, enabling improved multi-modal dialogue modeling and visual imagination.
Contribution
The paper introduces Stark, a long-term multi-modal conversation dataset; Mcu, the multi-modal contextualization framework used to construct it automatically; and Ultron 7B, a multi-modal dialogue model trained on Stark.
Findings
Ultron 7B demonstrates strong visual imagination capabilities.
Stark dataset improves multi-modal social conversation modeling.
Human evaluation confirms dataset effectiveness.
Abstract
Humans share a wide variety of images related to their personal experiences within conversations via instant messaging tools. However, existing works (1) focus on image-sharing behavior in single sessions, which limits long-term social interaction, and (2) lack personalized image-sharing behavior. In this work, we introduce Stark, a large-scale long-term multi-modal conversation dataset that covers a wide range of social personas in a multi-modality format, time intervals, and images. To construct Stark automatically, we propose a novel multi-modal contextualization framework, Mcu, that generates long-term multi-modal dialogue distilled from ChatGPT and our proposed Plan-and-Execute image aligner. Using Stark, we train a multi-modal conversation model, Ultron 7B, which demonstrates impressive visual imagination ability. Furthermore, we demonstrate the effectiveness of…
Taxonomy
Topics: Persona Design and Applications · Innovative Human-Technology Interaction · Information Systems Theories and Implementation
