TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World

Hongpeng Lin; Ludan Ruan; Wenke Xia; Peiyu Liu; Jingyuan Wen; Yixin Xu; Di Hu; Ruihua Song; Wayne Xin Zhao; Qin Jin; Zhiwu Lu

arXiv:2301.05880 · cs.CL · September 11, 2023 · 1 citation

TL;DR

TikTalk is a new large-scale video-based multi-modal dialogue dataset that captures real-world chitchat, presenting unique challenges and opportunities for developing more human-like multi-modal conversational AI.

Contribution

The paper introduces TikTalk, a comprehensive video-based multi-modal dialogue dataset with diverse context types, and evaluates baseline models, highlighting the potential of LLMs and external knowledge integration.

Findings

1. Models built on large language models generate more diverse responses.
2. Knowledge graph-based models perform best overall.
3. Current models still struggle with complex multi-modal understanding.

Abstract

To facilitate the research on intelligent and human-like chatbots with multi-modal context, we introduce a new video-based multi-modal dialogue dataset, called TikTalk. We collect 38K videos from a popular video-sharing platform, along with 367K conversations posted by users beneath them. Users engage in spontaneous conversations based on their multi-modal experiences from watching videos, which helps recreate real-world chitchat context. Compared to previous multi-modal dialogue datasets, the richer context types in TikTalk lead to more diverse conversations, but also increase the difficulty in capturing human interests from intricate multi-modal information to generate personalized responses. Moreover, external knowledge is more frequently evoked in our dataset. These facts reveal new challenges for multi-modal dialogue models. We quantitatively demonstrate the characteristics of…

Code & Models

Repositories

ruc-aimind/tiktalk (official)

Taxonomy

Topics: Child Development and Digital Technology · Speech and dialogue systems · ICT in Developing Communities