VRM-Phase I VKW system description of long-short video customizable keyword wakeup challenge
Yougen Yuan, Zhiqiang Lv, Shen Huang, Pengfei Hu

TL;DR
This paper describes the VKW challenge for building flexible Chinese video keyword wakeup systems capable of handling multiple and custom keywords, with experimental results from participating teams.
Contribution
It introduces a new public dataset and evaluation framework for Chinese video keyword wakeup, fostering research in customizable wakeup systems.
Findings
Submitted systems are required to support multiple different keywords.
Submitted systems must also support wakeup on any custom keyword.
The paper reports experimental results from some of the participating teams.
Abstract
Keyword wakeup technology has long been a research hotspot in speech processing, but much of the related work has been carried out on different datasets. We organized a Chinese long-short video keyword wakeup challenge (Video Keyword Wakeup Challenge, VKW) to test each participating team's ability to build a keyword wakeup system on the public dataset. All submitted systems must support not only multiple different keywords, but also wakeup on any custom keyword. This paper mainly describes the basic setup of the VKW challenge and the experimental results of some participating teams.
Taxonomy
Topics: Music and Audio Processing · Video Analysis and Summarization · Speech Recognition and Synthesis
