Winning the CVPR'2022 AQTC Challenge: A Two-stage Function-centric Approach
Shiwei Wu, Weidong He, Tong Xu, Hao Wang, Enhong Chen

TL;DR
This paper presents a two-stage function-centric approach to the novel AQTC task, in which an AI assistant learns from instructional videos and scripts to guide users step by step. The approach won the CVPR 2022 AQTC Challenge.
Contribution
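It introduces a two-stage function-centric framework for AQTC: a Question2Function module grounds the question in the related function, and a Function2Answer module predicts the action from the historical steps. The framework yields significant gains over the given baselines.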
Findings
Achieved the winning result at the CVPR 2022 AQTC Challenge.
Demonstrated effectiveness of the two-stage approach.
Significant performance gains over baseline models.
Abstract
Affordance-centric Question-driven Task Completion for Egocentric Assistant (AQTC) is a novel task in which an AI assistant learns from instructional videos and scripts and guides the user step by step. In this paper, we address AQTC with a two-stage function-centric approach, consisting of a Question2Function module that grounds the question in the related function and a Function2Answer module that predicts the action based on the historical steps. We evaluated several possible solutions for each module and obtained significant gains over the given baselines. Our code is available at https://github.com/starsholic/LOVEU-CVPR22-AQTC.
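To make the two-stage flow in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the bag-of-words encoder, the cosine scoring, the function names `question2function` and `function2answer`, and the example data are all assumptions chosen for illustration, standing in for whatever learned models the paper actually evaluates.

```python
import math
from collections import Counter


def bow(text):
    """Hypothetical stand-in for a learned encoder: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def question2function(question, functions):
    """Stage 1 (assumed form): pick the function whose script text best matches the question."""
    q = bow(question)
    return max(functions, key=lambda name: cosine(q, bow(functions[name])))


def function2answer(function_text, history, candidates):
    """Stage 2 (assumed form): score candidate actions against the grounded
    function text plus the steps already taken, and return the best index."""
    context = bow(function_text + " " + " ".join(history))
    scores = [cosine(context, bow(c)) for c in candidates]
    return scores.index(max(scores))


# Made-up example data, for illustration only.
functions = {
    "timer": "press the timer button then turn the knob to set the cooking time",
    "defrost": "press the defrost button and enter the weight of the food",
}
question = "how do I set the cooking time"
history = ["press the timer button"]
candidates = ["turn the knob", "enter the weight", "open the door"]

grounded = question2function(question, functions)
next_action = function2answer(functions[grounded], history, candidates)
print(grounded, candidates[next_action])
```

The point of the split is that the second stage only sees the script text of the grounded function, so an error in grounding propagates to the action prediction; this is why each stage can be evaluated and swapped out separately.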
Taxonomy
Topics: Multimodal Machine Learning Applications · Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning
