Team PKU-WICT-MIPL PIC Makeup Temporal Video Grounding Challenge 2022 Technical Report
Minghang Zheng, Dejie Yang, Zhongjie Ye, Ting Lei, Yuxin Peng, Yang, Liu

TL;DR
This technical report presents a solution for the MTVG challenge that localizes makeup steps in videos using phrase relationship mining and dynamic programming, achieving high accuracy and ranking second.
Contribution
The paper introduces a novel phrase relationship mining framework and a dynamic programming approach for non-overlapping temporal localization in makeup videos.
Findings
Achieved second place in the MTVG challenge with a 0.55% gap from the top.
Demonstrated the effectiveness of phrase relationship mining for temporal localization.
Validated the proposed methods through competitive experimental results.
Abstract
In this technical report, we briefly introduce the solutions of our team `PKU-WICT-MIPL' for the PIC Makeup Temporal Video Grounding (MTVG) Challenge in ACM-MM 2022. Given an untrimmed makeup video and a step query, the MTVG aims to localize a temporal moment of the target makeup step in the video. To tackle this task, we propose a phrase relationship mining framework to exploit the temporal localization relationship relevant to the fine-grained phrase and the whole sentence. Besides, we propose to constrain the localization results of different step sentence queries to not overlap with each other through a dynamic programming algorithm. The experimental results demonstrate the effectiveness of our method. Our final submission ranked 2nd on the leaderboard, with only a 0.55\% gap from the first.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsVideo Analysis and Summarization · Multimodal Machine Learning Applications · Human Pose and Action Recognition
