NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023
Lin Sui, Fangzhou Mu, Yin Li

TL;DR
This report presents the 2nd-place solution to the Ego4D Moment Queries Challenge 2023. It extends ActionFormer with an improved ground-truth assignment strategy during training and a refined SoftNMS at inference, reaching 26.62% average mAP on the test set.
Contribution
The authors extend ActionFormer, a method for temporal action localization, with an improved ground-truth assignment strategy at training time and a refined version of SoftNMS at inference time, applied to egocentric moment query detection.
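For readers unfamiliar with SoftNMS on temporal segments, the following is a minimal sketch of the generic Gaussian-decay variant applied to 1D (start, end) proposals. It is not the authors' refined version, which this summary does not describe. The function names, the `sigma` and `min_score` defaults, and the `iou_threshold` parameter (below which no decay is applied) are all assumptions made for illustration.

```python
import math


def tiou(a, b):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms_1d(segments, scores, sigma=0.5, iou_threshold=0.0, min_score=1e-3):
    """Generic Gaussian SoftNMS over temporal segments (illustrative only).

    Instead of discarding overlapping segments outright, their scores are
    decayed by exp(-tIoU^2 / sigma). Decay is applied only when the overlap
    exceeds `iou_threshold` (a hypothetical knob, not taken from the paper).
    Returns the kept (segment, score) pairs in selection order.
    """
    remaining = list(zip(segments, scores))
    kept = []
    while remaining:
        # Select the highest-scoring remaining segment.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        seg, score = remaining.pop(best)
        kept.append((seg, score))

        # Decay the scores of segments that overlap the selected one.
        decayed = []
        for other_seg, other_score in remaining:
            overlap = tiou(seg, other_seg)
            if overlap > iou_threshold:
                other_score *= math.exp(-(overlap ** 2) / sigma)
            # Drop segments whose score has fallen below the floor.
            if other_score >= min_score:
                decayed.append((other_seg, other_score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    proposals = [(0.0, 10.0), (1.0, 11.0), (20.0, 30.0)]
    confidences = [0.9, 0.8, 0.7]
    for seg, score in soft_nms_1d(proposals, confidences):
        print(seg, round(score, 4))
```

In this sketch, the heavily overlapping second proposal is kept but ranked below the non-overlapping third one, which is the behavior that distinguishes SoftNMS from hard NMS. Raising `iou_threshold` leaves more overlapping proposals undecayed.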
Findings
Achieved 26.62% average mAP on the test set.
Ranked 2nd in the Ego4D Moment Queries Challenge 2023.
Significantly outperformed the baseline from the challenge.
Abstract
This report describes our submission to the Ego4D Moment Queries Challenge 2023. Our submission extends ActionFormer, a recent method for temporal action localization. Our extension combines an improved ground-truth assignment strategy during training and a refined version of SoftNMS at inference time. Our solution is ranked 2nd on the public leaderboard with 26.62% average mAP and 45.69% Recall@1x at tIoU=0.5 on the test set, significantly outperforming the strong baseline from the 2023 challenge. Our code is available at https://github.com/happyharrycn/actionformer_release.
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Video Analysis and Summarization
