EPD: Long-term Memory Extraction, Context-awared Planning and Multi-iteration Decision @ EgoPlan Challenge ICML 2024
Letian Shi, Qi Lv, Xiang Deng, Liqiang Nie

TL;DR
This paper introduces EPD, a novel framework for egocentric task planning that combines long-term memory extraction, context-aware planning, and multi-iteration decision-making, achieving over 53% accuracy on the ICML 2024 EgoPlan challenge.
Contribution
The paper presents a new three-stage planning framework that effectively integrates memory extraction, contextual planning, and iterative decision-making for egocentric tasks.
Findings
Achieved 53.85% planning accuracy on the EgoPlan-Test set.
Effectively summarizes long videos into relevant memory information.
Demonstrates improved planning performance over baseline methods.
Abstract
In this technical report, we present our solution for the EgoPlan Challenge in ICML 2024. To address the real-world egocentric task planning problem, we introduce a novel planning framework which comprises three stages: long-term memory Extraction, context-awared Planning, and multi-iteration Decision, named EPD. Given the task goal, task progress, and current observation, the extraction model first extracts task-relevant memory information from the progress video, transforming the complex long video into summarized memory information. The planning model then combines the context of the memory information with fine-grained visual information from the current observation to predict the next action. Finally, through multi-iteration decision-making, the decision model comprehensively understands the task situation and current state to make the most realistic planning decision. On the…
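The three stages described in the abstract can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: the names `PlanningQuery`, `epd_pipeline`, `extract`, and `plan` are hypothetical, the models are passed in as plain callables, the list of candidate actions is an assumed input, and the majority vote over repeated planner calls is only one plausible reading of "multi-iteration decision".

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class PlanningQuery:
    """Inputs named in the abstract, plus an assumed list of candidate actions."""
    task_goal: str
    progress_video: Sequence[str]      # stand-in for frames of the progress video
    current_observation: str           # stand-in for the current frame
    candidate_actions: Sequence[str]   # assumption: planning is a choice among options


def epd_pipeline(
    query: PlanningQuery,
    extract: Callable[[str, Sequence[str]], str],
    plan: Callable[[str, str, str, Sequence[str], int], str],
    num_iterations: int = 3,
) -> str:
    # Stage 1 (Extraction): compress the long progress video into
    # task-relevant memory text.
    memory = extract(query.task_goal, query.progress_video)

    # Stage 2 (Planning): predict the next action from the memory context
    # and the current observation. It is called once per iteration here.
    votes = Counter()
    for iteration in range(num_iterations):
        action = plan(
            query.task_goal,
            memory,
            query.current_observation,
            query.candidate_actions,
            iteration,
        )
        votes[action] += 1

    # Stage 3 (Decision): aggregate the per-iteration predictions.
    # Majority voting is an assumption made for this sketch.
    return votes.most_common(1)[0][0]


# Stub models so the sketch runs without any real video or language model.
def stub_extract(goal: str, video: Sequence[str]) -> str:
    return "; ".join(video)


def stub_plan(goal, memory, observation, candidates, iteration) -> str:
    # Deterministic stand-in: alternates between the first two candidates.
    return candidates[iteration % 2]


query = PlanningQuery(
    task_goal="make tea",
    progress_video=["fill kettle", "boil water"],
    current_observation="kettle on counter, cup empty",
    candidate_actions=["pour water into cup", "open fridge"],
)
chosen = epd_pipeline(query, stub_extract, stub_plan, num_iterations=3)
```

With the stub planner, the first candidate is returned in two of the three iterations, so `chosen` is `"pour water into cup"`. In a real system the callables would wrap the extraction and planning models, and the aggregation rule would follow whatever the report actually specifies.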
Taxonomy
Topics: AI-based Problem Solving and Planning
