Box2Flow: Instance-based Action Flow Graphs from Videos
Jiatong Li, Kalliopi Basioti, Vladimir Pavlovic

TL;DR
Box2Flow is a method that extracts detailed, instance-based step flow graphs from individual procedural videos, capturing task structure more precisely than prior task-based approaches, whose graphs tend to be too abstract.
Contribution
Box2Flow introduces an instance-based approach that learns a rich flow graph from a single video, improving detail and accuracy over prior methods that learn one graph for all videos of a task.
Findings
Effective extraction of flow graphs from videos
Outperforms existing methods on MM-ReS and YouCookII datasets
Captures detailed step relationships and variations
Abstract
A large number of procedural videos on the web show how to complete various tasks. These tasks can often be accomplished in different ways and step orderings, with some steps able to be performed simultaneously, while others are constrained to be completed in a specific order. Flow graphs can be used to illustrate the step relationships of a task. Current task-based methods try to learn a single flow graph for all available videos of a specific task. The extracted flow graphs tend to be too abstract, failing to capture detailed step descriptions. In this work, our aim is to learn accurate and rich flow graphs by extracting them from a single video. We propose Box2Flow, an instance-based method to predict a step flow graph from a given procedural video. In detail, we extract bounding boxes from videos, predict pairwise edge probabilities between step pairs, and build the flow graph with…
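To make the idea of a step flow graph concrete, the following is a minimal sketch, not the paper's implementation. It assumes pairwise edge probabilities are already available (the values and step names below are invented for illustration) and uses a simple probability threshold as a stand-in for the graph-building step, which is truncated in the abstract above. It then lists the step orderings that the resulting graph allows, showing how some steps are interchangeable while others are constrained.

```python
from itertools import permutations

# Hypothetical steps and pairwise probabilities, invented for illustration only.
steps = ["boil water", "chop vegetables", "add vegetables", "serve"]

# edge_prob[(a, b)]: assumed probability that step a must precede step b.
edge_prob = {
    ("boil water", "add vegetables"): 0.92,
    ("chop vegetables", "add vegetables"): 0.88,
    ("add vegetables", "serve"): 0.95,
    ("boil water", "chop vegetables"): 0.31,
    ("chop vegetables", "boil water"): 0.27,
}


def build_edges(edge_prob, threshold=0.5):
    """Keep step pairs whose probability exceeds the threshold.

    This thresholding is an assumed stand-in, not the graph-building
    procedure used by Box2Flow.
    """
    return {pair for pair, p in edge_prob.items() if p > threshold}


def is_valid_order(order, edges):
    """Return True if every edge (a, b) has a placed before b in the order."""
    position = {step: i for i, step in enumerate(order)}
    return all(position[a] < position[b] for a, b in edges)


def valid_orders(steps, edges):
    """Enumerate all step orderings consistent with the flow graph (brute force)."""
    return [order for order in permutations(steps) if is_valid_order(order, edges)]


edges = build_edges(edge_prob)
orders = valid_orders(steps, edges)
for order in orders:
    print(" -> ".join(order))
```

In this toy graph, "boil water" and "chop vegetables" have no edge between them, so they may be done in either order, while "add vegetables" and "serve" are fixed behind them. The brute-force enumeration is only practical for a handful of steps.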
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
