Action parsing using context features

Nagita Mehrseresht

arXiv:2205.10008·cs.CV·May 23, 2022

TL;DR

This paper introduces an action parsing algorithm that uses context features and a dynamic programming search to temporally segment a video into its actions, and reports improved segmentation accuracy on the Breakfast dataset.

Contribution

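An action parsing method that scores each candidate segment using local features from that segment together with context features from other candidate segments, and uses dynamic programming to find the best temporal segmentation of the video sequence.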

Findings

01 Improved segmentation accuracy compared to existing state-of-the-art parsing techniques

02 Context features, particularly temporal information about other actions in the sequence, are valuable for action segmentation

03 Results are reported on the Breakfast activity dataset

Abstract

We propose an action parsing algorithm to parse a video sequence containing an unknown number of actions into its action segments. We argue that context information, particularly the temporal information about other actions in the video sequence, is valuable for action segmentation. The proposed parsing algorithm temporally segments the video sequence into action segments. The optimal temporal segmentation is found using a dynamic programming search algorithm that optimizes the overall classification confidence score. The classification score of each segment is determined using local features calculated from that segment as well as context features calculated from other candidate action segments of the sequence. Experimental results on the Breakfast activity dataset show improved segmentation accuracy compared to existing state-of-the-art parsing techniques.
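To make the dynamic programming search concrete, here is a minimal sketch of a generic segmental search that picks the boundaries maximizing the summed per-segment confidence. It is not the paper's implementation. The names `parse_sequence`, `segment_score`, `max_len`, and `make_toy_scorer`, and the per-segment penalty of 0.5, are all assumptions introduced for illustration. The toy scorer uses only local per-frame scores; the context features described in the abstract are not modeled here.

```python
from typing import Callable, Dict, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end, label), half-open frame range


def parse_sequence(
    n_frames: int,
    segment_score: Callable[[int, int], Tuple[float, str]],
    max_len: int,
) -> Tuple[float, List[Segment]]:
    """Split frames [0, n_frames) into contiguous segments with maximal total score.

    segment_score(start, end) is a hypothetical scorer returning
    (confidence, label) for the half-open segment [start, end).
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n_frames + 1)  # best[t]: best total score for frames [0, t)
    back: List[Optional[Tuple[int, str]]] = [None] * (n_frames + 1)
    best[0] = 0.0
    for end in range(1, n_frames + 1):
        # Try every allowed start for the last segment ending at `end`.
        for start in range(max(0, end - max_len), end):
            if best[start] == neg_inf:
                continue
            conf, label = segment_score(start, end)
            total = best[start] + conf
            if total > best[end]:
                best[end] = total
                back[end] = (start, label)
    # Follow the back-pointers from the last frame to recover the segments.
    segments: List[Segment] = []
    t = n_frames
    while t > 0:
        start, label = back[t]
        segments.append((start, t, label))
        t = start
    segments.reverse()
    return best[n_frames], segments


def make_toy_scorer(frame_scores: List[Dict[str, float]]):
    """Toy scorer (assumption): sum per-frame label scores, minus a fixed penalty."""
    labels = list(frame_scores[0].keys())

    def score(start: int, end: int) -> Tuple[float, str]:
        totals = {lab: sum(f[lab] for f in frame_scores[start:end]) for lab in labels}
        label = max(totals, key=totals.get)
        # The 0.5 penalty per segment discourages over-segmentation.
        return totals[label] - 0.5, label

    return score


# Six frames: the first three favor "pour", the last three favor "stir".
frames = [{"pour": 1.0, "stir": 0.0}] * 3 + [{"pour": 0.0, "stir": 1.0}] * 3
total, segments = parse_sequence(len(frames), make_toy_scorer(frames), max_len=6)
print(total, segments)
```

The search costs one scorer call per (start, end) pair within `max_len`, so bounding the segment length keeps it roughly linear in the number of frames. Adding context features would mean letting `segment_score` also look at other candidate segments, which this sketch leaves out.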
