ObjectMix: Data Augmentation by Copy-Pasting Objects in Videos for Action Recognition
Jun Kimata, Tomoya Nitta, Toru Tamaki

TL;DR
ObjectMix is a data augmentation technique for action recognition that uses instance segmentation to extract object regions from different videos and combine them into new videos, improving performance on the UCF101 and HMDB51 datasets.
Contribution
The paper introduces ObjectMix, a data augmentation method tailored for action recognition that uses instance segmentation to extract objects from different videos and combine them into augmented videos.
Findings
Outperforms VideoMix in experiments
Effective on UCF101 and HMDB51 datasets
Enhances action recognition accuracy
Abstract
In this paper, we propose a data augmentation method for action recognition using instance segmentation. Although many data augmentation methods have been proposed for image recognition, few of them are tailored for action recognition. Our proposed method, ObjectMix, extracts each object region from two videos using instance segmentation and combines them to create new videos. Experiments on two action recognition datasets, UCF101 and HMDB51, demonstrate the effectiveness of the proposed method and show its superiority over VideoMix, a prior work.
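The copy-paste idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the instance segmentation masks have already been computed by some external model, it pastes objects in one direction only (from video A onto video B), and it mixes the labels by the fraction of pasted pixels, which is an assumption borrowed from CutMix-style augmentation rather than something the abstract states. The function name `objectmix_sketch` and all parameter names are hypothetical.

```python
import numpy as np


def objectmix_sketch(video_a, mask_a, video_b, label_a, label_b):
    """Paste the object pixels of video_a onto video_b (illustrative only).

    video_a, video_b: float arrays of shape (T, H, W, C)
    mask_a: boolean array of shape (T, H, W), True where an object was
            segmented in video_a (assumed to come from an external
            instance segmentation model)
    label_a, label_b: one-hot label vectors of equal length
    """
    if video_a.shape != video_b.shape:
        raise ValueError("both videos must have the same shape")
    if mask_a.shape != video_a.shape[:3]:
        raise ValueError("mask must have shape (T, H, W)")

    # Broadcast the mask over the channel axis: (T, H, W) -> (T, H, W, 1).
    mask = mask_a[..., None]

    # Object pixels come from video A, everything else from video B.
    mixed_video = np.where(mask, video_a, video_b)

    # Assumption: weight the labels by the fraction of pixels taken from A.
    lam = float(mask_a.mean())
    mixed_label = lam * np.asarray(label_a) + (1.0 - lam) * np.asarray(label_b)
    return mixed_video, mixed_label, lam
```

In a training pipeline, a function like this would typically be applied to pairs of clips sampled from a mini-batch, with the mixed label used as a soft target for the loss.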
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Video Analysis and Summarization
