Curriculum-Based Strategies for Efficient Cross-Domain Action Recognition
Emily Kim, Allen Wu, Jessica Hodgins

TL;DR
This paper demonstrates that curriculum-based training strategies can significantly improve the training efficiency of cross-domain human action recognition models, especially when transferring from ground-view to aerial-view data, without using any real aerial data during training.
Contribution
It introduces two curriculum learning methods for cross-view action recognition that improve training efficiency while maintaining high accuracy, using synthetic aerial-view data and real ground-view data.
Findings
Combining synthetic aerial and real ground data outperforms single-domain training.
Curriculum strategies reduce training iterations by up to 37% for CNN and 30% for Transformer models.
Accuracy stays within 3% of that obtained by simply combining the datasets.
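The two efficiency figures above can be read as simple ratios. The sketch below is only an illustration of that arithmetic: the function names and the example numbers are hypothetical, and it assumes that "within 3%" refers to a difference in accuracy expressed in percentage points, which the summary does not state explicitly.

```python
def iteration_reduction(baseline_iters: int, curriculum_iters: int) -> float:
    """Percent fewer training iterations relative to the baseline run."""
    return 100.0 * (baseline_iters - curriculum_iters) / baseline_iters


def accuracy_gap_points(baseline_acc: float, curriculum_acc: float) -> float:
    """Absolute accuracy difference in percentage points (assumed reading)."""
    return abs(baseline_acc - curriculum_acc)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the calculation.
    print(iteration_reduction(1000, 630))
    print(accuracy_gap_points(70.0, 68.0))
```

Under this reading, a curriculum run that needs 630 iterations where the combined-dataset baseline needs 1000 corresponds to a 37% reduction.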
Abstract
Despite significant progress in human action recognition, generalizing to diverse viewpoints remains a challenge. Most existing datasets are captured from ground-level perspectives, and models trained on them often struggle to transfer to drastically different domains such as aerial views. This paper examines how curriculum-based training strategies can improve generalization to unseen real aerial-view data without using any real aerial data during training. We explore curriculum learning for cross-view action recognition using two out-of-domain sources: synthetic aerial-view data and real ground-view data. Our evaluation of training order (fine-tuning on synthetic aerial data vs. real ground data) shows that fine-tuning on real ground data […] The two curriculum strategies differ in how they transition from synthetic to real. The first uses a two-stage curriculum with direct fine-tuning, while…
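A two-stage curriculum of the kind the abstract mentions can be sketched as an ordered list of training stages. This is a minimal sketch and not the authors' implementation: `Stage`, `two_stage_curriculum`, `run_curriculum`, the epoch counts, and the `train_step` callback are all hypothetical names introduced for illustration. It assumes the first stage trains on synthetic aerial data and the second fine-tunes directly on real ground data; the second strategy is truncated in the abstract, so it is not sketched here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class Stage:
    """One curriculum stage: a named data source and how long to train on it."""
    name: str
    data: Sequence[Any]
    epochs: int


def two_stage_curriculum(synthetic_aerial: Sequence[Any],
                         real_ground: Sequence[Any],
                         pretrain_epochs: int = 2,
                         finetune_epochs: int = 1) -> List[Stage]:
    """Assumed ordering: synthetic aerial first, then real ground fine-tuning."""
    return [
        Stage("synthetic_aerial", synthetic_aerial, pretrain_epochs),
        Stage("real_ground", real_ground, finetune_epochs),
    ]


def run_curriculum(stages: List[Stage],
                   train_step: Callable[[Any], None]) -> List[Tuple[str, int]]:
    """Run the stages in order and return a log of (stage name, epoch index)."""
    log = []
    for stage in stages:
        for epoch in range(stage.epochs):
            for sample in stage.data:
                train_step(sample)  # placeholder for one optimizer update
            log.append((stage.name, epoch))
    return log


if __name__ == "__main__":
    seen: List[str] = []
    stages = two_stage_curriculum(["syn_0", "syn_1"], ["real_0"])
    print(run_curriculum(stages, seen.append))
    print(seen)
```

In a real setup `train_step` would wrap a forward and backward pass of the CNN or Transformer model; here it only records the order in which samples are visited, which is the part the curriculum controls.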
Taxonomy
Topics: Human Pose and Action Recognition · Context-Aware Activity Recognition Systems · Hand Gesture Recognition Systems
