Towards Active Vision for Action Localization with Reactive Control and Predictive Learning
Shubham Trehan, Sathyanarayanan N. Aakur

TL;DR
This paper introduces an active vision system that localizes actions while controlling a camera, combining predictive learning with reactive control. It needs no training data or reward signals and generalizes across tasks and environments.
Contribution
It proposes a novel energy-based mechanism combining predictive learning and reactive control for active action localization without training data or rewards.
Findings
Outperforms unsupervised baselines in action localization tasks
Generalizes across different tasks and environments in streaming settings
Achieves competitive performance without explicit training or reward signals
Abstract
Visual event perception tasks such as action localization have primarily focused on supervised learning settings under a static observer, i.e., the camera is static and cannot be controlled by an algorithm. Such approaches are often restricted by the quality, quantity, and diversity of annotated training data and often do not generalize to out-of-domain samples. In this work, we tackle the problem of active action localization, where the goal is to localize an action while controlling the geometric and physical parameters of an active camera to keep the action in the field of view, without training data. We formulate an energy-based mechanism that combines predictive learning and reactive control to perform active action localization without rewards, which can be sparse or non-existent in real-world environments. We perform extensive experiments in both simulated and real-world environments…
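The abstract does not spell out the energy-based mechanism, so the following is only a minimal illustrative sketch of the general idea of reactive camera control, not the authors' method. It assumes a hypothetical 2D grid of non-negative "energy" values (for example, per-cell prediction errors), takes the energy-weighted centroid as the action location, and issues a proportional pan/tilt command that nudges that location toward the center of the field of view. The function names, the `gain` parameter, and the sign conventions are all assumptions made for illustration.

```python
def energy_centroid(energy):
    """Energy-weighted centroid (row, col) of a 2D grid of non-negative values.

    `energy` is a hypothetical map, e.g. per-cell prediction error.
    """
    total = sum(sum(row) for row in energy)
    if total == 0:
        return None  # nothing stands out, so no action is localized
    r = sum(i * v for i, row in enumerate(energy) for v in row) / total
    c = sum(j * v for row in energy for j, v in enumerate(row)) / total
    return r, c


def reactive_command(energy, gain=0.5):
    """Proportional (pan, tilt) command that moves the centroid toward the grid center.

    Assumed convention: positive pan means the target is right of center,
    positive tilt means the target is below center.
    """
    centroid = energy_centroid(energy)
    if centroid is None:
        return 0.0, 0.0  # hold the camera still
    rows, cols = len(energy), len(energy[0])
    center_r, center_c = (rows - 1) / 2, (cols - 1) / 2
    # Offsets are normalized by the half-extent of the grid, giving values in [-1, 1].
    pan = gain * (centroid[1] - center_c) / max(center_c, 1)
    tilt = gain * (centroid[0] - center_r) / max(center_r, 1)
    return pan, tilt


# Usage: energy concentrated in the top-right cell of a 3x3 grid.
pan, tilt = reactive_command([[0, 0, 1], [0, 0, 0], [0, 0, 0]])
```

No reward signal or training step appears in this loop: the command is a direct function of the current energy map, which is the sense in which such control is reactive.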
Videos
Towards Active Vision for Action Localization with Reactive Control and Predictive Learning · YouTube
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Advanced Optical Sensing Technologies
