Sidekick Policy Learning for Active Visual Exploration
Santhosh K. Ramakrishnan, Kristen Grauman

TL;DR
This paper introduces sidekick policy learning for active visual exploration, enabling agents to efficiently reconstruct environments with limited views by leveraging preparatory learning phases and novel visualization techniques.
Contribution
It proposes a new sidekick policy learning framework that improves exploration efficiency and convergence, supported by a novel visualization method.
Findings
Sidekick learning improves exploration performance.
Method accelerates convergence compared to existing approaches.
Effective in 360 scenes and 3D object exploration tasks.
Abstract
We consider an active visual exploration scenario, where an agent must intelligently select its camera motions to efficiently reconstruct the full environment from only a limited set of narrow field-of-view glimpses. While the agent has full observability of the environment during training, it has only partial observability once deployed, being constrained by what portions it has seen and what camera motions are permissible. We introduce sidekick policy learning to capitalize on this imbalance of observability. The main idea is a preparatory learning phase that attempts simplified versions of the eventual exploration task, then guides the agent via reward shaping or initial policy supervision. To support interpretation of the resulting policies, we also develop a novel policy visualization technique. Results on active visual exploration tasks with 360 scenes and 3D objects show that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsReinforcement Learning in Robotics · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
