ArK: Augmented Reality with Knowledge Interactive Emergent Ability
Qiuyuan Huang, Jae Sung Park, Abhinav Gupta, Paul Bennett, Ran Gong, Subhojit Som, Baolin Peng, Owais Khan Mohammed, Chris Pal, Yejin Choi, Jianfeng Gao

TL;DR
This paper introduces ArK, a novel approach that leverages foundation models to transfer knowledge for high-quality scene generation in unseen environments, reducing the need for extensive data collection.
Contribution
We propose ArK, an innovative framework that enables knowledge transfer from foundation models to improve scene understanding and generation in new domains.
Findings
Significantly improved scene quality over baselines
Effective transfer of knowledge memory from foundation models
Enhanced scene editing capabilities in virtual and physical environments
Abstract
Despite the growing adoption of mixed reality and interactive AI agents, it remains challenging for these systems to generate high-quality 2D/3D scenes in unseen environments. The common practice requires deploying an AI agent to collect large amounts of data for model training for every new task. This process is costly, or even impossible, for many domains. In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e.g., GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in the physical or virtual world. The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK), which leverages knowledge memory to generate scenes in unseen physical-world and virtual-reality environments. The knowledge interactive emergent ability (Figure 1) is…
Taxonomy
Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Augmented Reality Applications
