Generative AI Framework for 3D Object Generation in Augmented Reality
Majid Behravan

TL;DR
This thesis introduces a framework that combines generative AI and augmented reality so that users can create 3D objects in real time from inputs such as images and speech, broadening accessibility and application scope.
Contribution
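It presents an integrated system that combines generative AI models (Shap-E for 3D generation, with Vision Language Models and Large Language Models for input understanding) and multimodal inputs for real-time 3D object generation in AR, making the technology more accessible and versatile.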
Findings
Effective conversion of images and speech into 3D models
Enhanced accuracy in object detection and model generation
Successful real-time integration of 3D objects into AR environments
Abstract
This thesis presents a framework that integrates state-of-the-art generative AI models for real-time creation of three-dimensional (3D) objects in augmented reality (AR) environments. The primary goal is to convert diverse inputs, such as images and speech, into accurate 3D models, enhancing user interaction and immersion. Key components include advanced object detection algorithms, user-friendly interaction techniques, and robust AI models like Shap-E for 3D generation. Leveraging Vision Language Models (VLMs) and Large Language Models (LLMs), the system captures spatial details from images and processes textual information to generate comprehensive 3D objects, seamlessly integrating virtual objects into real-world environments. The framework demonstrates applications across industries such as gaming, education, retail, and interior design. It allows players to create personalized…
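The flow described in the abstract (image and speech inputs reduced to text, then passed to a text-to-3D generator) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: every name here (`Mesh3D`, `build_prompt`, `generate_object`, `placeholder_text_to_3d`) is hypothetical, the image caption and speech transcript are assumed to have already been produced by a VLM and a speech recognizer, and the placeholder generator merely stands in for a model such as Shap-E.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Mesh3D:
    """Hypothetical container for a generated object."""
    prompt: str
    vertices: List[Tuple[float, float, float]]


def build_prompt(image_caption: Optional[str], transcript: Optional[str]) -> str:
    """Merge the text derived from each input modality into one prompt.

    Assumption: a VLM has already turned the image into a caption and a
    speech recognizer has already turned the audio into a transcript.
    """
    parts = [p.strip() for p in (image_caption, transcript) if p and p.strip()]
    if not parts:
        raise ValueError("at least one of image_caption or transcript is required")
    return ", ".join(parts)


def generate_object(
    image_caption: Optional[str],
    transcript: Optional[str],
    text_to_3d: Callable[[str], Mesh3D],
) -> Mesh3D:
    """Run the (hypothetical) pipeline: inputs -> prompt -> 3D generator."""
    prompt = build_prompt(image_caption, transcript)
    return text_to_3d(prompt)


def placeholder_text_to_3d(prompt: str) -> Mesh3D:
    """Stand-in for a real text-to-3D model; returns a single triangle."""
    return Mesh3D(prompt=prompt, vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])


if __name__ == "__main__":
    mesh = generate_object("a red wooden chair", "make it taller", placeholder_text_to_3d)
    print(mesh.prompt, len(mesh.vertices))
```

Passing the generator in as a callable keeps the sketch independent of any particular model, so a real text-to-3D backend could be substituted for the placeholder without changing the rest of the flow.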
Taxonomy
Topics: 3D Modeling in Geospatial Applications · Augmented Reality Applications · Image Processing and 3D Reconstruction
