Generative AI Framework for 3D Object Generation in Augmented Reality

Majid Behravan

arXiv:2502.15869 · cs.GR · February 25, 2025

Open Access

TL;DR

This thesis introduces a framework that combines generative AI with augmented reality to create 3D objects in real time from inputs such as images and speech, making 3D content creation accessible to more users and applications.

Contribution

It presents an integrated system that uses generative models such as Shap-E, together with Vision Language Models and Large Language Models, to turn multimodal inputs into 3D objects in AR in real time.

Findings

01. Effective conversion of images and speech into 3D models
02. Enhanced accuracy in object detection and model generation
03. Successful real-time integration of 3D objects into AR environments

Abstract

This thesis presents a framework that integrates state-of-the-art generative AI models for real-time creation of three-dimensional (3D) objects in augmented reality (AR) environments. The primary goal is to convert diverse inputs, such as images and speech, into accurate 3D models, enhancing user interaction and immersion. Key components include advanced object detection algorithms, user-friendly interaction techniques, and robust AI models like Shap-E for 3D generation. Leveraging Vision Language Models (VLMs) and Large Language Models (LLMs), the system captures spatial details from images and processes textual information to generate comprehensive 3D objects, seamlessly integrating virtual objects into real-world environments. The framework demonstrates applications across industries such as gaming, education, retail, and interior design. It allows players to create personalized…
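The flow the abstract describes (image or speech in, text description in the middle, 3D object out) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the thesis's implementation: every name here (`ImageInput`, `SpeechInput`, `Mesh3D`, `to_3d`, and the `stub_*` functions) is hypothetical, and the stubs stand in for a VLM, an LLM, and a text-to-3D model such as Shap-E without calling any of them.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class ImageInput:
    path: str  # hypothetical: path to a captured camera frame


@dataclass
class SpeechInput:
    transcript: str  # hypothetical: text already produced by a speech recognizer


@dataclass
class Mesh3D:
    prompt: str        # text description the object was generated from
    vertex_count: int  # placeholder for real geometry


# Each stage is a plain callable so a real model could be swapped in later.
Describe = Callable[[ImageInput], str]  # stands in for a VLM
Refine = Callable[[str], str]           # stands in for an LLM
Generate = Callable[[str], Mesh3D]      # stands in for a text-to-3D model


def to_3d(inp: Union[ImageInput, SpeechInput],
          describe: Describe, refine: Refine, generate: Generate) -> Mesh3D:
    """Turn either input type into a text prompt, then pass it to the generator."""
    if isinstance(inp, ImageInput):
        raw = describe(inp)
    else:
        raw = inp.transcript
    return generate(refine(raw))


# Stub stages: assumptions for illustration only, no model is called.
def stub_describe(img: ImageInput) -> str:
    return f"object seen in {img.path}"


def stub_refine(text: str) -> str:
    # Lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def stub_generate(prompt: str) -> Mesh3D:
    return Mesh3D(prompt=prompt, vertex_count=0)
```

Keeping the stages as interchangeable callables means the routing logic can be exercised with stubs, as in `to_3d(SpeechInput("a red chair"), stub_describe, stub_refine, stub_generate)`, before any real model is wired in.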

Taxonomy

Topics: 3D Modeling in Geospatial Applications · Augmented Reality Applications · Image Processing and 3D Reconstruction