Geometric Algebra Meets Large Language Models: Instruction-Based Transformations of Separate Meshes in 3D, Interactive and Controllable Scenes
Prodromos Kolyvakis, Manos Kamarianakis, George Papagiannakis

TL;DR
This paper presents Shenlong, a system integrating Large Language Models with Conformal Geometric Algebra to enable precise, natural language-driven 3D scene editing, significantly improving efficiency and success rates over traditional methods.
Contribution
The novel integration of LLMs with CGA for real-time, instruction-based 3D object transformations without specialized training.
Findings
Shenlong reduces LLM response times by 16% compared with traditional methods.
Achieves a 9.6% higher success rate on average than baselines.
Attains a 100% success rate on common practical queries.
Abstract
This paper introduces a novel integration of Large Language Models (LLMs) with Conformal Geometric Algebra (CGA) to revolutionize controllable 3D scene editing, particularly for object repositioning tasks, which traditionally require intricate manual processes and specialized expertise. Conventional methods typically either rely on large training datasets or lack a formalized language for precise edits. Using CGA as a robust formal language, our system, Shenlong, precisely models the spatial transformations necessary for accurate object repositioning. Leveraging the zero-shot learning capabilities of pre-trained LLMs, Shenlong translates natural language instructions into CGA operations, which are then applied to the scene, enabling exact spatial transformations within 3D scenes without the need for specialized pre-training. Implemented in a realistic simulation…
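To make "CGA operations applied to the scene" concrete, the following is a minimal sketch of how a repositioning instruction such as "rotate the object 90° about the z-axis, then lift it by 2 units" could be expressed in CGA and applied to a point. It is not Shenlong's implementation: the tiny algebra below, and every name in it (`gp`, `up`, `down`, `translator`, `rotor_z`, `apply`), are assumptions written for illustration, and the LLM translation step is not shown. It uses the standard conformal embedding X = x + ½x²e∞ + e₀ and the sandwich product X' = M X M̃.

```python
import math

# Basis vectors as bitmasks: e1, e2, e3, e+ (squares to +1), e- (squares to -1).
E1, E2, E3, EP, EM = 1, 2, 4, 8, 16
METRIC = [1.0, 1.0, 1.0, 1.0, -1.0]

def blade_product(a, b):
    """Geometric product of two basis blades (bitmasks): (blade, sign)."""
    sign, t = 1.0, a >> 1
    while t:  # count the swaps needed to bring the product into canonical order
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for i, m in enumerate(METRIC):  # contract repeated basis vectors
        if (a & b) >> i & 1:
            sign *= m
    return a ^ b, sign

def gp(A, B):
    """Geometric product of multivectors stored as {blade: coefficient}."""
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            blade, s = blade_product(a, b)
            out[blade] = out.get(blade, 0.0) + s * x * y
    return out

def reverse(A):
    """Reversion: a grade-k blade picks up the sign (-1)^(k(k-1)/2)."""
    out = {}
    for blade, coeff in A.items():
        k = bin(blade).count("1")
        out[blade] = -coeff if (k * (k - 1) // 2) % 2 else coeff
    return out

def up(x, y, z):
    """Embed a Euclidean point: X = x + (1/2)|x|^2 e_inf + e_0,
    with e_inf = e- + e+ and e_0 = (e- - e+)/2."""
    h = 0.5 * (x * x + y * y + z * z)
    return {E1: x, E2: y, E3: z, EP: h - 0.5, EM: h + 0.5}

def down(X):
    """Recover Euclidean coordinates, dividing by the e_0 weight."""
    w = X.get(EM, 0.0) - X.get(EP, 0.0)
    return tuple(X.get(b, 0.0) / w for b in (E1, E2, E3))

def translator(tx, ty, tz):
    """T = 1 - (1/2) t e_inf."""
    T = {0: 1.0}
    t_einf = gp({E1: tx, E2: ty, E3: tz}, {EP: 1.0, EM: 1.0})
    for blade, coeff in t_einf.items():
        T[blade] = T.get(blade, 0.0) - 0.5 * coeff
    return T

def rotor_z(theta):
    """R = cos(theta/2) - sin(theta/2) e1e2: rotation by theta from e1 towards e2."""
    return {0: math.cos(theta / 2), E1 | E2: -math.sin(theta / 2)}

def apply(M, X):
    """Sandwich product X' = M X ~M."""
    return gp(gp(M, X), reverse(M))

# "Rotate 90 degrees about z, then lift by 2": the motor M = T R applies R first.
motor = gp(translator(0.0, 0.0, 2.0), rotor_z(math.pi / 2))
moved = down(apply(motor, up(1.0, 0.0, 0.0)))
print(moved)  # approximately (0, 1, 2), up to floating-point rounding
```

Because rotations and translations are both versors in CGA, composing them is a single geometric product, which is one reason a compact formal language of this kind is a plausible target for LLM output.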
Taxonomy
Topics: Model-Driven Software Engineering Techniques
