Enhancing Building Semantics Preservation in AI Model Training with Large Language Model Encodings
Suhyung Jang, Ghang Lee, Jaekun Lee, and Hyunjun Lee

TL;DR
This paper proposes training AI models with large language model (LLM) embeddings as encodings, in place of conventional encodings such as one-hot, so that fine distinctions among building object subtypes are preserved. In the reported experiments on BIM data, this yields higher classification F1-scores than one-hot encoding.
Contribution
The study shows that LLM-based embeddings outperform one-hot encodings when classifying building object subtypes, and that reduced-dimension (compacted) embeddings retain strong classification performance.
Findings
LLM embeddings achieved higher F1-scores than one-hot encoding.
Compacted LLaMA-3 embeddings maintained strong classification performance.
The approach helps AI models capture the finer distinctions among closely related building subtypes.
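The findings above are stated in terms of F1-scores. As a reminder of what that metric measures in a multi-class setting, here is a minimal sketch of a macro-averaged F1 computation. The averaging scheme is an assumption (this summary does not say which one the paper uses), and the subtype labels and predictions are hypothetical, not data from the study.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["wall", "wall", "slab", "slab", "door"]
y_pred = ["wall", "slab", "slab", "slab", "door"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.4f}")
```

Macro averaging weights every subtype equally, so rare subtypes count as much as common ones. A weighted or micro average would give different numbers on imbalanced data.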
Abstract
Accurate representation of building semantics, encompassing both generic object types and specific subtypes, is essential for effective AI model training in the architecture, engineering, construction, and operation (AECO) industry. Conventional encoding methods (e.g., one-hot) often fail to convey the nuanced relationships among closely related subtypes, limiting AI's semantic comprehension. To address this limitation, this study proposes a novel training approach that employs large language model (LLM) embeddings (e.g., OpenAI GPT and Meta LLaMA) as encodings to preserve finer distinctions in building semantics. We evaluated the proposed method by training GraphSAGE models to classify 42 building object subtypes across five high-rise residential building information models (BIMs). Various embedding dimensions were tested, including original high-dimensional LLM embeddings (1,536,…
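The abstract's point that one-hot encodings cannot convey relationships among closely related subtypes can be illustrated with a small sketch. Any two distinct one-hot vectors are orthogonal, so their cosine similarity is always zero, whereas dense embeddings can place related subtypes closer together. The subtype names and the three-dimensional dense vectors below are hand-made stand-ins chosen for illustration. They are not real LLM embeddings and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def pairwise(codes):
    """Cosine similarity for every unordered pair of labels."""
    names = list(codes)
    return {
        (a, b): cosine(codes[a], codes[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical subtype labels (not the paper's 42 classes).
labels = ["interior wall", "exterior wall", "floor slab"]
one_hot_codes = {name: one_hot(i, len(labels)) for i, name in enumerate(labels)}

# Hand-made stand-ins for LLM embeddings (assumption, for illustration only).
dense_codes = {
    "interior wall": [0.9, 0.4, 0.1],
    "exterior wall": [0.8, 0.5, 0.2],
    "floor slab": [0.1, 0.3, 0.9],
}

one_hot_sims = pairwise(one_hot_codes)
dense_sims = pairwise(dense_codes)

for pair in one_hot_sims:
    print(pair, f"one-hot={one_hot_sims[pair]:.2f}", f"dense={dense_sims[pair]:.2f}")
```

With one-hot codes, every pair is equally dissimilar, so a model receives no signal that two wall subtypes are more alike than a wall and a slab. With the dense stand-ins, that ordering is visible directly in the similarity values.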
Taxonomy
Topics: BIM and Construction Integration · Advanced Graph Neural Networks · Advanced Neural Network Applications
