A Machine Learning Approach to Classifying Construction Cost Documents into the International Construction Measurement Standard
J. Ignacio Deza, Hisham Ihshaish, Lamine Mahdjoubi

TL;DR
This paper presents automated machine learning models for classifying construction cost documents into the International Construction Measurement Standard (ICMS), reporting F1 scores above 90% with neural network architectures trained on a large UK dataset.
Contribution
It introduces the first automated approach for classifying Bill of Quantities (BoQ) descriptions into ICMS categories, and compares different language representations and neural network models.
Findings
Models achieved F1 scores above 90%.
Simpler models such as the temporal convolutional network (TCN) perform comparably to more complex recurrent models.
Local key features in the text are crucial for classification.
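The last two findings are easier to picture with the building block that generic TCNs are usually assembled from: a causal, dilated 1D convolution, in which each output depends only on a small window of the current and earlier positions. The sketch below is a minimal pure-Python illustration and not the authors' implementation; the function names `causal_dilated_conv1d` and `receptive_field`, the toy input, and the all-ones weights are assumptions made for the example.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal dilated 1D convolution over a list of floats.

    The output at position t is sum_k weights[k] * x[t - k * dilation];
    positions before the start of the sequence are treated as zero.
    """
    out = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that fall before the sequence start
                total += w * x[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input positions one output can see after stacking layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy sequence standing in for one feature channel of a token sequence.
x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Two stacked layers with kernel size 2 and dilations 1 and 2.
y1 = causal_dilated_conv1d(x, [1.0, 1.0], dilation=1)
y2 = causal_dilated_conv1d(y1, [1.0, 1.0], dilation=2)
rf = receptive_field(2, [1, 2])

print(y1)  # [1.0, 3.0, 5.0, 7.0, 9.0]
print(y2)  # [1.0, 3.0, 6.0, 10.0, 14.0]
print(rf)  # 4
```

With two layers, the final output at the last position sums only the four most recent inputs. For short, technical item descriptions, a window of this kind may already cover the key terms, which fits the finding that local features carry most of the signal.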
Abstract
We introduce the first automated models for classifying natural language descriptions provided in cost documents called "Bills of Quantities" (BoQs), popular in the infrastructure construction industry, into the International Construction Measurement Standard (ICMS). The models we deployed and systematically evaluated for multi-class text classification are learnt from a dataset of more than 50 thousand descriptions of items retrieved from 24 large infrastructure construction projects across the United Kingdom. We describe our approach to language representation and subsequent modelling to examine the strength of contextual semantics and temporal dependency of language used in construction project documentation. To do that, we evaluate two experimental pipelines for inferring ICMS codes from text, on the basis of two different language representation models and a range of state-of-the-art…
Taxonomy
Topics: Infrastructure Maintenance and Monitoring · Occupational Health and Safety Research · Linguistics and Terminology Studies
