Automatic reconstruction of fully volumetric 3D building models from point clouds
Sebastian Ochmann, Richard Vock, Reinhard Klein

TL;DR
This paper introduces an automatic method for reconstructing detailed volumetric 3D building models from unstructured indoor point clouds using integer linear programming, enabling precise and consistent models without prior segmentation.
Contribution
It presents a fully automatic approach that combines room segmentation, outlier removal, and an integer linear optimization to produce accurate volumetric building models from raw point cloud data.
Findings
Successfully reconstructs complex building models from real-world data
Enforces volumetric, interconnected wall structures for realistic models
Uses exact integer linear programming for optimal solutions
Table 1. Comparison of key features of different approaches.
| Method | No scan positions | Non-Manhattan | Multiple rooms | Full 3D recons.¹ | Slanted ceilings | Volumetric walls |
| Budroni ’10 [8] | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Adán ’11 [9] | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Xiong ’13 [10] | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Mura ’14 [11] | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Oesau ’14 [2] | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| Previtali ’14 [12] | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
| Turner ’15 [3] | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Mura ’16 [4] | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| Ochmann ’16 [5] | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ |
| Ambruş ’17 [13] | ✓² | ✓ | ✓ | ✗ | ✗ | ✗ |
| Macher ’17 [14] | ✓ | ✓ | ✓ | ✗ | ✗ | ✓³ |
| Murali ’17 [6] | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ |
| Wang ’17 [15] | ✗⁴ | ✓ | ✓ | ✗ | ✗ | ✗ |
| Ours | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ |
| | Dataset 1 | Dataset 2 | Dataset 3 |
| Input | | | |
| #scans / #points / #points cleaned | 12 / 3168600 / 2702813 | 21 / 5151388 / 4723219 | 29 / 7688111 / 5874557 |
| #Entities | | | |
| Room labels / Surfaces¹ / Walls¹ | 27 / 42+5 / 37+5 | 30 / 34+7 / 28+5 | 39 / 51+5 / 39+4 |
| Cells / Variables / Constraints | 17666 / 594748 / 1775298 | 12749 / 459373 / 1334699 | 17334 / 781794 / 2261980 |
| Nonzeros | 4174634 | 3134503 | 5315578 |
| Runtime (seconds) | | | |
| Plane detection | 18.2 | 20.1 | 73.9 |
| Cleaning (3 iterations) | 14.1 | 21.9 | 32.7 |
| Auto labeling | 6.3 | 7.2 | 11.4 |
| Arrangement + Priors | 14.6 | 11.5 | 15.5 |
| Optimization | 20.9 | 7.4 | 9.8 |
Automatic reconstruction of fully volumetric 3D building models from point clouds
Sebastian Ochmann
Richard Vock
Reinhard Klein
University of Bonn, Institute of Computer Science II, Endenicher Allee 19a, 53115 Bonn, Germany
Abstract
We present a novel method for reconstructing parametric, volumetric, multi-story building models from unstructured, unfiltered indoor point clouds by means of solving an integer linear optimization problem. Our approach overcomes limitations of previous methods in several ways: First, we drop assumptions about the input data such as the availability of separate scans as an initial room segmentation. Instead, a fully automatic room segmentation and outlier removal is performed on the unstructured point clouds. Second, restricting the solution space of our optimization approach to arrangements of volumetric wall entities representing the structure of a building enforces a consistent model of volumetric, interconnected walls fitted to the observed data instead of unconnected, paper-thin surfaces. Third, we formulate the optimization as an integer linear programming problem which allows for an exact solution instead of the approximations achieved with most previous techniques. Lastly, our optimization approach is designed to incorporate hard constraints which were difficult or even impossible to integrate before. We evaluate and demonstrate the capabilities of our proposed approach on a variety of complex real-world point clouds.
Keywords: Indoor Building Reconstruction, Point Cloud Processing, Integer Linear Programming, Building Information Modeling
Journal: ISPRS Journal of Photogrammetry and Remote Sensing
1 Introduction
The challenging problem of generating high-quality, three-dimensional building models from point cloud scans has been approached in a variety of ways in recent years by the computer graphics and remote sensing communities as well as in the architecture domain. Especially for various applications in Computer Aided Design (CAD) and emerging fields such as Building Information Modeling (BIM), the reconstructed models are usually required to adhere to industry-standard specifications such as the Industry Foundation Classes (IFC). In contrast to the representation of a building in the form of e.g. an unordered point cloud, a set of unconnected surfaces, or boundary meshes, a BIM/IFC model closely resembles the physical building structure by defining buildings as semantically annotated, volumetric building entities such as walls and floor slabs, usually including additional information regarding how these elements are interconnected.
Most previous approaches focus on the reconstruction of completely separate, planar surfaces without additional information regarding how they relate to each other [1], or on representing buildings as watertight boundaries of either the whole building [2] or separate rooms [3, 4], and thus provide little insight into the building structure. Also, assumptions such as the global separability of stories by horizontal planes are very limiting in practice. None of these approaches yields a representation which enables unhindered usage in the aforementioned scenario. While one recent approach [5] does model the measured point cloud data using volumetric building entities, the method is restricted to single-story buildings, which limits its usability without laborious manual separation of the point cloud data into separate stories. Additionally, the resulting wall and floor slab elements are generated in a post-processing step that is not integrated into the optimization framework, which may lead to locally implausible results. Other methods [6, 7] aiming at reconstructing true BIM models make the severe assumption that walls are positioned in a Manhattan world constellation, which is often violated by real-world buildings.
Our proposed method overcomes limitations of previous approaches by alleviating the requirements on the input data and by providing a flexible optimization framework for indoor building reconstruction. Some prior methods (e.g. [4, 5]) require separate scans and scan positions to derive an initial, coarse segmentation into rooms. In contrast, our fully automatic room segmentation approach, by design, does not depend on the availability of such information and does not impose particular rules for scanning (e.g. one scan per room). Furthermore, our novel integer linear programming formulation for the reconstruction problem provides flexible means to steer the reconstruction process while globally constraining the solution space to feasible solutions, thus guaranteeing a plausible model. Additional information such as manually augmented hints may optionally be incorporated by means of hard constraints in order to further guide the reconstruction process. While some previous approaches regularize the resulting model based on room boundary complexity, they fail to account for dependencies between surfaces related by volumetric wall elements, e.g. opposing surfaces between neighboring rooms. Our formulation of the solution space based on volumetric entities enables better regularization of the model with respect to the actual volumetric walls and slabs used to represent the building. In contrast to any previous approach, the result of our optimization immediately yields the complete geometry of all reconstructed walls and slabs as well as their volumetric intersections which allows for a direct generation of plausible BIM/IFC models.
In summary, the main features of our approach are:
1. Fully automatic, volumetric reconstruction including volumetric intersections between elements.
2. Flexible integration of constraints to enforce global and local properties of the resulting model.
Our main technical contributions are:
1. Automatic filtering of outliers and room segmentation of unstructured, multi-story 3D point clouds.
2. A new formulation of the indoor reconstruction task as an integer linear programming problem that can be efficiently solved using off-the-shelf software.
2 Related Work
Research on scan-to-BIM and related approaches has led to a wide range of developments in recent years and remains an active topic of ongoing work. We first provide a comprehensive overview of methods dealing specifically with indoor building reconstruction, which we then complement with a summary of more loosely related but complementary abstraction approaches and applications.
2.1 Indoor building reconstruction
The works presented in this section are closely related to our goal of indoor building reconstruction. Table 1 summarizes and compares key features of different approaches.
Some methods aim at the generation of 2D floor plans. Okorn et al. [16] model 2D floor plans by projecting detected structures into the horizontal plane and performing wall segment detection based on the Hough transform. Ambruş et al. [13] reconstruct floor plans including a room labeling obtained using an energy minimization approach. A deep neural architecture for automatic floor plan generation from RGBD video has been presented by Liu et al. [7]. Using pixel-wise predictions of floor plan geometry and semantics, integer programming [17] is used to recover a vector graphics reconstruction.
Some approaches perform a reconstruction of individual rooms. Budroni et al. [8] reconstruct closed boundary representations of single rooms using plane sweep surface detection and a 2D line arrangement with a split-and-merge approach. The methods by Adán et al. [9] and Xiong et al. [10] focus on recovering detailed surface labelings, explicitly reasoning about occlusions using a ray-tracing approach. In a similar spirit, Previtali et al. [12] perform a reconstruction of single rooms as polyhedral models including ray-tracing based reasoning about occlusions and opening detection.
Certain methods aim at the reconstruction of the building as a whole without explicitly considering room topology or segmentation. Sanchez et al. [1] represent buildings as polygonal surface models including detection of smaller-scale structures such as parametric staircases. Oesau et al. [18] use a 2D cell decomposition to perform binary inside/outside labeling using a Graph-Cut based optimization. The detail level of this approach is enhanced by Oesau et al. in [2] by means of an improved line detection strategy. With a similar goal of providing simplified environment maps for e.g. navigation, Xiao et al. [19] employ constructive solid geometry (CSG) operations to generate a volumetric wall model. Room topology is not explicitly modeled.
Many recent methods approach the reconstruction problem in a 2.5D setting, including a segmentation into separate rooms. Mura et al. [20, 11] model buildings as 2.5D polyhedral meshes by means of constructing a 2D line arrangement and performing $k$-medoid clustering based on diffusion embeddings. Mura et al. [21] also propose a related approach which allows arbitrary wall orientations and performs recursive clustering on a constrained Delaunay tetrahedralization. The method by Turner et al. [22] provides efficient means to generate 2.5D, textured meshes for e.g. navigation purposes including a room segmentation obtained by Graph-Cut in a triangulated environment map. An extension providing enhanced texture mapping has been presented in [3]. The reconstruction method by Wang et al. [15] models outer and inner walls by means of 2D line arrangements labeled using diffusion embeddings similar to [11]. They also reconstruct doors using a simulated ray casting approach. Murali et al. [6] present a system to quickly generate BIM models from mobile devices such as Google Project Tango, Microsoft Kinect or Microsoft HoloLens, including semantic annotations and relations between reconstructed elements. The approach is currently limited to single-story, Manhattan world buildings.
Few approaches consider the more general case of slanted walls or ceilings. Mura et al. [4] reconstruct polyhedral room boundaries with arbitrary wall and ceiling orientations. Early rule-based classification of detected elements helps pruning invalid parts. Room segmentation is performed by clustering steered by visible surface overlap. Mura et al. [23] propose an extension using automatically clustered synthetic viewpoints and show applicability on complex multi-story buildings.
None of the aforementioned methods reconstruct volumetric wall and slab elements which are directly usable in a BIM setting. Few methods have approached this problem before. Stambler et al. [24] aim to generate volumetric 3D building models using learning approaches for the classification and scoring of detected elements, and simulated annealing for optimizing the overall model. The approach makes strong assumptions about the input data, requiring both interior and exterior scans, as well as scanner positions. Thomson et al. [25] generate volumetric walls from point clouds by detecting planes using a RANSAC approach and fitting suitable IFC wall entities to the detected surfaces; room volumes and topology are not explicitly modeled. They also propose a point cloud segmentation scheme based on a corresponding IFC model. A method which explicitly represents buildings as interconnected volumetric wall elements has been presented by Ochmann et al. [5]. They construct a 2D line arrangement of wall center lines representing pairs of opposing wall surfaces and perform a room labeling of the arrangement faces by means of a Graph-Cut based multi-label energy minimization. Multi-story buildings are not supported. Macher et al. [14] propose a semi-automatic reconstruction approach by first segmenting the input data automatically and exporting the result in an interim OBJ format, and subsequently constructing an IFC file with manual intervention in a post-processing step.
To our knowledge, our approach is the first to combine general multi-story, multi-room reconstruction with fully volumetric room and wall entities.
2.2 Abstraction, segmentation, and reconstruction
We now highlight some loosely related approaches which pursue more general or complementary goals and which may be beneficial for tackling the reconstruction problem on different levels. Monszpart et al. [26] represent man-made scenes (e.g. buildings) by a regular arrangement of planes, taking into account non-local inter-primitive symmetry relations. Such a regularization may be useful for various arrangement-based reconstruction approaches. A method for reconstructing lightweight, manifold, polygonal boundary models from point clouds has been presented by Nan et al. [27]. They employ an inside/outside labeling approach using binary linear programming. Jung et al. [28] generate watertight floor maps by means of skeletonization in a 2D binary occupancy map with subsequent labeling of separate rooms. A 3D room partitioning approach using anisotropic potential fields with subsequent unsupervised clustering has been presented by Bobkov et al. [29]. Pursuing a similar goal, Ochmann et al. [30] perform a segmentation of indoor point clouds into separate rooms using a visibility-based approach. Openings between neighboring rooms are detected to obtain a room connectivity graph. Bassier et al. [31] employ a machine learning approach to classify structural elements such as walls, floors, ceilings, and beams in point cloud data. The method by Liu et al. [17] generates topologically and geometrically consistent floor plans from 2D raster images using an integer programming approach. While the approach is designed to work on 2D data and assumes Manhattan world geometry, the idea to enforce global properties of the resulting model using integer programming is related to our work. Focusing on non-structural elements relevant to BIM models, Adán et al. [32] present an approach for detecting various important entities such as sockets, switches, signs, and safety-related items. While the method by Son et al. [33] does not explicitly model a building’s room topology, they detect various important volumetric elements such as walls, slabs, columns and beams, also taking into account material properties and relations between elements.
2.3 Applications
Automated scan-to-BIM methods facilitate a range of diverse applications in different areas such as construction surveillance, facility management, or energy simulations. Garwood et al. [34] propose a framework for storing building geometry in a format suitable for e.g. energy simulation and verification tasks, and highlight the importance of fast, automated methods for obtaining suitable models. Hyland et al. [35] propose the usage of open standards and automatically derived BIM models from measurements for performing automated compliance control by comparing the as-built and as-designed states of buildings. In a similar spirit, O’Keeffe et al. [36] have developed validation approaches for determining and analyzing differences between scans and BIM models. A prototypical approach has been presented by Brodie et al. [37] who propose a cloud-based platform integrating tools for generating models from and validating models against point clouds. Krispel et al. [38] developed a method for automatic detection of power sockets and for the generation of hypotheses for electrical lines based on automatically generated building models. An approach for integrating IFC BIM models and point cloud data in a common file format has been presented by Krijnen et al. [39]. They highlight the semantically meaningful association of both worlds for documentation, structuring, annotation, synchronization and retrieval tasks.
3 Overview
The input of our approach is a 3D indoor point cloud (Figure 1 a) with oriented normals whose “up” direction is assumed to be the $z$-axis. If normals are not yet available, they are estimated by local Principal Component Analysis (PCA).
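The PCA-based normal estimation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `estimate_normal` is hypothetical, the neighborhood search is assumed to have been done already, and the sign of the normal is not oriented (orientation, e.g. towards the scanner, would be a separate step). The normal is taken as the eigenvector of the smallest eigenvalue of the local covariance matrix, found here by power iteration on a shifted matrix to keep the sketch dependency-free.

```python
import math

def estimate_normal(neighbors):
    """Estimate an (unoriented) unit normal for a local neighborhood of 3D points.

    Assumes at least three non-collinear points; degenerate inputs are not handled.
    """
    n = len(neighbors)
    mean = [sum(p[k] for p in neighbors) / n for k in range(3)]
    # 3x3 covariance matrix of the centered neighborhood.
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in neighbors) / n
            for j in range(3)] for i in range(3)]
    # The normal is the eigenvector of cov's smallest eigenvalue. The matrix
    # (trace * I - cov) has the same eigenvectors, and its largest eigenvalue
    # corresponds to cov's smallest one, so plain power iteration finds it.
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    shifted = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)]
               for i in range(3)]
    v = [1.0, 1.0, 1.0]
    for _ in range(200):
        w = [sum(shifted[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v
```

For points sampled from a horizontal floor patch, the returned vector is parallel to the $z$-axis. In practice a linear algebra library would replace the hand-written eigenvector computation.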
We first detect planes using an efficient RANSAC implementation [40] (Figure 1 b) and compute occupancy bitmaps for each detected plane from the respective supporting points.
The detected planes are used to automatically eliminate outlier points, and to determine point clusters corresponding to individual rooms. This clustering is performed by means of Markov Clustering [41] which does not require prior information about the number of rooms and results in a labeling of the point cloud (Figure 1 c).
The resulting point labels are projected to the previously detected planes and discretized into multi-label bitmaps. Planes are pruned, rectified, clustered, and classified as candidates for vertical wall or horizontal slab surfaces (Figure 1 d; only vertical surfaces shown for visualization purposes). Since we base our reconstruction on volumetric walls and slabs instead of single surfaces, pairs of nearby, approximately parallel surfaces are grouped into wall and slab candidates.
Based on promising previous approaches (e.g. [4, 23, 5, 13]), we then derive a three-dimensional arrangement of planes from the set of wall and slab candidates (Figure 1 e). To this end, all surfaces are interpreted as infinite planes and intersected with each other which results in a segmentation of 3D space into convex polyhedral cells. In particular, each wall and slab candidate is represented by a set of cells located between the respective two candidate surfaces. Priors for the existence of different rooms and wall surfaces are estimated for each 3D cell and 2D face using the labeled surface candidates.
The main step of our approach is to find a labeling of all cells such that each cell is either assigned to a room, or outside space. Additionally, volumetric walls must be placed wherever a transition between inside and outside space takes place which is also modeled as part of the labeling problem. The labeling should faithfully conform to the measured data and simultaneously fulfill certain constraints (e.g. wall connectivity) to ensure a plausible resulting model (Figure 1 f).
Formulating this task as an optimization problem requires three parts: First, we define a space of possible solutions with meaningful priors to guide the solver. The geometry of this space is given by the arrangement of planes. Priors for locations of rooms and walls in the cell complex are derived from the measured data. Second, we need to define constraints to restrict the feasibility of a solution. They enforce that any solution satisfies predefined rules, e.g. a room and outside space must be separated by a wall. Third, an objective function for assessing the quality of a solution is formulated as a cost function which is minimized under the given constraints.
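To make the three parts (solution space, constraints, objective) concrete, the following toy sketch labels a one-dimensional row of cells. Everything in it is an assumption for illustration: the two room labels, the prior values, the wall cost, and the brute-force enumeration, which stands in for an integer linear programming solver and only works for a handful of cells. The actual variables and constraints of the method are those of Section 4.8.

```python
from itertools import product

ROOMS = ("r1", "r2")
OUTSIDE = "o"
LABELS = ROOMS + (OUTSIDE,)

def feasible(labeling):
    """Hard constraint: two different rooms may never touch directly."""
    return all(a == b or OUTSIDE in (a, b)
               for a, b in zip(labeling, labeling[1:]))

def wall_cells(labeling):
    """Outside cells adjacent to a room cell must carry a wall."""
    walls = []
    for i, label in enumerate(labeling):
        if label != OUTSIDE:
            continue
        neighbors = labeling[max(i - 1, 0):i] + labeling[i + 1:i + 2]
        if any(n != OUTSIDE for n in neighbors):
            walls.append(i)
    return walls

def cost(labeling, priors, wall_cost=0.05):
    """Objective: disagreement with the priors plus a small cost per wall cell."""
    data_term = sum(1.0 - priors[i][label] for i, label in enumerate(labeling))
    return data_term + wall_cost * len(wall_cells(labeling))

def solve(priors):
    """Enumerate all feasible labelings and return the cheapest one."""
    candidates = (l for l in product(LABELS, repeat=len(priors)) if feasible(l))
    return min(candidates, key=lambda l: cost(l, priors))

# Hypothetical priors for five cells in a row: two rooms with an ambiguous cell between.
example_priors = [
    {"r1": 0.9, "r2": 0.0, "o": 0.1},
    {"r1": 0.9, "r2": 0.0, "o": 0.1},
    {"r1": 0.3, "r2": 0.3, "o": 0.4},
    {"r1": 0.0, "r2": 0.9, "o": 0.1},
    {"r1": 0.0, "r2": 0.9, "o": 0.1},
]
best = solve(example_priors)
```

With these priors the cheapest feasible labeling keeps the two rooms apart by labeling the middle cell as outside space, which then has to be filled by a wall. The constraint removes the otherwise tempting solutions in which the two rooms are directly adjacent.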
After a solution is found, it can easily be converted into a format suitable for rendering or exporting, e.g. an IFC file or a mesh, by considering the cell labeling and the boundaries between differently labeled cells.
4 Method
In this section, we provide details regarding each of the steps involved in our approach with a focus on the formulation as an optimization problem.
4.1 Plane detection
Based on the widely used assumption that the coarse geometry of most buildings can be represented (or sufficiently approximated) by piecewise planar surfaces, a crucial first step of our approach is the detection of planes in the point cloud data. To this end, an efficient RANSAC approach [40] implemented in CGAL [42] is used. The most important parameters are maximum point-to-plane distance, normal angle threshold, minimum number of supporting points per plane, and the probability to miss the largest plane candidate. These can usually be chosen depending on point cloud data quality and used for a wide variety of datasets with similar characteristics (e.g. scanner type, density, noise level). The supporting points of each plane are projected into occupancy bitmaps on the respective plane (Figure 2 b), yielding a discretized approximation of support by measured points. Planes with low support area (estimated using the occupancy bitmaps) are pruned later (Section 4.4). Since the relatively coarse occupancy bitmaps are independent of the point cloud density, the minimum number of points for detecting a plane of the RANSAC algorithm may be set relatively low to cope with lower-resolution point clouds.
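The projection of supporting points into an occupancy bitmap can be sketched as below. This is a simplified illustration under stated assumptions: the plane is given by a point `origin` and a unit `normal`, the bitmap is stored as a sparse set of occupied pixel indices, and the helper names and `pixel_size` parameter are hypothetical. Because the area estimate counts pixels rather than points, it does not depend on the point density.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_basis(normal):
    """Return two orthonormal vectors spanning the plane with the given unit normal."""
    # Use the coordinate axis least aligned with the normal as a helper vector.
    helper = (1.0, 0.0, 0.0) if abs(normal[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = cross(normal, helper)
    length = math.sqrt(sum(x * x for x in u))
    u = tuple(x / length for x in u)
    v = cross(normal, u)
    return u, v

def occupancy_bitmap(points, origin, normal, pixel_size):
    """Project points onto the plane and return the set of occupied pixels."""
    u, v = plane_basis(normal)
    occupied = set()
    for p in points:
        d = [p[k] - origin[k] for k in range(3)]
        a = sum(d[k] * u[k] for k in range(3))
        b = sum(d[k] * v[k] for k in range(3))
        occupied.add((math.floor(a / pixel_size), math.floor(b / pixel_size)))
    return occupied

def support_area(occupied, pixel_size):
    """Density-independent estimate of the supported area."""
    return len(occupied) * pixel_size ** 2
```

Two nearby points fall into the same pixel and contribute only once to the support area, so a densely scanned small surface does not outweigh a sparsely scanned large one.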
4.2 Point cloud cleaning
Real-world point clouds often contain large amounts of outlier points, typically due to outside areas scanned through openings. In order to prune outlier points early in the process, we employ a simple but very effective ray casting approach similar to [5]. From each point $p$, stochastically sampled rays $r_i$, $i = 1, \dots, n$, are cast into the hemisphere oriented into the direction of the normal at point $p$. Ray casting is performed against the occupancy bitmaps of the previously detected primitives. Let $h(r_i)$ be a hit function which is $1$ if some surface was hit, and $0$ otherwise. We approximate the probability that $p$ lies inside of the building as $P(p) = \frac{1}{n} \sum_{i=1}^{n} h(r_i)$. If $P(p)$ is below a given threshold (in our experiments …), $p$ is removed from the point cloud and the occupancy bitmaps of the planes are updated. This process is iterated a small number of times.
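The cleaning step can be sketched as follows: the inside probability of a point is the fraction of hemisphere rays that hit some surface. The sketch makes several assumptions. The ray caster is passed in as a hypothetical callable `hits_surface(point, direction)`, since ray casting against occupancy bitmaps is not shown here. The number of rays and the threshold are free parameters. A single pass is shown, without the bitmap update and repeated iterations described above.

```python
import math
import random

def sample_hemisphere(normal, rng):
    """Draw a random unit direction in the hemisphere around `normal`."""
    while True:
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(x * x for x in d))
        if length < 1e-12:
            continue
        d = [x / length for x in d]
        # Mirror directions that point away from the normal into the hemisphere.
        if sum(a * b for a, b in zip(d, normal)) < 0.0:
            d = [-x for x in d]
        return d

def inside_probability(point, normal, hits_surface, n_rays=64, seed=0):
    """Fraction of sampled rays that hit a surface (mean of the hit function)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_rays)
               if hits_surface(point, sample_hemisphere(normal, rng)))
    return hits / n_rays

def clean(points, normals, hits_surface, threshold):
    """Keep only points whose inside probability reaches the threshold."""
    return [p for p, n in zip(points, normals)
            if inside_probability(p, n, hits_surface) >= threshold]
```

A point seen through a window typically has many rays escaping into open space, so its estimate is low and it is discarded, while a point on an interior wall sees surfaces in almost every direction.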
4.3 Point cloud labeling
Priors for the locations of rooms and outside area in three-dimensional space are vital for the later optimization step, even if they are coarse estimations. We formulate the estimation of priors as a point cloud labeling problem where each label represents either a room, or the outside area.
Our proposed automatic labeling approach is based on the idea that regions of the point cloud with high mutual visibility form clusters which correspond to rooms of the building. We implement this by performing visibility tests by means of ray casting between point patches on detected surfaces which yields a visibility graph. Nodes of this graph are then clustered by means of the Markov Clustering algorithm [41] which determines natural clusters within the graph by flow simulation.
Point patches are constructed by generating coarse occupancy bitmaps for each plane and considering each occupied pixel as a patch with a normal identical to the respective plane normal. In our experiments, a patch size of … was used. We use patches instead of all points to drastically reduce the number of nodes in the visibility graph, which makes the computation feasible. Let $P_i$ be the $i$-th patch with center position $c_i$ and normal $n_i$. For each pair $P_i$, $P_j$, $i \neq j$, ray casting between the points $c_i + \varepsilon n_i$ and $c_j + \varepsilon n_j$, with $\varepsilon = $ … in our experiments, is performed. If no surface is hit, the visibility between $P_i$, $P_j$ is set to $1$, otherwise it is set to $0$. This yields a visibility graph whose nodes are clustered using the Markov Clustering algorithm. The computed visibility is interpreted as flow between node pairs corresponding to the respective point patches. The main advantage of this method is that it is unsupervised and thus does not require a manual specification of the number of occurring labels.
As a result, we obtain disjoint clusters of patches which belong to different rooms and define the set of room labels $R$ which will be used throughout the remainder of the reconstruction process. Each point of the point cloud is assigned the room label of the respective point patch. Note that the number of room labels may be larger than the number of rooms that will actually be contained in the final reconstruction.
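The clustering of the visibility graph can be illustrated with a compact version of Markov Clustering, which alternates expansion (matrix squaring, i.e. flow spreading) and inflation (element-wise powering, i.e. strengthening strong flow). This is a didactic sketch, not the implementation used in [41]: the dense list-of-lists matrix, the inflation value, the fixed iteration count and the way clusters are read off are all assumptions, and a practical version would use sparse matrices and pruning.

```python
def markov_clustering(adjacency, inflation=2.0, iterations=50, eps=1e-6):
    """Cluster graph nodes given a symmetric adjacency matrix (list of lists)."""
    n = len(adjacency)
    # Add self loops so that every column carries nonzero mass.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]

    def normalize(mat):
        # Make every column sum to one (column-stochastic matrix).
        for j in range(n):
            s = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= s
        return mat

    m = normalize(m)
    for _ in range(iterations):
        # Expansion: squaring the matrix lets flow spread along two-step walks.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: boost strong flows, damp weak ones, then renormalize.
        m = normalize([[x ** inflation for x in row] for row in m])
    # Rows that still carry flow are attractors; the columns they attract
    # form one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > eps)
                for i in range(n) if max(m[i]) > eps}
    return sorted(sorted(c) for c in clusters)

# Two groups of mutually visible patches with no visibility between the groups.
visibility = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
rooms = markov_clustering(visibility)
```

The number of clusters is an output of the algorithm rather than an input, which matches the motivation given above. Weak visibility through a door opening would add a few cross-group edges, and the inflation parameter then controls how readily such weakly connected groups are split.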
4.4 Surface candidates
The detected planes usually include many surfaces which are not part of walls, floors and ceilings. Even correctly detected surfaces will generally not be perfectly vertical or horizontal. We thus apply a pruning, classification and rectification step to extract two sets of candidates for wall and slab surfaces. The occupancy bitmaps are used to estimate the support area of each surface independently of point cloud density. Planes with support below an area threshold as well as planes which are not approximately vertical or horizontal are discarded. The remaining planes are classified as wall or slab surface candidates depending on their normal direction, and adjusted to be perfectly horizontal or vertical.
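The classification and rectification of a single plane normal can be sketched as follows. The angle tolerance of 5 degrees and the function name are assumptions chosen for illustration, and only the normal is adjusted here. A full implementation would also have to recompute the plane offset from the supporting points after changing the normal.

```python
import math

def classify_and_rectify(normal, max_angle_deg=5.0):
    """Return ("wall" | "slab", rectified unit normal), or None if neither fits."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / length, ny / length, nz / length
    tol = math.sin(math.radians(max_angle_deg))
    horizontal_part = math.hypot(nx, ny)
    if horizontal_part <= tol:
        # Normal almost parallel to the z-axis: horizontal slab surface.
        return "slab", (0.0, 0.0, math.copysign(1.0, nz))
    if abs(nz) <= tol:
        # Normal almost orthogonal to the z-axis: vertical wall surface.
        return "wall", (nx / horizontal_part, ny / horizontal_part, 0.0)
    # Slanted plane: discarded as a candidate.
    return None
```

For example, a slightly tilted floor normal such as `(0.02, 0.0, 1.0)` is snapped to `(0.0, 0.0, 1.0)`, whereas a plane tilted by 45 degrees is rejected.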
As a prerequisite for later room prior estimation (Section 4.7), each surface is also assigned a multi-label support bitmap with continuous values in $[0, 1]$ for each room label in $R$ (Figure 2 d). This provides a soft-assignment of different regions of each surface to different room labels. The label bitmaps are generated by projecting all supporting points onto the respective surface and averaging the previously determined point labels within each pixel.
Furthermore, we dilate the support bitmaps. The rationale is that reconstructed walls with no surface support by the point cloud data are penalized by a cost function defined later in Section 4.8. Since we reconstruct wall intersections volumetrically, placing the respective wall entities in between rooms would cause high costs since surface support is naturally restricted to regions that are visible to the scanner (Figure 3, middle row). By slightly extending the surface support, we encourage construction of intersecting wall entities in regions with nearby surface support (Figure 3, bottom row).
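The dilation of a support bitmap can be sketched as a maximum filter over a small square window. The window shape, the `radius` parameter and the dense list-of-lists representation are assumptions; the text above does not specify the structuring element or the amount of dilation.

```python
def dilate(bitmap, radius=1):
    """Grayscale dilation: each pixel takes the maximum over its square window."""
    rows, cols = len(bitmap), len(bitmap[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [bitmap[rr][cc]
                      for rr in range(max(r - radius, 0), min(r + radius + 1, rows))
                      for cc in range(max(c - radius, 0), min(c + radius + 1, cols))]
            out[r][c] = max(window)
    return out
```

Applied to a row `[0.0, 0.0, 0.8, 0.0, 0.0]`, the support spreads to the two neighboring pixels, so a wall cell just beyond the visible surface still receives some support.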
4.5 Wall and slab candidates
Since our approach is based on the notion of volumetric walls and slabs instead of single surfaces, the next step is to determine pairs of opposing surfaces forming potential building elements. To this end, a simple pairing procedure is employed. For each surface, we search for a matching, approximately parallel surface with opposing normal orientation within a user-defined distance and angle threshold. If a match is found, the two surfaces are paired to form a wall or slab candidate. It should be noted that a single surface may thus be part of multiple pairs. For surfaces without any matching counterpart, virtual surfaces with a user-defined distance are added to the set of surfaces. This is usually the case for outside walls for which only the inner side has been scanned. This augmentation is important since interior spaces are required to be bounded by volumetric walls or slabs. The generated candidates constitute the set of wall/slab labels $W$. Which of these walls and slabs are contained in the final model is decided by the optimization described in Section 4.8.
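The pairing procedure can be sketched as follows. Surfaces are represented as dictionaries with a unit `normal` and a `point` on the plane, which is a representation chosen for this example. The additional test that the second surface lies behind the first one (so that the pair encloses solid matter rather than a room) is an assumption of this sketch, as are the function names and the handling of unmatched surfaces.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pair_surfaces(surfaces, max_distance, max_angle_deg):
    """Return index pairs of approximately parallel, opposing surfaces."""
    min_cos = math.cos(math.radians(max_angle_deg))
    pairs = []
    for i, a in enumerate(surfaces):
        for j in range(i + 1, len(surfaces)):
            b = surfaces[j]
            # Opposing orientation: normals are approximately anti-parallel.
            if -dot(a["normal"], b["normal"]) < min_cos:
                continue
            # Signed offset of b along a's normal. A negative value means that
            # b lies behind a, i.e. the normals point away from each other.
            offset = dot(a["normal"],
                         [q - p for p, q in zip(a["point"], b["point"])])
            if -max_distance <= offset < 0.0:
                pairs.append((i, j))
    return pairs

def add_virtual_counterparts(surfaces, pairs, thickness):
    """Create a virtual opposing surface for every unpaired surface."""
    paired = {k for pair in pairs for k in pair}
    virtual = []
    for k, s in enumerate(surfaces):
        if k in paired:
            continue
        n = s["normal"]
        virtual.append({
            "normal": tuple(-x for x in n),
            "point": tuple(p - thickness * x for p, x in zip(s["point"], n)),
            "virtual": True,
        })
    return virtual
```

Because every surface is compared against all others, one surface can take part in several pairs, as noted above. Surfaces that remain unpaired, such as the scanned inner side of an exterior wall, receive a virtual counterpart at the user-defined distance.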
4.6 Arrangement of planes
The geometry of the search space for finding an optimal constellation of rooms, walls and slabs is modeled as an arrangement of planes and the 3D cell complex induced thereby. It is constructed by intersecting all (infinite) planes of the wall and slab candidate surfaces with each other. Since vertical walls and horizontal slabs are treated identically, we will hereafter refer to both simply as walls.
Cells of the arrangement are convex, three-dimensional subsets of the space inside and outside of the building. Each cell belongs either to a room, or the outside area. Additionally, walls may be placed in cells that are part of the outside area. Requirements such as a cell belonging to at most one room, or walls occurring only in the outside area (e.g. between rooms), are formulated as constraints for the optimization problem in Section 4.8.
Faces between neighboring cells are convex, two-dimensional subsets of regions on the planes of wall surfaces. Each face may separate different regions (e.g. a room and a wall) from each other.
4.7 Volume and surface priors
For guiding the optimization, two kinds of priors are estimated from the data. First, volumetric priors for the existence of different rooms as well as outside area are estimated for each 3D cell of the arrangement. Second, support by the point cloud data is estimated for each 2D face between neighboring cells.
Preparations
The arrangement consists of cells $C$. For two cells $c, c' \in C$, the notation $c \to c'$ means that $c$ and $c'$ are neighboring and the normal of the separating oriented face $f_{c \to c'}$ points towards $c'$ (Figure 4 a). The set of all oriented faces is denoted as $F$. For brevity, we write $f$ instead of $f_{c \to c'}$ if the specific incident cells are irrelevant.
The set of room labels is $L_R = \{r_1, \ldots, r_{n_R}\}$ with $n_R$ being the number of room clusters as introduced in Section 4.3, and the set of wall labels is $L_W = \{w_1, \ldots, w_{n_W}\}$ with $n_W$ being the number of generated wall candidates as introduced in Section 4.5. We furthermore define an additional outside label $o$. As detailed later, the outside label is used for cells that are not the interior space of a room. The union of rooms and outside labels is denoted $L_{RO} = L_R \cup \{o\}$; the set of all labels is $L = L_{RO} \cup L_W$.
Let $C(w) \subseteq C$ be the set of cells that are contained in wall candidate $w \in L_W$, i.e. all cells that are located between the two surfaces of $w$. Conversely, $W(c) \subseteq L_W$ is the set of walls that contain cell $c$.
For a particular cell pair $c \to c'$, we define the set of walls that are contained in cell $c$ but not in $c'$, i.e.
$$W_{\mathrm{bnd}}(c \to c') = W(c) \setminus W(c').$$
The separating face $f_{c \to c'}$ is called a boundary face of the walls in $W_{\mathrm{bnd}}(c \to c')$. Analogously, we define the set of walls that are contained in both $c$ and $c'$, i.e.
$$W_{\mathrm{in}}(c \to c') = W(c) \cap W(c').$$
The separating face $f_{c \to c'}$ is called an inner face of the walls in $W_{\mathrm{in}}(c \to c')$. These definitions are exemplified in Figure 4 b.
It should be noted that an inner face of a wall $w$ is always the boundary face of another wall $w'$ (which is often approximately perpendicular to $w$). As an example, Figure 4 b shows a face that is an inner face of one wall and a boundary face of a second, intersecting wall. This will become important for the definition of the optimization constraints in Section 4.8.
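The boundary and inner wall sets of a face reduce to a set difference and a set intersection. The following sketch shows this on hypothetical toy data (the cell names, wall names and the `walls_of_cell` mapping are made up for illustration): a cell `c1` in the intersection volume of two walls, a cell `c2` inside one wall only, and a free cell `c3`.

```python
from typing import Dict, Set

# Hypothetical toy data: which wall candidates contain which cell.
walls_of_cell: Dict[str, Set[str]] = {
    "c1": {"w1", "w2"},  # intersection volume of walls w1 and w2
    "c2": {"w1"},        # inside wall w1 only
    "c3": set(),         # free space (room or outside)
}


def boundary_walls(c: str, c_prime: str) -> Set[str]:
    """Walls containing c but not c_prime; the face c -> c_prime bounds them."""
    return walls_of_cell[c] - walls_of_cell[c_prime]


def inner_walls(c: str, c_prime: str) -> Set[str]:
    """Walls containing both cells; the face c -> c_prime lies inside them."""
    return walls_of_cell[c] & walls_of_cell[c_prime]
```

In this toy example, the face between `c1` and `c2` is an inner face of `w1` and at the same time a boundary face of the intersecting wall `w2`, which mirrors the remark above.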
Room and outside priors
To estimate probabilities where different rooms and outside area are located in 3D space, we estimate a volumetric prior function $p(c, l)$ which returns a high value iff a label $l \in L_{RO}$ is likely to occur within a cell $c \in C$. To this end, we perform stochastic ray casting from sampled points in 3D space and average previously computed room labels on surfaces visible from each point. For each cell $c$, $n_c$ random points are sampled within $c$. To draw enough samples for narrow cells, which are very common due to parallel surfaces, $n_c$ is chosen proportional to
[TABLE]
Centered at each sampled point, rays are cast into random directions. The prior $p(c, l)$ is then set to the average over all observed room labels, i.e. the fraction of rays cast from within $c$ that observe label $l$. Rays hitting the back side of surfaces, as well as rays without surface intersections, are counted as outside.
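The averaging step can be sketched as a small Monte Carlo estimate. This is a simplified CPU illustration, not the OptiX-based implementation: the callback `cast_ray(origin, direction)` is a hypothetical stand-in for the actual ray caster and is assumed to return the room label of the first front-facing surface hit, or `None` for a back-side hit or a miss. The number of rays per point is likewise an assumption.

```python
import math
import random
from collections import Counter
from typing import Callable, Dict, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def random_direction(rng: random.Random) -> Vec3:
    """Uniformly distributed unit vector (normalized Gaussian sample)."""
    while True:
        v = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:
            return (v[0] / norm, v[1] / norm, v[2] / norm)


def cell_prior(sample_points: Sequence[Vec3],
               cast_ray: Callable[[Vec3, Vec3], Optional[str]],
               rays_per_point: int = 64,
               seed: int = 0) -> Dict[str, float]:
    """Fraction of rays per observed label; None results count as 'outside'."""
    if not sample_points or rays_per_point <= 0:
        raise ValueError("need at least one sample point and one ray")
    rng = random.Random(seed)
    counts: Counter = Counter()
    for origin in sample_points:
        for _ in range(rays_per_point):
            label = cast_ray(origin, random_direction(rng))
            counts[label if label is not None else "outside"] += 1
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

The returned fractions sum to one for each cell, so they can be read as a distribution over room and outside labels.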
Face support priors
In addition to the volumetric room and outside prior function, we estimate a face support function $s(f)$ which returns a high value iff a face $f \in F$ is supported by the point cloud. This function is later used for selecting probable wall candidates and regularizing the optimization result. To estimate $s(f)$ for a face $f$, we first sample $n_f$ random points within $f$ where $n_f$ is proportional to
[TABLE]
Subsequently, all sampled points are projected onto the surface from which face $f$ was generated in the arrangement. The support $s(f)$ is then set to the ratio of the number of sampled points lying within the support approximated by the occupancy bitmap of the respective surface to the total number of sampled points.
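The support ratio can be sketched as follows. For brevity, the sketch assumes that the face is an axis-aligned rectangle given in the 2D coordinate system of its surface and that the occupancy bitmap is a nested list of booleans starting at the surface origin; actual faces are general convex polygons. The function name `face_support` and the fixed sample count are hypothetical.

```python
import random
from typing import List, Tuple


def face_support(face_rect: Tuple[float, float, float, float],
                 bitmap: List[List[bool]],
                 pixel_size: float,
                 n_samples: int = 1000,
                 seed: int = 0) -> float:
    """Fraction of random face samples that fall on occupied bitmap pixels.

    face_rect is (u_min, v_min, u_max, v_max) in surface coordinates.
    """
    u_min, v_min, u_max, v_max = face_rect
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        u = rng.uniform(u_min, u_max)
        v = rng.uniform(v_min, v_max)
        row, col = int(v // pixel_size), int(u // pixel_size)
        # Samples outside the bitmap extent are treated as unsupported.
        inside = 0 <= row < len(bitmap) and 0 <= col < len(bitmap[0])
        if inside and bitmap[row][col]:
            hits += 1
    return hits / n_samples
```

A face lying entirely on scanned wall surface yields a value close to 1, whereas a face spanning a region without measured points yields a value close to 0.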
4.8 Cell complex optimization
For finding an optimal labeling of all cells, we employ a 0-1 integer linear programming approach in which binary variables for each cell are interpreted as room, outside, and wall label assignments to cells. This approach has the advantage that a set of rules to be fulfilled by any feasible solution can be formulated as hard constraints. Approximate multi-label methods based on e.g. Graph Cuts [43] are more restricted regarding the family of objective functions and constraints that can be used and may fail to find good solutions if the objective is not sufficiently smooth. We first discuss the set of constraints imposed on our model before defining the objective function.
Preparations
Each of the binary variables
$$x_{c,l} \in \{0, 1\}, \quad c \in C,\ l \in L_{RO} \cup W(c)$$
of our optimization is a binary assignment of a label $l$ to a cell $c$. A value of $x_{c,l} = 1$ means that the label is assigned or active. It should be noted that a cell is not necessarily assigned only a single label. In particular, a cell can be assigned the outside label and a nonempty set of wall labels at the same time as defined by the constraints below. Also, cells where walls intersect are assigned all labels of the intersecting walls. We also use the notion of inner and boundary faces as defined in Section 4.7.
Constraint 1. Each cell must be assigned exactly one label from $L_{RO}$, the set of room and outside labels, i.e.
$$\sum_{l \in L_{RO}} x_{c,l} = 1 \quad \forall c \in C.$$
Constraint 2. At boundary faces of room interiors, the room label may only occur on the positive side of the separating face, i.e.
$$x_{c',r} - x_{c,r} \geq 0 \quad \forall f_{c \to c'} \in F,\ r \in L_R,$$
as shown in Figure 4 c. Note that this constraint implies that two different room labels cannot be directly neighboring since this would violate the constraint for one of the room labels. As a consequence, this avoids “paper thin” walls between rooms since they must be separated by outside area, thereby following the physical nature inherent to walls.
Constraint 3. Wall labels may only occur in cells which are assigned the outside label, i.e.
$$x_{c,w} \leq x_{c,o} \quad \forall c \in C,\ w \in W(c).$$
Constraint 4. The boundary faces of room interiors must also be the boundary faces of an active wall, i.e.
$$\sum_{w \in W(c) \setminus W(c')} x_{c,w} \geq x_{c,o} - x_{c',o} \quad \forall f_{c \to c'} \in F,$$
as illustrated in Figure 4 d. This constraint implies that there cannot be a transition between room interior and outside area without activating a wall at all faces where the transition occurs.
Constraint 5. At wall boundaries which occur at inner faces, the wall label must be on the negative side of the respective faces, i.e.
$$x_{c,w} - x_{c',w} \geq 0 \quad \forall f_{c \to c'} \in F,\ w \in W(c) \cap W(c'),$$
as exemplified in Figure 4 e. This constraint is a prerequisite for Constraint 6 as well as for the objective function, both of which require the left-hand side expression to be nonnegative.
Constraint 6. A wall may end at an inner face only if this face is a boundary face of at least one other active wall, i.e.
$$x_{c,w} - x_{c',w} \leq \sum_{w' \in W(c) \setminus W(c')} x_{c,w'} \quad \forall f_{c \to c'} \in F,\ w \in W(c) \cap W(c'),$$
as depicted in Figure 4 f. This constraint enforces that walls are interconnected at their endpoints since it disallows that a wall ends at an inner face without it coinciding with a boundary face of an active wall.
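The six rules can be checked for a given labeling with a few lines of code. The sketch below is a plain feasibility test on a hypothetical three-cell toy complex (a wall cell `c1` between two free cells `c0` and `c2`), not the Gurobi model used in our implementation. The inequality forms are one possible encoding of the verbal statements of Constraints 1-6 and should be read as assumptions; all names are made up for illustration.

```python
from typing import Dict, List, Set, Tuple

OUTSIDE = "o"


def is_feasible(x: Dict[Tuple[str, str], int],
                cells: List[str],
                faces: List[Tuple[str, str]],
                rooms: List[str],
                walls_of_cell: Dict[str, Set[str]]) -> bool:
    """Check Constraints 1-6 for a 0/1 labeling x[(cell, label)].

    Missing keys count as 0. A face (c, cp) has its normal pointing to cp.
    """
    def val(c: str, label: str) -> int:
        return x.get((c, label), 0)

    for c in cells:
        # Constraint 1: exactly one label among rooms and outside.
        if sum(val(c, l) for l in rooms + [OUTSIDE]) != 1:
            return False
        # Constraint 3: walls only in cells labeled outside.
        if any(val(c, w) > val(c, OUTSIDE) for w in walls_of_cell[c]):
            return False
    for c, cp in faces:
        boundary = walls_of_cell[c] - walls_of_cell[cp]
        inner = walls_of_cell[c] & walls_of_cell[cp]
        active_boundary = sum(val(c, w) for w in boundary)
        # Constraint 2: a room label may only sit on the positive side.
        if any(val(c, r) > val(cp, r) for r in rooms):
            return False
        # Constraint 4: outside-to-room transitions need an active wall.
        if active_boundary < val(c, OUTSIDE) - val(cp, OUTSIDE):
            return False
        for w in inner:
            wall_end = val(c, w) - val(cp, w)
            # Constraint 5: a wall ending here must be on the negative side.
            if wall_end < 0:
                return False
            # Constraint 6: it may only end at another active wall's boundary.
            if wall_end > active_boundary:
                return False
    return True


cells = ["c0", "c1", "c2"]
faces = [("c1", "c0"), ("c1", "c2")]  # wall surface normals point into rooms
rooms = ["r1", "r2"]
walls_of_cell = {"c0": set(), "c1": {"w1"}, "c2": set()}

with_wall = {("c0", "r1"): 1, ("c1", "o"): 1, ("c1", "w1"): 1, ("c2", "r2"): 1}
without_wall = {("c0", "r1"): 1, ("c1", "o"): 1, ("c2", "r2"): 1}
touching_rooms = {("c0", "r1"): 1, ("c1", "r2"): 1, ("c2", "r2"): 1}
```

Only `with_wall` passes the check: `without_wall` violates Constraint 4, and `touching_rooms` violates Constraint 2 because two different rooms would be direct neighbors.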
Objective function
To determine the optimal labeling, we define a cost function for a solution $x$ over the cell complex of the form
$$E(x) = -E_{\mathrm{vol}}(x) + \lambda \left( E_{\mathrm{bnd}}(x) + E_{\mathrm{in}}(x) \right) \tag{7}$$
consisting of the following terms. The volumetric room and outside area fitness term $E_{\mathrm{vol}}$ rewards the assignment of the most likely labels for each cell and is defined as
$$E_{\mathrm{vol}}(x) = \sum_{c \in C} \sum_{l \in L_{RO}} x_{c,l} \, p(c, l) \, \mathrm{vol}(c)$$
where $x_{c,l}$ denotes the binary variable for the assignment of label $l$ to cell $c$ and $p(c, l)$ represents the volumetric room and outside prior (Section 4.7), weighted by the volume $\mathrm{vol}(c)$ of cell $c$. Note that this term is included with a negative sign within $E(x)$ such that its value is being maximized. The wall face cost terms $E_{\mathrm{bnd}}$ and $E_{\mathrm{in}}$ penalize placement of walls in terms of the required boundary and inner face areas, respectively, and are weighted by the surface cost weight $\lambda$. This penalty is attenuated for faces with high support. The terms are defined as
[TABLE]
and
[TABLE]
where $x_{c,w}, x_{c',w}$ are the binary variables for the assignment of the wall label $w$ to the cells $c, c'$ respectively, $s(f)$ is the face support prior (Section 4.7), and $\mathrm{area}(f)$ is the area of face $f$. It should be noted that $x_{c,w} - x_{c',w} \geq 0$ due to Constraint 5. Also note that in Equation 9, it suffices to consider $x_{c,w}$ since for a boundary face of wall $w$, $x_{c',w}$ does not exist (i.e. can be considered to be zero).
We then minimize $E(x)$ subject to Constraints 1-6 using the Gurobi Optimizer [44].
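To make the interplay of the cost terms concrete, the following sketch evaluates the cost of two candidate labelings on a hypothetical three-cell toy complex instead of solving the program with Gurobi. Several details are assumptions made for illustration only: the attenuation is modeled as a factor `(1 - support)`, a single weight `lam` scales both wall face terms, and all priors, volumes, areas and names are made up. The default weight of 0.04 is the value reported in Section 5.

```python
from typing import Dict, List, Set, Tuple

Face = Tuple[str, str]  # (c, c_prime), normal pointing towards c_prime


def energy(x: Dict[Tuple[str, str], int],
           cells: List[str],
           faces: List[Face],
           labels_ro: List[str],
           walls_of_cell: Dict[str, Set[str]],
           prior: Dict[str, Dict[str, float]],
           volume: Dict[str, float],
           support: Dict[Face, float],
           area: Dict[Face, float],
           lam: float = 0.04) -> float:
    """Cost of a labeling: negative volume fitness plus weighted wall cost."""
    def val(c: str, label: str) -> int:
        return x.get((c, label), 0)

    e_vol = sum(val(c, l) * prior[c][l] * volume[c]
                for c in cells for l in labels_ro)
    e_bnd = 0.0
    e_in = 0.0
    for c, cp in faces:
        # Assumed attenuation: well-supported faces are cheap.
        penalty = (1.0 - support[(c, cp)]) * area[(c, cp)]
        for w in walls_of_cell[c] - walls_of_cell[cp]:
            e_bnd += val(c, w) * penalty
        for w in walls_of_cell[c] & walls_of_cell[cp]:
            e_in += (val(c, w) - val(cp, w)) * penalty
    return -e_vol + lam * (e_bnd + e_in)


cells = ["c0", "c1", "c2"]
faces = [("c1", "c0"), ("c1", "c2")]
labels_ro = ["r1", "r2", "o"]
walls_of_cell = {"c0": set(), "c1": {"w1"}, "c2": set()}
prior = {
    "c0": {"r1": 0.9, "r2": 0.0, "o": 0.1},
    "c1": {"r1": 0.3, "r2": 0.3, "o": 0.4},
    "c2": {"r1": 0.0, "r2": 0.9, "o": 0.1},
}
volume = {"c0": 10.0, "c1": 1.0, "c2": 10.0}
support = {f: 0.8 for f in faces}
area = {f: 4.0 for f in faces}

two_rooms = {("c0", "r1"): 1, ("c1", "o"): 1, ("c1", "w1"): 1, ("c2", "r2"): 1}
all_outside = {("c0", "o"): 1, ("c1", "o"): 1, ("c2", "o"): 1}

args = (cells, faces, labels_ro, walls_of_cell, prior, volume, support, area)
e_two_rooms = energy(two_rooms, *args)
e_all_outside = energy(all_outside, *args)
```

In this toy setting, the labeling with two rooms separated by an active wall has the lower cost, since the gain in volumetric fitness outweighs the small penalty for the two well-supported wall faces.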
Note that in our experiments, we added the following constraint, which gave a small performance improvement although it is already implied by Constraints 1-2. At boundary faces of outside area, the outside label may only occur on the negative side of the separating face, i.e.
$$x_{c,o} - x_{c',o} \geq 0 \quad \forall f_{c \to c'} \in F.$$
We attribute this slight performance improvement to heuristics used by the particular optimizer implementation.
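The implication can be verified in one line. The derivation below assumes that Constraint 1 is encoded as $\sum_{l \in L_R \cup \{o\}} x_{c,l} = 1$ for every cell and Constraint 2 as $x_{c,r} \leq x_{c',r}$ for every room label $r \in L_R$ and every oriented face whose normal points from $c$ towards $c'$; here $x_{c,l}$ is the binary variable assigning label $l$ to cell $c$ and $o$ is the outside label.

```latex
\begin{align*}
x_{c,o}
  &= 1 - \sum_{r \in L_R} x_{c,r}
     && \text{(Constraint 1 for cell } c\text{)} \\
  &\geq 1 - \sum_{r \in L_R} x_{c',r}
     && \text{(Constraint 2, summed over all } r \in L_R\text{)} \\
  &= x_{c',o}
     && \text{(Constraint 1 for cell } c'\text{)}
\end{align*}
```

Hence the outside label can never occur on the positive side of a face unless it also occurs on the negative side, so the added constraint does not change the set of feasible solutions.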
4.9 Optimization result
The result is an assignment of each cell to either exactly one room or the outside area. Cells which are assigned the outside label may additionally be assigned a nonempty set of walls. On the one hand, this provides a dense segmentation of space into rooms and outside space; volumes to which multiple walls are assigned are (volumetric) intersections of the respective walls. Since the underlying data structure provides adjacency information between all cells, semantic information like room adjacency and wall incidence is immediately available, e.g. for navigation or simulation purposes. On the other hand, this information is closely related to the definition of building elements in BIM formats like IFC, which enables immediate transfer of the results into standard architecture software and integration into existing BIM pipelines.
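As a sketch of how such semantic information could be read off the labeled cell complex, the code below derives wall incidence and room adjacency from cell neighborhoods. The data layout (`room_of_cell`, `active_walls_of_cell`) and the notion that two rooms are adjacent if they are bounded by a common active wall are assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def wall_incidence(faces: List[Tuple[str, str]],
                   room_of_cell: Dict[str, str],
                   active_walls_of_cell: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each active wall to the set of rooms it directly bounds."""
    incidence: Dict[str, Set[str]] = defaultdict(set)
    for c, c_prime in faces:
        # Check both directions: either side of a face may be the room side.
        for wall_cell, room_cell in ((c, c_prime), (c_prime, c)):
            room = room_of_cell.get(room_cell)
            if room is not None:
                for w in active_walls_of_cell.get(wall_cell, ()):
                    incidence[w].add(room)
    return dict(incidence)


def room_adjacency(incidence: Dict[str, Set[str]]) -> Set[Tuple[str, str]]:
    """Pairs of rooms that share at least one bounding wall."""
    pairs: Set[Tuple[str, str]] = set()
    for rooms in incidence.values():
        pairs.update(combinations(sorted(rooms), 2))
    return pairs


faces = [("c1", "c0"), ("c1", "c2")]
room_of_cell = {"c0": "r1", "c2": "r2"}
active_walls_of_cell = {"c1": {"w1"}}
incidence = wall_incidence(faces, room_of_cell, active_walls_of_cell)
adjacent = room_adjacency(incidence)
```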
5 Implementation details
Input point clouds were subsampled to a minimum point distance of 2 cm. Plane detection was performed using a plane distance threshold of 1 cm, a point cluster epsilon of 20 cm, a normal threshold of about ( for the “Case study 2” dataset), minimum support of 1000 points and miss probability of 0.001. Multi-label bitmaps had a resolution (pixel size) of 10 cm, occupancy bitmaps had a resolution of 20 cm. Three ray casting iterations were performed for point cloud cleaning. For automatic labeling, MCL was used with default parameters (inflation set to 2.0) in multi-threaded mode. The surface cost weight in Equation 7 was empirically chosen as 0.04. We used PCL 1.8.1 [45], CGAL 4.12 [42, 46], MCL 14-137 [41], Gurobi 8.0.1 [44], and NVIDIA OptiX 5.0 for GPU-based ray casting under Linux on a 6-core Intel i7 CPU and an NVIDIA GeForce GTX 980 GPU.
6 Evaluation
We evaluate the reconstruction quality and performance of our approach on a variety of datasets and show comparisons with ground-truth IFC and related work. Furthermore, we exemplify the flexibility of our integer linear programming approach by specifying additional constraints to modify and guide the resulting reconstruction in an intuitive manner.
Datasets
We used a variety of real-world datasets and one synthetic dataset for our evaluation. Table 2 shows six multi-story point clouds measured using terrestrial laser scanners. These datasets were provided by The Royal Danish Academy of Fine Arts Schools of Architecture, Design and Conservation (CITA). The table lists properties of the input data including the number of points and scans, as well as quantities derived during reconstruction such as the number of room labels, extracted surfaces, wall candidates, etc. It also shows runtime measurements of the main processing steps. We also tested our approach on publicly available datasets provided by other research groups. Figure 5 shows the dataset “synth3” by the Visualization and MultiMedia Lab at University of Zurich, Figure 10 depicts the dataset “Case study 2” from the ISPRS Benchmark on Indoor Modeling [47], and Figure 9 shows the dataset “Area 3” from the Stanford 3D Large-Scale Indoor Spaces Dataset [48]. We used the latter two for demonstrating different parameters and interactive modification as described below.
Reconstruction quality
Our reconstruction approach generally worked well on the test datasets without any dataset-specific tuning. Automatic outlier removal reliably ignored even large-scale clutter scanned through windows, e.g. in Datasets 1, 3, and 6. In some cases, particularly thick walls (bottom region of Dataset 1, top region of Dataset 2) were reconstructed as two thinner, parallel wall elements, which may be a matter of interpretation. Increasing the maximum thickness of generated wall candidates can help to recognize such structures as single walls. A few cases of room oversegmentation can be observed. In Dataset 4, the large central room is split into a larger L-shaped part (orange) and a smaller room (green, to the right of the building) although no real wall separates the reconstructed rooms in the point cloud data. In Dataset 5, indentations of the central room (orange) were reconstructed as small, separate rooms (cyan, purple). Since our approach currently only considers horizontal ceilings, the slanted ceiling of the staircase in Dataset 6 (yellow, elongated room) is reconstructed as a horizontal structure (see also Limitations below).
Runtime
Total runtime for the reconstruction of the test datasets lies in the range of one minute (Datasets 1, 2) to 10 minutes (Dataset 6). The runtime of primitive detection is mainly dependent on the CGAL implementation, and the time for solving the optimization problem is the runtime of the Gurobi optimizer. The runtime for auto labeling contains the time for our raycasting and clustering using the Markov Cluster Algorithm. Runtime of the optimization mainly depends on the complexity of the plane arrangement, which in turn depends on the number of detected surfaces since every surface introduces global splits in the cell complex. Therefore, a tradeoff between reconstructing details (i.e. small surfaces) and computational feasibility must be made. In our experiments, we thus chose a minimum estimated area of 2 m² for vertical surfaces and 5 m² for horizontal surfaces.
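The growth of the cell complex with the number of surfaces can be made concrete with the classical upper bounds for arrangements in general position. The sketch below computes the maximum number of cells for general planes in 3D and for a 2D line arrangement stacked between horizontal planes; the function names are hypothetical, and actual arrangements of building surfaces contain fewer cells because many planes are parallel.

```python
from math import comb


def max_cells_general(n_planes: int) -> int:
    """Maximum number of cells of n planes in 3D in general position."""
    return sum(comb(n_planes, k) for k in range(4))


def max_cells_stacked(n_vertical: int, n_horizontal: int) -> int:
    """Upper bound for n_vertical vertical planes (lines in the floor plan)
    stacked into n_horizontal + 1 layers by horizontal planes."""
    cells_2d = 1 + n_vertical + comb(n_vertical, 2)
    return cells_2d * (n_horizontal + 1)
```

For example, 10 vertical planes and 3 horizontal planes yield at most 56 * 4 = 224 cells, and each cell contributes one binary variable per room and outside label plus one per containing wall candidate.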
Comparison to IFC
For Dataset 5, a corresponding, professionally made BIM model in IFC format was available. Figure 6 shows a comparison between our reconstruction and the BIM model. Colors of the reconstructed rooms were manually overlaid on the IFC model on the left-hand side. All rooms that were part of the scans mostly match the ground-truth BIM. The upper story of the building is connected to the lower story through a large horizontal opening. These areas were reconstructed as two separate rooms (red and orange) and the railing at the edge of the gallery was reconstructed as walls. The small cyan and purple rooms are an oversegmentation of the upper floor, probably due to the dilated surface support. However, this error can easily be fixed manually.
Comparison to related work
A comparison between reconstructions by the approach described in [5] and our method is shown in Figure 7. In addition to fundamental advantages of our approach such as reconstruction of multiple stories, two crucial differences are particularly notable. First, our approach results in stronger regularization of wall elements where using multiple different, similar walls to represent the building would be unnecessary. The approach in [5] leads to jumps between different, almost coplanar walls (Figure 7, center, black circles) instead of using longer, continuous walls. This can be explained by the principle that the approach separated rooms by wall center lines in 2D such that jumping from one wall to an almost coplanar wall resulted in almost no penalty in the cost function. In our case, a fully volumetric wall element would need to be added to the model to connect the parallel walls, resulting in relatively high costs. Second, the approach in [5] relies on given, separate scans and their positions for estimating an initial room segmentation. This leads to an oversegmentation of the hallway (Figure 7, center, dashed rectangle) since it tries to reconstruct one room per scan. Our method works independently of separate scans and estimates a room segmentation by unsupervised clustering.
Interactive modification
Our integer linear programming approach allows for additional constraints to be easily added. One example for manual post-processing of the reconstruction results by interactively adding hard constraints is shown in Figure 8. In this case, the indentations of the wall surfaces on the left and right sides of the building were lost by the regularization of the model, as can be seen in Figure 8, center. The user has the option to add constraints such as forcing inside area, outside area, wall, no wall, etc. by clicking at the desired location. In this case, the highlighted locations were forced to be outside area. The algorithm then finds the next best option, placing new walls that fulfill all constraints as shown in Figure 8, right.
Another example is shown in Figure 9 where a hallway ends without any terminating wall surface in the input data. Since the algorithm has no wall candidate available, it cannot enclose the protruding room area. By adding a virtual wall candidate by means of simply drawing a line, the algorithm is able to include the protrusion in the reconstructed model, automatically using the perpendicular wall surfaces that are present in the input data.
Different choices of the wall surface penalty weight in Equation 7 control global regularization strength. Figure 10, center shows a reconstruction where some wall and slab elements are slightly misplaced due to relatively strong surface support at windows. Increasing the weight from our default of 0.04 to 0.08 leads to stronger regularization as shown in Figure 10, right.
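The interactive options described above translate into simple linear constraints on the variables of the selected cell. The following is one possible encoding and should be read as an assumption rather than as the exact formulation of our implementation; $x_{c,l}$ denotes the binary variable assigning label $l$ to the clicked cell $c$, $o$ is the outside label, and $W(c)$ is the set of wall candidates containing $c$.

```latex
\begin{align*}
\text{force outside area:} \quad & x_{c,o} = 1 \\
\text{force inside area:}  \quad & x_{c,o} = 0 \\
\text{force wall:}         \quad & \sum_{w \in W(c)} x_{c,w} \geq 1 \\
\text{force no wall:}      \quad & \sum_{w \in W(c)} x_{c,w} = 0
\end{align*}
```

Since every cell must carry exactly one room or outside label, setting $x_{c,o} = 0$ leaves the choice of the room label to the optimizer, and forcing a wall implicitly forces the outside label because wall labels may only occur in outside cells.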
Limitations
One technical limitation of our current implementation is that slanted walls, floors or ceilings are not taken into account although this is not an inherent limitation of our approach. The reason for our decision not to include these elements is that the construction of the 3D cell complex needs to be exact to guarantee the integrity of the data structure (e.g. cell neighborhood). Unfortunately, computing the cell complex in 3D induces numerical problems and currently we do not have a stable implementation for this task at our disposal. We thus opted to use the numerically stable implementation of 2D arrangements in CGAL [46] and extend it to 3D by stacking 2D arrangements separated by horizontal planes. A numerically stable extension of arrangements supporting general slanted planes would be an interesting direction for future research which we consider to be outside the scope of this paper. Processing of very large datasets may also require further optimizations to make them computationally feasible. In particular, using a global plane arrangement results in a large increase of cells and thus variables in the optimization model with every additional detected surface. More sophisticated selection of potential surfaces, and improved optimization methods, e.g. splitting the problem into smaller subproblems, are targets for further research. Last but not least, our current algorithm is not able to identify and include important architectural structures overarching the whole building like the pillars that are included in the hand-crafted model in Figure 6. Automatically identifying such structural elements and incorporating them into the automatic reconstruction is also an interesting direction for future research.
7 Conclusion and future work
We have presented a novel approach to tackle the indoor building reconstruction problem from point clouds using integer linear programming. In contrast to previous methods, our approach reconstructs fully volumetric, interconnected wall entities and room topology on multi-story buildings with weak assumptions on the input data. The resulting models are very close to the requirements needed for Building Information Modeling tasks including volumetric representations of room spaces and wall entities, and their interrelations. Additional hard constraints such as forcing or avoiding certain entities at chosen locations may simply be added as constraints of the optimization problem. We demonstrated our approach on a variety of real-world datasets.
Future work for our proposed method includes the extension of the plane arrangement data structure to support slanted surfaces and possibly non-planar primitives. Strategies for reducing computational complexity by e.g. pruning invalid surface and wall candidates early in the process would improve applicability to larger-scale datasets. Also, connecting our reconstruction methodology with e.g. opening and object detection approaches would further enrich the resulting models.
Acknowledgments
We acknowledge the Visualization and MultiMedia Lab at University of Zurich (UZH) and Claudio Mura for the acquisition of the 3D point clouds, and UZH as well as ETH Zürich for their support to scan the rooms represented in these datasets. Their datasets were used in our evaluation (Figure 5). We also used datasets provided by The Royal Danish Academy of Fine Arts Schools of Architecture, Design and Conservation (CITA) (Table 2), from The ISPRS Benchmark on Indoor Modeling [47] (Figure 10), and from the Stanford 3D Large-Scale Indoor Spaces Dataset [48] (Figure 9). This work was supported by the DFG projects KL 1142/11-1 (DFG Research Unit FOR 2535 Anticipating Human Behavior) and KL 1142/9-2 (DFG Research Unit FOR 1505 Mapping on Demand).
- [1] Sanchez, V, Zakhor, A. Planar 3D modeling of building interiors from point cloud data. In: Image Processing (ICIP), 2012 19th IEEE International Conference on. IEEE; 2012, p. 1777–1780.
- [2] Oesau, S, Lafarge, F, Alliez, P. Indoor scene reconstruction using feature sensitive primitive extraction and graph-cut. ISPRS Journal of Photogrammetry and Remote Sensing 2014;90:68–82.
- [3] Turner, E, Cheng, P, Zakhor, A. Fast, automated, scalable generation of textured 3D models of indoor environments. IEEE Journal of Selected Topics in Signal Processing 2015;9(3):409–421.
- [4] Mura, C, Mattausch, O, Pajarola, R. Piecewise-planar reconstruction of multi-room interiors with arbitrary wall arrangements. In: Computer Graphics Forum; vol. 35. Wiley Online Library; 2016, p. 179–188.
- [5] Ochmann, S, Vock, R, Wessel, R, Klein, R. Automatic reconstruction of parametric building models from indoor point clouds. Computers & Graphics 2016;54:94–103. URL: http://www.sciencedirect.com/science/article/pii/S0097849315001119. doi:10.1016/j.cag.2015.07.008; Special Issue on CAD/Graphics 2015.
- [6] Murali, S, Speciale, P, Oswald, MR, Pollefeys, M. Indoor Scan2BIM: Building information models of house interiors. In: Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on. IEEE; 2017, p. 6126–6133.
- [7] Liu, C, Wu, J, Furukawa, Y. FloorNet: A unified framework for floorplan reconstruction from 3D scans. arXiv preprint arXiv:1804.00090; 2018.
- [8] Budroni, A, Boehm, J. Automated 3D reconstruction of interiors from point clouds. International Journal of Architectural Computing 2010;8(1):55–73.
