Spatio-thermal depth correction of RGB-D sensors based on Gaussian   Processes in real-time

Christoph Heindl; Thomas P\"onitz; Gernot St\"ubl; Andreas; Pichler; Josef Scharinger

arXiv:1907.00549·cs.CV·July 2, 2019

Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time

Christoph Heindl, Thomas P\"onitz, Gernot St\"ubl, Andreas, Pichler, Josef Scharinger

PDF

Open Access 1 Repo

TL;DR

This paper introduces a real-time method using Gaussian Processes to correct RGB-D sensor depth inaccuracies caused by spatial and thermal effects, enhancing their reliability for computer vision and robotics applications.

Contribution

It presents a novel Gaussian Process-based calibration method that jointly models spatial and thermal influences for RGB-D sensors, enabling real-time dense depth correction.

Findings

01

Achieves accurate depth correction considering thermal effects

02

Operates in real-time leveraging GPU acceleration

03

Provides publicly available dataset and source code

Abstract

Commodity RGB-D sensors capture color images along with dense pixel-wise depth information in real-time. Typical RGB-D sensors are provided with a factory calibration and exhibit erratic depth readings due to coarse calibration values, ageing and thermal influence effects. This limits their applicability in computer vision and robotics. We propose a novel method to accurately calibrate depth considering spatial and thermal influences jointly. Our work is based on Gaussian Process Regression in a four dimensional Cartesian and thermal domain. We propose to leverage modern GPUs for dense depth map correction in real-time. For reproducibility we make our dataset and source code publicly available.

Tables2

Table 1. Table 1: \tensor ∗ [ t min p ] 𝒟 − rgb \tensor ∗ [ t p ] 𝒟 rgb \tensor*[^{p}_{t_{\mathrm{min}}}]{\mathcal{D}}{{}_{\mathrm{rgb}}}-\tensor*[^{p}_{t}]{\mathcal{D}}{{}_{\mathrm{rgb}}} RMSE before and after correction in Cartesian space across all temperature and distance captures. The correction yields an improvement by one order of magnitude.

	$x$ (mm)	$y$ (mm)	$z$ (mm)
RMSE before correction	5.7	3.7	16.0
RMSE after correction	0.7	0.5	2.2

Table 2. Table 2: Execution times for correction per depth map frame of size 640 times 480. The GPU optimized version outperforms the other variants.

	Time (s)	FPS (1/s)
CPU	20	0.05
GPU naïve	0.4	2.5
GPU optimized	0.14	7.1

Equations26

X (i, j, d) = d K^{- 1} h

X (i, j, d) = d K^{- 1} h

x

x

x^{'}

D^{'} (x_{0}^{'}, x_{1}^{'})

p (y_{⋆} ∣ X_{⋆}, X, y) = \int p (y_{⋆} ∣ F, X_{⋆}) p (F ∣ X, y) d F

p (y_{⋆} ∣ X_{⋆}, X, y) = \int p (y_{⋆} ∣ F, X_{⋆}) p (F ∣ X, y) d F

[y y_{⋆}] \sim N ([M (X) M (X_{⋆})], [Σ (X, X) Σ (X, X_{⋆})^{T} Σ (X, X_{⋆}) Σ (X_{⋆}, X_{⋆})])

[y y_{⋆}] \sim N ([M (X) M (X_{⋆})], [Σ (X, X) Σ (X, X_{⋆})^{T} Σ (X, X_{⋆}) Σ (X_{⋆}, X_{⋆})])

p (y_{⋆} ∣ X_{⋆}, X, y)

p (y_{⋆} ∣ X_{⋆}, X, y)

u_{⋆}

K_{⋆}

Σ (x_{i}, x_{j}) = σ_{s}^{2} e^{- 0.5 (x_{i} - x_{j})^{T} W (x_{i} - x_{j})} + δ_{ij} σ_{y}^{2}

Σ (x_{i}, x_{j}) = σ_{s}^{2} e^{- 0.5 (x_{i} - x_{j})^{T} W (x_{i} - x_{j})} + δ_{ij} σ_{y}^{2}

D_{⋆} (i, j) = D (i, j) + Δ_{ij}

D_{⋆} (i, j) = D (i, j) + Δ_{ij}

x_{ij} = x y z t = [D (i, j) K^{- 1} [i, j, 1]^{T} t] .

x_{ij} = x y z t = [D (i, j) K^{- 1} [i, j, 1]^{T} t] .

y_{ij} = D_{gt} (i, j) - D_{obs} (i, j) .

y_{ij} = D_{gt} (i, j) - D_{obs} (i, j) .

∥ x_{i} - x_{j} ∥^{2} = (x_{i} - x_{j})^{T} (x_{i} - x_{j}) = x_{i}^{T} x_{i} - 2 x_{i}^{T} x_{j} + x_{j}^{T} x_{j} .

∥ x_{i} - x_{j} ∥^{2} = (x_{i} - x_{j})^{T} (x_{i} - x_{j}) = x_{i}^{T} x_{i} - 2 x_{i}^{T} x_{j} + x_{j}^{T} x_{j} .

\tensor * [_{t}^{p}] D_{delta} = \tensor * [_{t_{min}}^{p}] D_{rgb} - \tensor * [_{t}^{p}] D_{rgb} .

\tensor * [_{t}^{p}] D_{delta} = \tensor * [_{t_{min}}^{p}] D_{rgb} - \tensor * [_{t}^{p}] D_{rgb} .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

cheind/rgbd-correction
tfOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsIndustrial Vision Systems and Defect Detection · Image Processing Techniques and Applications · Advanced Vision and Imaging

MethodsGaussian Process

Full text

Spatio-thermal depth correction of RGB-D sensors based on Gaussian Processes in real-time

Christoph Heindl

Profactor GmbH, Im Stadtgut A2, 4407 Steyr-Gleink, Austria

Thomas Pönitz

Profactor GmbH, Im Stadtgut A2, 4407 Steyr-Gleink, Austria

Gernot Stübl

Profactor GmbH, Im Stadtgut A2, 4407 Steyr-Gleink, Austria

Andreas Pichler

Profactor GmbH, Im Stadtgut A2, 4407 Steyr-Gleink, Austria

Josef Scharinger

JKU Department of Computational Perception, Altenbergerstr. 69, 4040 Linz, Austria

Abstract

Commodity RGB-D sensors capture color images along with dense pixel-wise depth information in real-time. Typical RGB-D sensors are provided with a factory calibration and exhibit erratic depth readings due to coarse calibration values, ageing and thermal influence effects. This limits their applicability in computer vision and robotics. We propose a novel method to accurately calibrate depth considering spatial and thermal influences jointly. Our work is based on Gaussian Process Regression in a four dimensional Cartesian and thermal domain. We propose to leverage modern GPUs for dense depth map correction in real-time. For reproducibility we make our dataset and source code publicly available.

keywords:

RGB-D, Thermal, Depth Calibration, Gaussian Process Regression, GPU

1 INTRODUCTION

Since their introduction in 2010 active structured-light RGB-D cameras have been widely applied to computer vision and robotic applications [1]. Newcombe et al.[2] successfully employed RGB-D cameras in real-time 3D surface reconstructions. Later this work was extended by Heindl et al.[3] to generate photorealistic reconstructions by aligning depth and color streams. Combining features from depth and color images robustified object recognition[4] and led to a transition from stereo vision to RGB-D cameras as the main imaging device in indoor robotics[5].

RGB-D cameras provide simultaneous streams of color and dense depth information. Color images are typically provided by standard RGB cameras, while depth readings are generated by a separate time-of-flight or active structured infrared light sensor. Both devices are usually assembled together in a compact housing. A default factory calibration encompassing intrinsic and extrinsic camera parameters is stored within the device memory. The quality of calibration is often impractical for computer vision applications with high accuracy demands for the following reasons: the assumed pinhole camera model together with radial and tangential distortion coefficients does not capture the true underlying nonlinear deformations found in depth maps well. Additionally, strong thermal dependencies on depth readings undermine the effectiveness of a single static mono/stereo calibration.

In this paper we propose a novel calibration method that corrects depth readings based on a nonlinear model that spans the spatial and thermal domain. We frame depth correction as a probabilistic regression problem and show that Gaussian Process Regression (GPR) is well suited for capturing the underlying systematic errors. For dense real-time correction we propose to leverage the fact that GPR prediction is mostly linear algebra and is therefore well suited to be parallelized on modern GPU architectures. Python code alongside with a captured dataset is made publicly available111https://github.com/cheind/rgbd-correction.

2 RELATED WORK

The availability of modern commodity RGB-D sensors raised the interest in methods for accurate calibration thereof. Typical approaches [6, 7] use a calibration target together with intrinsic and extrinsic parameter estimation methods, such as the one proposed by Zhang [8]. Since one usually has no access to the internals of the depth generation process, these attempts are insufficient for accurate depth correction, because they merely correct for lens effects. In contrast, the work of Smisek et al. [9] assumes planarity in the observed data. They propose to correct depth by computing a pixel-wise mean residual depth image over all calibration poses. Zhang et al. [10] exploit co-planarity in the structure of the calibration target and propose to model the observed depth as a linear function of the true depth. In the work of Canessa et al. [11] a second order polynomial per pixel is proposed to compensate for depth artefacts. The closest work compared to ours, in terms of applied methods, is by Amamra et. al [12], who use a Gaussian Process to predict absolute depth from spatial locations. More recently online depth calibration methods try to compensate depth errors using a visual SLAM system[13].

Several research groups documented a strong thermal influence on the depth generation process of RGB-D sensors. Mankoff et al. [14] note a severe depth shift due to internal or external thermal changes. Fiedler et al. [15] document a nonlinear distortion effects during thermal changes. They also propose practical rules of thumb on reducing accuracy errors caused by thermal conditions.

To our knowledge we are the first to propose a practical, real-time, pixel-wise depth correction method for RGB-D cameras that considers spatial and thermal aspects jointly.

3 DEPTH CORRECTION

Throughout this work we use lower-case non-bold characters $x$ to denote scalars, bold-faced lower-case characters $\mathbf{x}$ represent column-vectors and upper-case bold characters $\mathbf{A}$ for matrices. $\mathbf{x}_{i}$ denotes the $i$ -th element of $x$ , $\mathbf{A}_{ij}$ the $i$ -th row and $j$ -th column of $\mathbf{A}$ . Functions will be written in calligraphic letters $\mathcal{F}$ .

3.1 Dense depth maps

RGB-D cameras provide depth readings in dense image form $\mathcal{D}\colon\mathbb{Z}^{2}\rightarrow\mathbb{R}$ . Depth values can be reprojected into three-dimensional Cartesian space through a function $\mathcal{X}\colon\mathbb{Z}^{2}\times\mathbb{R}\rightarrow\mathbb{R}^{3\times 1}$ given by

[TABLE]

where $\mathbf{K}\in\mathbb{R}^{3\times 3}$ is the associated camera intrinsic matrix and $\mathbf{h}=[i,j,1]^{T}$ is a point in homogeneous image space. Typically one is interested in alignment of depth and RGB images. This can be accomplished by transforming the entire depth map into the RGB camera space as follows

[TABLE]

where $\mathbf{R}\in\mathbb{R}^{3\times 3}$ and $\mathbf{t}\in\mathbb{R}^{3\times 1}$ are parameters of the rigid extrinsic transformation matrix, $\mathbf{K}^{\prime}\in\mathbb{R}^{3\times 3}$ is the intrinsic camera matrix of the target sensor. Extrinsic camera parameters are either obtained by stereo camera calibration or are directly provided by the device itself. We do not expect intrinsic nor extrinsic matrices to be very accurate for our approach to work.

3.2 Gaussian Process Regression

A Gaussian Process (GP)[16, 17] is a statistical model used in supervised machine learning for regression and classification tasks. In regression, one searches for a unknown function $\mathcal{F}(\mathbf{X})=\mathbf{y}+\epsilon$ that best approximates tuples of feature vectors $\mathbf{X}\in\mathbb{R}^{N\times d}$ , and target scalar observations $\mathbf{y}\in\mathbb{R}^{N\times 1}$ . The term $\epsilon\sim\mathcal{N}(0,\sigma_{y}^{2})$ is the usual i.i.d. Gaussian observation noise. In a probabilistic view, the expectation of the posterior predictive distribution is used to infer unknown values $\mathbf{y}_{\star}$ from feature vectors $\mathbf{X_{\star}}$ . The posterior predictive takes on the following general form

[TABLE]

where $\mathbf{y}_{\star}\in\mathbb{R}^{M\times 1}$ and $\mathbf{X_{\star}}\in\mathbb{R}^{M\times d}$ . Typically, integrating over all possible functions is intractable. Hence, Gaussian Processes make the assumption that training and predictions are jointly Gaussian

[TABLE]

where $\mathcal{M}\colon\mathbb{R}^{K\times d}\rightarrow\mathbb{R}^{K\times 1}$ is the mean function and $\Sigma\colon\mathbb{R}^{K\times d}\times\mathbb{R}^{L\times d}\rightarrow\mathbb{R}^{K\times L}$ is a positive-definite function. Then, from Gaussian conditioning rules [18] one gets the posterior predictive in closed form by conditioning on $\mathbf{X_{\star}}$ , $\mathbf{X}$ and $\mathbf{y}$

[TABLE]

The covariance function $\Sigma$ encodes a distance or similarity measure between input samples. For two samples $i$ , $j$ , $\Sigma$ is given by $\Sigma_{ij}=\sigma_{s}^{2}\mathcal{K}(\mathbf{x_{i}},\mathbf{x_{j}})+\delta_{ij}\sigma_{y}^{2}$ , with $\mathcal{K}$ being a positive definite kernel function. Multiple kernel functions are available[19]. Choosing one is highly problem specific. In this work we focus on the smooth radial basis function (RBF) a.k.a. the squared exponential kernel which, in the multidimensional input case, is given by

[TABLE]

where $\sigma_{s}^{2}$ is the signal variance regularizing the allowed deviation from the mean, $\mathbf{W}\in R^{d\times d}$ is an input domain coordinate scaling matrix, and $\sigma_{y}^{2}$ is the expected noise level. We restrict our considerations to diagonal versions of $\mathbf{W}$ . The effects of varying these parameters are illustrated in [17].

3.3 Approach for Spatio-Thermal Correction

Our approach for correcting depth maps is based on estimating a per-pixel depth correction offset. For a given depth map $\mathcal{D}$ and temperature reading $t$ we compute a corrected depth map $\mathcal{D_{\star}}$ by

[TABLE]

where $\Delta_{ij}$ is given by a Gaussian Process Regression $\Delta_{ij}=\mathcal{G}(\mathbf{x}_{ij})$ . The input $\mathbf{x}$ to the Gaussian Process $\mathcal{G}$ is given by reprojecting the pixel coordinate to Cartesian space using pixel-wise (potentially distorted) depth information and stacking the result with $t$ to create a four-dimensional input feature vector

[TABLE]

The corresponding regression target at training time $y_{ij}$ is given by the pixel-wise difference between a ground-truth depth map $\mathcal{D}_{\mathrm{gt}}(i,j)$ and observed depth map $\mathcal{D}_{\mathrm{obs}}(i,j)$ , where

[TABLE]

Regressing relative correction offsets leads to predictions that vary smoothly accross the input domain. This justifies our usage of the squared exponential kernel given in Equation 6, which has a strong prior on smooth functions. Note, this is in contrast to the approach taken in Amamra et al.[12], who regress absolute depth using a similar kernel. However, in our experience real world depth maps rarely exhibit smooth behaviour due to strong depth discontinuities along object boundaries.

At training time we sample feature vectors and regression values in the spatio-thermal domain using a regular spaced grid. At test time we compute for every four dimensional query point $\mathbf{x}_{ij}$ the optimal prediction $\Delta_{ij}=\mathbf{u}_{\star\mathrm{ij}}$ and confidence $\mathbf{K_{\star\mathrm{ij}}}$ according to Equation 5. In turn $\Delta_{ij}$ is used to construct the corrected depth map $\mathcal{D_{\star}}(i,j)$ according to Equation 7.

4 EVALUATION

This section covers the experiments which have been performed to prove the applicability of the approach.

4.1 Hardware Setup

Figure 1 is a schematic overview of the hardware set-up used for data acquisition. As RGB-D sensor an Orbbec Astra Mini S222https://orbbec3d.com/astra-mini/ is used, since it is a typical commodity sensor without special housing. The sensor looks towards a chessboard calibration pattern and is mounted on an electronic linear axis to adjust the distance. Directly connected to the heat sink of the sensor is a Peltier element which is used for heating and cooling. Due to the fact that all temperature-critical elements of the sensor are connected to the heat pipe, it is assumed that thermostabilization of the heat pipe directly controls the temperature of the main camera components, e.g image sensors and projector.

The sensor of the temperature control loop is directly thermally coupled to the heat pipe, which is of special importance. In a future application, no control loop is needed and the sensor is the only additional hardware which is required to apply the proposed correction approach. Since the housing of an Orbbec Astra Mini S serves as heat pipe, there is no need to modify any sensor hardware in an application.

The evaluation itself is run on a workstation with an Intel i7-7700 @ 3.60 GHz, 16 GB RAM, and a NVIDIA GeForce GTX1080 graphics card with 6 GB memory.

4.2 Spatio-thermal RGB-D dataset

Using the hardware setup described above we captured the following data: for each temperature level, ranging from 10 to 35 *∘*C in steps 1 *∘*C, we moved the axis to six different positions spaced 10 cm apart. The covered distance between sensor and calibration target ranges from 0.4 to 1 meter, which matches the operating specifications of the sensor. At each calibration position we captured 50 RGB and 50 depth frames. Depth data was captured aligned with color data as described in Section 3.1. From each color frame an artificial dense depth map was computed from known object geometry and planarity assumptions.

4.3 Implementation Details

For numerical stability it is advised to avoid performing the direct matrix inversion of the covariance block matrix in Equation 5[16]. A robust alternative is to pre-factor the positive-definite covariance matrix using Cholesky decomposition $\Sigma(\mathbf{X},\mathbf{X})^{-1}=\mathbf{L}\mathbf{L}^{T}$ , and then use $\mathbf{L}$ and $\mathbf{L}^{-1}$ for subsequent calculations.

For our parallel GPU implementation we leverage the TensorFlow[20] framework to obtain optimized tensor-products on GPUs. Computing the posterior predictive from Equation 5 decomposes nicely in matrix products. Additionally, the kernel formulation is already partly vectorized, see Equation 6. Also note, that $\mathbf{L}$ and $\mathbf{L}^{-1}$ can be pre-computed on the CPU for a given training data set. The GPU then reads these values as constant matrices into GPU memory.

We experimented with different versions of computing the kernel function. We found that the type of functions used had a heavy impact on execution speed. The best performance improvement compared to running on CPU was found to be one that makes explicit use of the following equivalence

[TABLE]

Additionally preloading data onto the GPU, while computing correction for previous batches, helps significantly in reducing GPU stalling due to I/O waits. For details consult our source code.

5 RESULTS

In order to validate our findings, we captured depth data according to the procedure described in Section 4.2. The sensor has undergone a standard intrinsic and extrinsic calibration before capturing.

For the remainder of this work we refer to the device provided pixel-wise mean depth map at axis position $p$ , temperature $t$ by $\tensor*[^{p}_{t}]{\mathcal{D}}{{}_{\mathrm{ir}}}$ . The mean depth map is computed by averaging over multiple captures (50 in our case) per axis position and temperature. Similarly, we refer to the artificial generated mean depth map from color images by $\tensor*[^{p}_{t}]{\mathcal{D}}{{}_{\mathrm{rgb}}}$ . Note that we assume $\tensor*[^{p}_{t}]{\mathcal{D}}{{}_{\mathrm{ir}}}$ to be pre-aligned with the RGB device by means of methods presented in Section 3.1.

Figure 2 illustrates the thermal influences on depth readings for a fixed position $\hat{p}$ . The two plots are made from two small opposite image regions (top-left, bottom-right). One can clearly see the depth drift associated with $\tensor*[^{\hat{p}}_{t}]{\mathcal{D}}{{}_{\mathrm{ir}}}$ is up to 4-5 times larger than the depth drift of $\tensor*[^{\hat{p}}_{t}]{\mathcal{D}}{{}_{\mathrm{rgb}}}$ . While the depth from RGB drift appears linear with increasing temperature, the depth readings from $\tensor*[^{\hat{p}}_{t}]{\mathcal{D}}{{}_{\mathrm{ir}}}$ behave highly nonlinear. Additionally, the curves tend to flatten as we decrease the distance $p$ to the calibration target and escalate as we move further away from it.

For Gaussian Process Regression training, we sample input feature vectors $\mathbf{X}$ from a regularly spaced Cartesian 3D grid in meters, every three degrees Celsius. The corresponding target values $\mathbf{y}$ are then computed from the depth difference map given by

[TABLE]

That is, we always calibrate with respect to depth from color at the lowest possible temperature level. In total our training data consists of $\mathbf{X}\in\mathbb{R}^{5000\times 4}$ elements and $\mathbf{y}\in\mathbb{R}^{5000\times 1}$ elements. We then auto-tune the kernel hyper-parameters $\begin{bmatrix}\mathbf{W},\sigma_{y},\sigma_{s}\end{bmatrix}$ using gradient descent of the negative log-marginal-likelihood[17], yielding the following optimal parameters for our dataset: $diag(\mathbf{W})$ is given by $\begin{bmatrix}2.0,0.04,1.29,0.002\end{bmatrix}$ , $\sigma_{y}=0.044$ and $\sigma_{s}=0.031$ for our dataset. As expected, our formulation of $\tensor*[^{p}_{t}]{\mathcal{D}}{{}_{\mathrm{delta}}}$ results in high importance along the temperature domain, making samples far apart in the temperature domain still influential on the regressed depth correction.

Applying the proposed Gaussian Process Regression reduces the error significantly. The plots in Figure 3 show the pixel-wise error $\tensor*[^{p}_{t_{\mathrm{min}}}]{\mathcal{D}}{{}_{\mathrm{rgb}}}-\tensor*[^{p}_{t}]{\mathcal{D}}{{}_{\mathrm{rgb}}}$ before and after application of the regression for a selected set of positions and temperatures. For areas of missing depth data no predictions can be made. Those pixels appear as white speckles in the illustrations. A numerical evaluation spanning the entire distance and temperature range is provided in Table 1. One can see that the correction yields an improvement by one order of magnitude.

Table 2 compares execution times for CPU and GPU variants on dense depth maps consisting of over 300,000 points as described in Section 4.3. The GPU variants clearly outperform an optmized CPU implementation. When possible, one should prefetch data to reduce wait times and enable real-time performance.

To show generality of our findings we captured and evaluated data from two equivalent sensors. However, the results differed only marginally.

6 CONCLUSION AND FUTURE WORK

In this paper we demonstrated the thermal influence on active structured-light RGB-D cameras based on Orbbec Astra Mini S sensors. A novel depth-delta calibration method, based on probabilistic regression in the spatial and thermal domains, was presented. We showed that Gaussian Progress Regression is well suited to correct depth errors under spatio-thermal changes. The comprehensive evaluation of the proposed approach on a novel dataset demonstrates a reduction of the root mean squared error by one order of magnitude. A further comparison of CPU and GPU implementations proves the real-time performance capabilities of our approach. The dataset and source code was made publicly available.

Possible future work concerns the evaluation of sensors such as time-of-flight based models. Regarding performance, a scale-space approach could enable real-time performance directly on embedded systems. Additionally, fixed point or half-precision variants could be examined to further relax computational runtime requirements.

7 ACKNOWLEDGEMENTS

This research is funded by the projects Lern4MRK (Austrian Ministry for Transport, Innovation and Technology), and AssistMe (FFG, 848653), as well as the European Union in cooperation with the State of Upper Austria within the project Investition in Wachstum und Beschäftigung (IWB). The authors want to thank Mr. Gerhard Ebenhofer for fruitful discussions and help during the data acquisition.

Bibliography20

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] Zhang, Z., “Microsoft Kinect sensor and its effect,” Multi Media 19 (2), 4–10 (2012).
2[2] Newcombe, R. A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A. J., Kohi, P., Shotton, J., Hodges, S., and Fitzgibbon, A., “Kinectfusion: Real-time dense surface mapping and tracking,” in [ 10th International Symposium on Mixed and Augmented Reality (ISMAR) ], 127–136, IEEE (2011).
3[3] Heindl, C., Akkaladevi, S. C., and Bauer, H., “Capturing photorealistic and printable 3D models using low-cost hardware,” in [ International Symposium on Visual Computing (ISVC) ], 507–518, Springer (2016).
4[4] Lai, K., Bo, L., Ren, X., and Fox, D., “Sparse distance learning for object recognition combining RGB and depth information,” in [ International Conference on Robotics and Automation (ICRA) ], 4007–4013, IEEE (2011).
5[5] El-laithy, R. A., Huang, J., and Yeh, M., “Study on the use of Microsoft Kinect for robotics applications,” in [ Position Location and Navigation Symposium (PLANS) ], 1280–1288, IEEE/ION (2012).
6[6] Konolige, K. and Mihelich, P., “Technical description of Kinect calibration. 2012,” (2012). http://wiki.ros.org/kinect_calibration/technical.
7[7] Burrus, N., “Kinect calibration,” (2011). http://nicolas.burrus.name/index.php/Research/Kinect Calibration.
8[8] Zhang, Z., “A flexible new technique for camera calibration,” IEEE Transactions on pattern analysis and machine intelligence (TPAMI) 22 (11), 1330–1334 (2000).