On the covariance of X in AX = XB
Huy Nguyen, Quang-Cuong Pham

TL;DR
This paper derives a rigorous method to compute the covariance of the hand-eye calibration transformation in robot vision, enhancing the understanding of uncertainty in robot perception tasks.
Contribution
It introduces a covariance propagation approach in SE(3) for the AX=XB problem, providing precise uncertainty estimates for the transformation.
Findings
Accurately predicts covariance of hand-eye transformation.
Validated with synthetic and real data.
Offers a tool for high-precision robot perception applications.
Abstract
Hand-eye calibration, which consists in identifying the rigid-body transformation between a camera mounted on the robot end-effector and the end-effector itself, is a fundamental problem in robot vision. Mathematically, this problem can be formulated as: solve for X in AX = XB. In this paper, we provide a rigorous derivation of the covariance of the solution X, when A and B are randomly perturbed matrices. This fine-grained information is critical for applications that require a high degree of perception precision. Our approach consists in applying covariance propagation methods in SE(3). Experiments involving synthetic and real calibration data confirm that our approach can predict the covariance of the hand-eye transformation with excellent precision.
The authors are with the School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore 639798. Corresponding author: Huy Nguyen (email: [email protected]).
Index Terms:
Hand-eye calibration, Uncertainty, Calibration and Identification
I Introduction
Hand-eye calibration, which consists in identifying the rigid-body transformation between a camera (eye) mounted on the robot end-effector and the end-effector (hand) itself, is a fundamental problem in robot vision. Mathematically, this problem can be formulated as: solve for X in AX = XB, where X is the unknown hand-eye transformation matrix and A and B are known transformation matrices (see details in Section II-A). Starting from the late 1980s, a large body of literature has been devoted to this problem, and a number of efficient methods have been developed, see e.g. [1, 2, 3, 4, 5, 6].
In this paper, we are interested not merely in solving for X but, more comprehensively, in evaluating the covariance of X from those of A and B, where A and B are now randomly perturbed transformation matrices. This fine-grained information is critical in high-precision robotics applications for several reasons.
Motivations
The uncertainty of the object pose estimation comes from three main sources: (i) the uncertainty of the object pose estimation in the camera frame, (ii) the uncertainty of the hand-eye calibration, and (iii) the uncertainty of the robot end-effector positioning. In practice, source (ii) arguably contributes the most: for instance, a tiny orientation error of 0.05 degrees in the hand-eye calibration already implies an error of 0.6 mm in object position if the latter is 70 cm away from the camera (typical viewing distance for commodity 3D cameras). In turn, having a precise knowledge of the uncertainty of the object pose estimation is critical:
- In high-precision manufacturing, it is important not only to know the pose of an object, but also to guarantee that the pose estimation error is within some tolerance. For instance, when drilling holes in the fuselage of an aircraft, the hole position tolerance is 0.5 mm, which would be violated by an error of 0.05 degrees in the hand-eye calibration, even when assuming that the object pose estimation in the camera frame is perfect (see above);
- The precise knowledge of the object pose covariance matrix allows one to intelligently refine the object pose estimation by other perception modes. For instance, in visuo-tactile sensor fusion [7], knowing that the covariance of the object pose is comparatively large in the translation along, say, the X-axis will prompt us to touch the object along that axis in order to best reduce the uncertainty.
In addition, knowing the covariance of X allows improving the calibration process itself, by e.g. choosing the appropriate number of measurements to achieve a desired level of precision, or choosing the appropriate matrices A and B that minimize the covariance of X.
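The lever-arm figure quoted in the motivation (0.05 degrees at 70 cm giving about 0.6 mm) can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name `position_error_mm` is hypothetical, and the model (a pure rotation error about an axis perpendicular to the viewing direction) is an assumption made for illustration.

```python
import math

def position_error_mm(angle_error_deg, distance_mm):
    """Displacement of a point at `distance_mm` caused by a small rotation error.

    Assumes the rotation axis is perpendicular to the line of sight, which is
    the worst case for a given angular error.
    """
    return distance_mm * math.tan(math.radians(angle_error_deg))

# Values taken from the text: 0.05 degrees of error, object 70 cm away.
err = position_error_mm(0.05, 700.0)
print(f"{err:.2f} mm")  # about 0.61 mm, already above a 0.5 mm tolerance
```

The error grows linearly with the viewing distance in this small-angle regime, which is why a seemingly negligible calibration error matters at typical camera ranges.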
Related works
Finding the covariance of X is challenging for several reasons. First, as A, B and X represent rigid-body transformations, they live in SE(3), a subset of the space of 4 × 4 matrices endowed with a non-trivial Lie group structure [8]. Second, how to represent and calculate uncertainties in SE(3) is by itself a complex issue, which has prompted advanced mathematical developments [9]. Finally, merely solving for X in AX = XB is already a difficult problem [3, 4, 5], let alone evaluating the uncertainty of the solution.
There are a number of works dealing with the uncertainty of hand-eye calibration. In [10], based on a sensitivity analysis of closed-form solutions, some critical factors and criteria influencing the accuracy of the result are analyzed. For instance, one may try to maximize the angle between rotation axes of relative movement to reduce the influence on error in rotation, or to minimize the distance between the optical center of the camera and the calibration pattern to reduce the influence on error in translation. Based on this analysis, Shi et al. [11] present an algorithm to select movement pairs automatically from a series of measurements to reduce the error of the estimate. Schmidt et al. also introduce a similar approach based on a vector quantization method [12]. In [13], Aron et al. present an error estimation method for the rotation part of X based on an Euler angles parameterization. The authors do not discuss how that error propagates to the translation part of X, and their vision tracking measurements are assumed to be noise-free. More fundamentally, the Euler angles formulation, as opposed to the SO(3) formulation, is well-known to involve singularities.
The idea of estimating explicitly uncertainties in the system is by no means new. Many have studied the problem of uncertainty in the camera model (intrinsic and extrinsic parameters) [14] and the propagation of uncertainties through the camera model [9]. However, we stress that this work is different in that it focuses on the hand-eye transformation and its uncertainty.
Contribution and organization of the paper
It can be noted that none of the aforementioned works has provided a derivation of the covariance of X, which is ultimately the most generic and relevant quantification of the uncertainty of the hand-eye calibration process. The goal of this paper is to rigorously work out such a derivation. Specifically, we transpose methods for forward and backward propagation of covariance [14] into the framework of uncertainty in SE(3) proposed by Barfoot and Furgale [9]. The structure of the hand-eye calibration equation AX = XB raises specific technical difficulties, which we shall address in detail.
The remainder of the paper is organized as follows. In Section II, we state the hand-eye calibration problem and introduce the mathematical background of the work, which includes the representation of uncertainty in SE(3), and methods for forward and backward propagation of covariance. In Section III, we present our method to estimate the rotation and translation parts of the hand-eye transformation matrix and their associated covariance matrices. In Section IV, we show that the method can indeed predict these covariances with excellent precision on synthetic and real calibration datasets, and use this information to compute the covariance of the object pose estimation in a real setting. Finally, in Section V, we conclude by discussing the advantages and drawbacks of our approach and sketch some future research directions.
II Background
II-A Formulation of the hand-eye calibration problem
The classical hand-eye calibration method consists in looking at a fixed pattern from two different viewpoints, say 1 and 2, giving rise to the following equation
[TABLE]
where
- is the transformation of the end-effector with respect to the fixed robot base at configuration ;
- is the constant transformation of the camera with respect to the end-effector;
- is the transformation of the pattern (object) with respect to the camera at configuration ;
- is the constant transformation of the pattern with respect to the robot base (see Fig. 1).
Next, one can transform the above equation into
[TABLE]
which has the form of AX = XB, where X is the unknown hand-eye transformation, and A and B can be computed from, respectively, the robot kinematics and the pattern pose estimation [3]. Next, if the fixed pattern is viewed from a large number of viewpoints, one can collect many different A_i’s and B_i’s. Suppose that we have a set of measurements (A_i, B_i). Since in practice these measurements are perturbed by actuator/sensor noise, an exact solution for the set of equations A_i X = X B_i will not exist. Instead, the problem is commonly framed as an optimization problem in which X is found as the transformation that “best” fits the equalities.
Note that sometimes the camera may not be mounted on the end-effector but on a fixed stand. In this case, finding the relative transformation between the camera and the robot base can also be formulated as an AX = XB problem and can be treated by the same method.
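To make the construction of A and B concrete, the following sketch builds one noise-free pair from two viewpoints and checks that it satisfies AX = XB. The variable names (`T_base_ee`, `T_cam_pattern`, `T_base_pattern`) and the helper `make_pose` are hypothetical and only follow the verbal definitions given above; the equations of the paper itself are not reproduced here.

```python
import numpy as np

def hat(v):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def exp_so3(v):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3) + hat(v)
    K = hat(v / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def make_pose(rotvec, t):
    """Assemble a 4x4 homogeneous transformation."""
    T = np.eye(4)
    T[:3, :3] = exp_so3(rotvec)
    T[:3, 3] = t
    return T

rng = np.random.default_rng(0)
X = make_pose(rng.normal(size=3), rng.normal(size=3))               # camera w.r.t. end-effector
T_base_pattern = make_pose(rng.normal(size=3), rng.normal(size=3))  # fixed pattern w.r.t. base

def camera_to_pattern(T_base_ee):
    # Closure of the kinematic chain: T_base_ee @ X @ T_cam_pattern = T_base_pattern
    return np.linalg.inv(T_base_ee @ X) @ T_base_pattern

# Two end-effector configurations and the corresponding pattern poses in the camera
T_base_ee_1 = make_pose(rng.normal(size=3), rng.normal(size=3))
T_base_ee_2 = make_pose(rng.normal(size=3), rng.normal(size=3))
T_cam_pattern_1 = camera_to_pattern(T_base_ee_1)
T_cam_pattern_2 = camera_to_pattern(T_base_ee_2)

# Relative motions: A from robot kinematics, B from pattern pose estimation
A = np.linalg.inv(T_base_ee_2) @ T_base_ee_1
B = T_cam_pattern_2 @ np.linalg.inv(T_cam_pattern_1)

residual = np.linalg.norm(A @ X - X @ B)
print(residual)  # numerically zero for noise-free data
```

With noisy measurements the residual is no longer zero for any single X, which is what motivates the least-squares formulation.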
II-B Representation of rigid-body transformations and of their uncertainties
We choose to represent rigid-body transformations as elements of the Special Euclidean group SE(3) [8]. To model the uncertainty on SE(3), we adopt the framework proposed in [9]. As there is in general no bi-invariant distance on SE(3) [15], solving for the rotation and translation components of X simultaneously would in any case require an arbitrary rotation/translation weighting. Instead, we choose to solve them separately, which entails a number of simplifications [3]. As a consequence, the uncertainties of the rotation and the translation parts are also modeled separately.
Specifically, we assume that the rotation parts of the observations A_i and B_i are corrupted by Gaussian noise as follows
[TABLE]
where are the means of , and are zero-mean Gaussian perturbations with covariance matrices , respectively.
The translation parts of A_i and B_i are corrupted as follows
[TABLE]
where are the means of , and are zero-mean Gaussian perturbations with covariance matrices , respectively.
Note that the above assumptions imply that rotation and translation noises are independent.
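As an illustration of such a decoupled noise model, the sketch below draws one noisy observation from a mean pose. It assumes the left-multiplicative convention of [9] for rotations (noisy R = exp of the perturbation times the mean) and an additive perturbation for translations; since the equations above are not reproduced here, this convention and the name `perturb_pose` are assumptions.

```python
import numpy as np

def hat(v):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def exp_so3(v):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3) + hat(v)
    K = hat(v / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def perturb_pose(R_mean, t_mean, Sigma_R, Sigma_t, rng):
    """Sample a noisy (R, t) with independent rotation and translation noise."""
    xi = rng.multivariate_normal(np.zeros(3), Sigma_R)     # rotation perturbation
    delta = rng.multivariate_normal(np.zeros(3), Sigma_t)  # translation perturbation
    return exp_so3(xi) @ R_mean, t_mean + delta

rng = np.random.default_rng(1)
R_mean = exp_so3(np.array([0.2, -0.1, 0.4]))
t_mean = np.array([0.1, 0.0, 0.5])
Sigma_R = 1e-6 * np.eye(3)  # illustrative values only
Sigma_t = 1e-6 * np.eye(3)
R_noisy, t_noisy = perturb_pose(R_mean, t_mean, Sigma_R, Sigma_t, rng)
```

Because the rotation noise is applied through the exponential map, the perturbed matrix remains a valid rotation regardless of the noise magnitude.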
II-C Forward and backward propagation of covariance
Forward propagation. Let x be a random vector in R^N with mean x̄ and covariance matrix Σ_x. Consider a function f from R^N to R^M that is differentiable in a neighbourhood of x̄. Then, at the first order of approximation, f(x) is a random variable with mean f(x̄) and covariance matrix
[TABLE]
where J is the Jacobian matrix of f at x̄.
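The first-order rule above is the standard one, covariance of f(x) ≈ J Σ_x Jᵀ. The sketch below applies it to a toy function (polar to Cartesian coordinates, chosen only for illustration) and compares the prediction with a Monte-Carlo estimate. The helper names and the finite-difference Jacobian are assumptions, not part of the paper.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(f(x))
    J = np.zeros((y0.size, x.size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        J[:, k] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2 * eps)
    return J

def propagate_forward(f, x_mean, Sigma_x):
    """First-order mean and covariance of f(x): f(x_mean), J Sigma_x J^T."""
    J = numerical_jacobian(f, x_mean)
    return f(x_mean), J @ Sigma_x @ J.T

def polar_to_cartesian(x):
    r, theta = x
    return np.array([r * np.cos(theta), r * np.sin(theta)])

x_mean = np.array([2.0, 0.5])
Sigma_x = np.diag([0.01**2, 0.01**2])
y_mean, Sigma_y = propagate_forward(polar_to_cartesian, x_mean, Sigma_x)

# Monte-Carlo reference
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(x_mean, Sigma_x, size=100_000)
ys = np.column_stack([samples[:, 0] * np.cos(samples[:, 1]),
                      samples[:, 0] * np.sin(samples[:, 1])])
Sigma_mc = np.cov(ys, rowvar=False)
rel_err = np.linalg.norm(Sigma_y - Sigma_mc) / np.linalg.norm(Sigma_mc)
print(rel_err)  # small when the noise is small relative to the curvature of f
```

The agreement degrades as the input noise grows, since the linearization of f then becomes less accurate.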
Backward propagation. Assume now that x (the parameter) is unknown, but that y (the measurement) is known and determined to be a random variable with mean ȳ and covariance matrix Σ_y. Then the best estimate x̂ for x is given by
[TABLE]
To estimate the covariance of x̂, one can approximate f by an affine function, which yields
[TABLE]
Using the weighted pseudo-inverse, one has
[TABLE]
From (7), the covariance of x̂ can now be approximated at the first order by
[TABLE]
In practice, when performing an iterative least-squares optimization, one can use (10) at the last iteration to obtain an estimate of the covariance of x̂.
Note that the quality of the approximations given by Equations (7) and (10) depends in particular on the quality of the linear approximation of f.
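A minimal sketch of backward propagation inside an iterative least-squares loop is given below. It uses the standard covariance-weighted normal equations and takes the covariance of the estimate as the inverse of Jᵀ Σ_y⁻¹ J at the last iteration. The toy model is a straight-line fit, chosen so that convergence is immediate; the function `gauss_newton` and all variable names are assumptions for illustration.

```python
import numpy as np

def gauss_newton(f, jac, y, Sigma_y, x0, n_iter=10):
    """Covariance-weighted Gauss-Newton; returns the estimate and its covariance."""
    W = np.linalg.inv(Sigma_y)              # measurement weights
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        J = jac(x)
        normal_matrix = J.T @ W @ J
        # Normal equation: (J^T W J) dx = J^T W (y - f(x))
        x = x + np.linalg.solve(normal_matrix, J.T @ W @ (y - f(x)))
    J = jac(x)
    Sigma_x = np.linalg.inv(J.T @ W @ J)    # backward propagation at the last iterate
    return x, Sigma_x

# Toy model: y_k = x0 + x1 * t_k (linear, so one step already converges)
t = np.linspace(0.0, 1.0, 50)
design = np.column_stack([np.ones_like(t), t])
f = lambda x: design @ x
jac = lambda x: design

x_true = np.array([1.0, -2.0])
sigma = 0.05
rng = np.random.default_rng(0)
y = f(x_true) + sigma * rng.normal(size=t.size)
Sigma_y = sigma**2 * np.eye(t.size)

x_hat, Sigma_x = gauss_newton(f, jac, y, Sigma_y, x0=np.zeros(2))
print(x_hat, np.sqrt(np.diag(Sigma_x)))
```

For a nonlinear f the same code applies unchanged, but the returned covariance is then only as good as the linearization at the final iterate.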
III Derivation of the covariance of X
Equation AX = XB can be decomposed as
[TABLE]
where R_X, t_X denote respectively the rotation and translation parts of X.
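For reference, the decomposition follows directly from writing each homogeneous matrix in block form. The notation below (R and t with the matrix name as subscript) is an assumption made for this sketch; the rotation equation corresponds to what the text calls (11) and the translation equation to (12).

```latex
% Block form of a rigid-body transformation (assumed notation)
A = \begin{bmatrix} R_A & t_A \\ 0 & 1 \end{bmatrix}, \quad
X = \begin{bmatrix} R_X & t_X \\ 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} R_B & t_B \\ 0 & 1 \end{bmatrix}

% Multiplying out AX = XB block by block gives
\begin{align}
  R_A R_X &= R_X R_B, \\
  R_A t_X + t_A &= R_X t_B + t_X .
\end{align}
```

The rotation equation does not involve the translations, which is what allows the rotation part to be solved first and its result to be reused when solving for the translation.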
III-A Covariance of the rotation part of X
We first consider the rotation part of X. Let α_i, β_i denote the logarithms of R_{A_i} and R_{B_i} respectively, i.e.
[TABLE]
Note that the covariance matrices of and can be obtained by applying the forward propagation of covariance
[TABLE]
where denotes the (left) Jacobian of at , see [9] for more details.
Next, via logarithm mapping, equation (11) can be written as
[TABLE]
Applying the rule for and , one has
[TABLE]
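The logarithm step can be checked numerically. Assuming the rule in question is the usual identity log(R M Rᵀ) = R log(M) Rᵀ for rotation matrices, the rotation equation reduces to α = R_X β, with α and β the rotation vectors of the rotation parts of A and B (the symbols α, β and R_X are assumed notation, following [3]). The sketch below verifies this on one example.

```python
import numpy as np

def hat(v):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def exp_so3(v):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3) + hat(v)
    K = hat(v / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def log_so3(R):
    """Rotation matrix -> rotation vector (valid away from a rotation angle of pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-9:
        return 0.5 * w
    return theta / (2.0 * np.sin(theta)) * w

R_X = exp_so3(np.array([1.0, 0.2, -0.4]))
beta = np.array([0.3, -0.5, 0.8])   # rotation vector of R_B
R_B = exp_so3(beta)
R_A = R_X @ R_B @ R_X.T             # enforces R_A R_X = R_X R_B

alpha = log_so3(R_A)
print(alpha, R_X @ beta)            # the two vectors coincide
```

In this form the unknown rotation acts linearly on the measured vectors, which is what makes the subsequent least-squares treatment tractable.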
In order to use the uncertainty model in , we define a random variable that represents the difference between and the current estimate by
[TABLE]
Next, to apply the backward propagation of covariance, one needs the measurement vectors and to appear on the same side of the equation. To achieve this without making it too complex, we use a trick from [14], which consists in “copying” the ’s on both sides, as follows
[TABLE]
Now, the measurement vector is given by , where , and the parameter vector is given by .
Since the noise of ’s and ’s are independent ( is caused by robot kinematics while is caused by object pose estimation in the camera frame), the covariance matrix of the measurement vector is given by
[TABLE]
with .
Now, the covariance-weighted minimization is given by
[TABLE]
This minimization problem can be solved by iteratively updating the estimate of the parameter vector by the rules
[TABLE]
where at each step the update vector is found by solving the normal equation
[TABLE]
The Jacobian of has the form
[TABLE]
[TABLE]
The set of equations (22) may now be written in block form as
[TABLE]
To simplify the left-hand side of (36), let
[TABLE]
As for the right-hand side of (36), let
[TABLE]
To solve equations (36), one can left-multiply both sides by \left[\begin{array}[]{cc}\mathbb{I}&\bm{W}\bm{Z}^{-1}\\ \bf 0&\mathbb{I}\end{array}\right], which yields
[TABLE]
The above equations can now be solved to find the updating vectors and .
Applying backward propagation of covariance, a first-order approximation of the covariance of is given by the following matrix, taken at the last iteration,
[TABLE]
The covariance of is given by the top-left block of , that is:
[TABLE]
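Extracting the top-left block of the inverse of a block matrix does not require inverting the full matrix: it equals the inverse of a Schur complement. The sketch below checks this identity on a random symmetric positive-definite matrix partitioned as [[U, W], [Wᵀ, Z]]. The block name U and the sign conventions are assumptions; the paper's own blocks may be defined with different signs, so this only illustrates the general mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(9, 9))
M = G @ G.T + 9.0 * np.eye(9)   # random symmetric positive-definite matrix

# Partition: the first 3 parameters play the role of the quantity of interest,
# the remaining 6 the role of the auxiliary ("copied") parameters.
U, W, Z = M[:3, :3], M[:3, 3:], M[3:, 3:]

top_left_direct = np.linalg.inv(M)[:3, :3]
top_left_schur = np.linalg.inv(U - W @ np.linalg.solve(Z, W.T))

print(np.max(np.abs(top_left_direct - top_left_schur)))  # numerically zero
```

When Z is block-diagonal, as is typical when each auxiliary parameter is tied to a single measurement, the solve against Z is cheap and the cost is dominated by the small 3 × 3 inversion.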
III-B Covariance of the translation part of X
We now consider the translation part of X. Let . Equation (12) can be written as
[TABLE]
Note that the covariance matrices of can be approximated by applying the forward propagation of covariance
[TABLE]
where is the optimal rotation found in the previous section, and is the corresponding covariance.
Applying the same trick as previously, we “copy” the ’s on both sides of the equation, as follows
[TABLE]
Now the measurement vector is given by , where , and the parameter vector is given by .
Since computing the cross-covariance of and would be too complex, we simply assume them to be independent. The covariance matrix of the measurement vector is then given by
[TABLE]
where .
Now, the covariance-weighted minimization is given by
[TABLE]
We solve this by iteratively updating the estimate of the parameter vector by the rules
[TABLE]
where at each step the update vector is found by solving the normal equation
[TABLE]
The Jacobian matrix has the form
[TABLE]
For the rest of the derivation, we follow the same procedure as in the previous section. One can thus obtain the update vectors from
[TABLE]
[TABLE]
At the last iteration, a first-order approximation of the covariance matrix of is given by
[TABLE]
IV Experiments
We now validate the proposed method by comparing the covariance predicted by the method with that obtained from Monte-Carlo simulations, using synthetic and real calibration data. Using the covariance of X, we are then in a position to compute the covariance of the object pose estimation in a real setting. Our implementation is open-source and is available at https://github.com/dinhhuy2109/python-cope.
IV-A Synthetic calibration data
To generate synthetic data, we start by selecting a random transformation matrix X, which serves as the true hand-eye transformation. We then generate datasets, each dataset comprising corrupted pairs (A_i, B_i). Each corrupted pair is generated as follows. First, we generate a random uncorrupted pair, which satisfies AX = XB exactly. Next, we add noise to A and B as explained in Section II-B. The covariance matrices of the noise are chosen arbitrarily as
[TABLE]
where is a scaling parameter that allows us to change the magnitude of the uncertainties.
At each noise level , we evaluate the covariance of X following two methods
- Our method: For some dataset , we compute the covariance matrices using the proposed method (PRedicted). In fact, the are nearly identical across the datasets, so the particular value of did not matter;
- Monte-Carlo: For each dataset , we find the rotation and translation , that optimally fit the hand-eye equations following the proposed method. We then compute the covariance matrices , by the Monte-Carlo method across the datasets as
[TABLE]
where , . (Footnote 1: The operator ∧ turns a vector into a member of the Lie algebra, see equation (13); we use ∨ as the inverse operation of ∧.)
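A Monte-Carlo covariance of this kind can be sketched as follows: rotation deviations are mapped to vectors through the logarithm (followed by the inverse of the hat operator), and the sample second moments are averaged over the datasets. The choice of reference values `R_ref` and `t_ref`, the function name, and the 1/M normalization are assumptions, since the paper's exact formula is not reproduced here.

```python
import numpy as np

def hat(v):
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def exp_so3(v):
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3) + hat(v)
    K = hat(v / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def log_so3(R):
    """Rotation matrix -> rotation vector (log followed by the vee operator)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-9:
        return 0.5 * w
    return theta / (2.0 * np.sin(theta)) * w

def monte_carlo_covariances(R_list, t_list, R_ref, t_ref):
    """Empirical rotation and translation covariances about reference values."""
    xis = np.array([log_so3(R @ R_ref.T) for R in R_list])  # rotation deviations
    dts = np.asarray(t_list) - t_ref                         # translation deviations
    return xis.T @ xis / len(xis), dts.T @ dts / len(dts)

# Synthetic check: draw estimates with a known covariance and recover it
rng = np.random.default_rng(0)
M = 5000
Sigma_true = 1e-4 * np.diag([1.0, 2.0, 0.5])
R_ref = exp_so3(np.array([0.3, 0.1, -0.2]))
t_ref = np.array([0.05, 0.0, 0.1])
R_list = [exp_so3(xi) @ R_ref for xi in rng.multivariate_normal(np.zeros(3), Sigma_true, size=M)]
t_list = t_ref + rng.multivariate_normal(np.zeros(3), Sigma_true, size=M)

Sigma_R_mc, Sigma_t_mc = monte_carlo_covariances(R_list, t_list, R_ref, t_ref)
err_R = np.linalg.norm(Sigma_R_mc - Sigma_true) / np.linalg.norm(Sigma_true)
err_t = np.linalg.norm(Sigma_t_mc - Sigma_true) / np.linalg.norm(Sigma_true)
```

The sampling error of such an estimate decreases only as the inverse square root of the number of datasets, which is one reason a closed-form prediction is attractive.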
Fig. 2 shows projections of the covariance ellipsoids on pairs of axes for the Monte-Carlo method and our method, when . It can be noted that the proposed method provides an excellent estimation of the covariance of the hand-eye transformation.
To gauge the performance at different noise levels, we use the following metrics
[TABLE]
Fig. 3 shows that our algorithm can cope well with increasing magnitudes of the measurement uncertainty. The estimation errors remain low overall, and increase slightly with the magnitude of the noise, since larger noise levels increase the number of local minima at each iteration. Note also that the errors in the covariances of the translation parts tend to be larger than those of the rotation parts. This is because, in our method, the errors in the estimation of the rotation propagate to that of the translation. Regarding the computation cost, our method is naturally several orders of magnitude faster than the Monte-Carlo method.
It is also worth noting that the closed-form solution in [3] always yields a slightly higher covariance than our method. This is because our method optimally minimizes the error of the estimated transformation by taking the measurement noise into account.
IV-B Real calibration data
We now validate the proposed method on actual calibration data obtained from our robot system, which consists of a 3D camera mounted on a 6-DOF industrial manipulator, as shown in Fig. 1.
IV-B1 Covariances of A and B in the actual system
We first need to empirically estimate the covariances of the A_i’s and B_i’s in our system, so that we can give them as inputs to our method.
As the industrial manipulator has a very high precision (0.2 mm of repeatability), we assume that the noise on the A_i’s is negligible.
Regarding the ’s, the are assumed to have the same noise distributions: . We experimentally collect 500 pairs of and from our system. Next, we generate M = 400 datasets, each dataset comprising pairs randomly selected from the collected pairs.
The rotation and translation errors of ’s are then computed as
[TABLE]
where the ground truth is .
Since the true transformation X is unknown in the real system, we use
[TABLE]
as the ground truth. Note that estimating X using our method would require knowledge of the noise on the A_i’s and B_i’s; we therefore use the method of [3] instead.
After obtaining the rotation and translation errors of ’s, the empirical covariance matrices of can be estimated similarly to the equations (79,80).
IV-B2 Validation
To validate our method, we collect another 500 pairs of and from our system. We constrain the robot motion so that it covers the same area as that used for determining the noise on . We then generate datasets, each dataset comprising pairs randomly selected from the collected pairs. The covariance matrices are computed from these datasets using the Monte-Carlo method and our method, in the same manner as previously.
Fig. 4 provides projections of the covariance ellipsoids on pairs of axes for the two methods. We see that the proposed method delivers a good estimation of the covariances. To the best of our knowledge, there is no other method for estimating the uncertainty of the hand-eye transformation. Moreover, the proposed method is relatively easy to replicate and use in practical applications.
IV-C Covariance of the object pose estimation
Using the covariance of X previously obtained, we are now in a position to predict the covariance of the object pose estimation, which is our ultimate goal. Here, we demonstrate the propagation of uncertainties to the object pose estimation using the same robot system as previously (see Fig. 1).
Recall that the constant transformation of the pattern (object) with respect to the robot base is given by
[TABLE]
The covariances of and can be estimated using the procedure proposed in Section IV-B. Thus, to predict the mean and the covariance of , one needs now to estimate the covariances of .
Suppose that have the same noise distribution, i.e. . We begin by experimentally collecting 500 pairs of from our system. The rotation and translation errors of ’s are then computed similarly to (82, 83), where the ground truth is . As discussed earlier, we will use (84, 85) instead of the true value of X. Regarding , one can transform (86) into
[TABLE]
which has the form of , where and . Hence, can also be computed in the same manner as computing .
We now collect 500 pairs of from our system. We then generate datasets, each dataset comprising pairs computed from pairs randomly selected from the 500 collected pairs.
Next, we evaluate the covariance of following two methods
- Our method: For some dataset , we compute the covariance matrices using the propagation method described in Appendix A (PRedicted). In fact, the are nearly identical across the datasets, so the particular value of did not matter;
- Monte-Carlo: For each dataset , we compute , where and are randomly selected from the collected pairs; the rotation and translation of together with their covariances are computed using our method as proposed in Section III (see equations (49, 74)). Next, the covariances are computed by the Monte-Carlo method as and where , .
Fig. 5 shows the one-standard-deviation covariance ellipsoids for the two methods. One can see that our prediction matches the covariances estimated by the Monte-Carlo method very well.
In absolute values, the covariance of the hand-eye calibration compounds with that of the object pose estimation in the camera frame, resulting in a relatively large overall covariance for the object pose estimation in the robot frame, around 1 cm in standard deviation. This again emphasizes the need for access to the covariance of the hand-eye transformation. This fine-grained information tells us how confident we can be regarding the object pose estimation and shall also enable us to design new perception algorithms and methods for reaching higher precision, by e.g. visuo-tactile sensor fusion.
V Conclusion
In this paper, we have presented a rigorous derivation of the covariance of the solution X of AX = XB, when A and B are randomly perturbed matrices. Our approach consists in transposing methods for forward and backward propagation of covariance into the framework of uncertainty in SE(3). Experiments involving synthetic and real calibration data show that our approach can predict the covariance of the hand-eye transformation with excellent precision.
While these estimates could also be provided by Monte-Carlo simulations, such a method would require collecting a large number of samples, which is not practical. Furthermore, the Monte-Carlo method yields no insights into how the uncertainties on the measurements of A and B propagate to the uncertainty of the hand-eye transformation. By contrast, in our method, by analyzing critical factors influencing the covariance of X, for instance, based on the formulae (49) and (74), one may be able to refine the calibration process to achieve a higher precision, by e.g. determining the appropriate number of sample viewpoints or choosing their optimal distribution, which is the object of our future research.
Acknowledgment
This work was partially supported by NTUitive Gap Fund NGF-2016-01-028.
Appendix A Propagating uncertainties when rotation and translation are decoupled
In this appendix, we present our extension of the covariance propagation method of [9] to the case where rotation and translation are decoupled.
Consider two noisy poses and , whose nominal values and associated uncertainties are and respectively.
Let be the compounded pose, we have
[TABLE]
Similar to [9] (Section III), the covariance matrix of the rotation can be estimated by:
[TABLE]
with .
Regarding the translation vector, its covariance matrix can be estimated simply by using the forward propagation method of Section II-C:
[TABLE]
In summary, to compound two poses, we propagate the means using (88,89) and the covariances using (90,95).
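The compounding step can be sketched at first order as follows. This is a simplified illustration and not the paper's formulae: it assumes left-multiplicative rotation noise as in [9], additive translation noise, mutual independence of all four noise terms, and keeps only first-order terms (the method of [9] also provides higher-order corrections). The function name `compound` and all variable names are hypothetical.

```python
import numpy as np

def hat(v):
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def exp_so3(v):
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3) + hat(v)
    K = hat(v / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def compound(R1, t1, S_R1, S_t1, R2, t2, S_R2, S_t2):
    """First-order compounding of two poses with decoupled uncertainties."""
    R = R1 @ R2                       # mean rotation
    t = R1 @ t2 + t1                  # mean translation
    # Rotation: exp(xi1^) R1 exp(xi2^) R2 ~ exp((xi1 + R1 xi2)^) R1 R2
    S_R = S_R1 + R1 @ S_R2 @ R1.T
    # Translation: d t ~ -(R1 t2)^ xi1 + R1 d t2 + d t1
    L = hat(R1 @ t2)
    S_t = L @ S_R1 @ L.T + R1 @ S_t2 @ R1.T + S_t1
    return R, t, S_R, S_t

# Usage: a 90-degree rotation about z, second pose 1 m along x, rotation noise only
R1 = exp_so3(np.array([0.0, 0.0, np.pi / 2]))
t1 = np.zeros(3)
R2 = np.eye(3)
t2 = np.array([1.0, 0.0, 0.0])
S_R1 = 1e-4 * np.eye(3)
zero = np.zeros((3, 3))
R12, t12, Sigma_R12, Sigma_t12 = compound(R1, t1, S_R1, zero, R2, t2, zero, zero)
print(np.round(Sigma_t12, 8))  # rotation noise leaks into translation, except along the lever arm
```

The usage example shows the lever-arm effect directly: a rotation uncertainty on the first pose produces a translation uncertainty on the compound pose in the two directions orthogonal to the rotated offset, and none along it.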
- [1] Y. C. Shiu and S. Ahmad, “Calibration of wrist-mounted robotic sensors by solving homogeneous transform equations of the form AX = XB,” IEEE Transactions on Robotics and Automation, vol. 5, no. 1, 1989.
- [2] C. C. Wang, “Extrinsic calibration of a vision sensor mounted on a robot,” IEEE Transactions on Robotics and Automation, vol. 8, no. 2, pp. 161–175, 1992.
- [3] F. C. Park and B. J. Martin, “Robot sensor calibration: solving AX = XB on the Euclidean group,” IEEE Transactions on Robotics and Automation, vol. 10, no. 5, 1994.
- [4] R. Horaud and F. Dornaika, “Hand-eye calibration,” The International Journal of Robotics Research, vol. 14, no. 3, pp. 195–210, 1995.
- [5] K. H. Strobl and G. Hirzinger, “Optimal hand-eye calibration,” in Robotics: Science and Systems, 2006, pp. 4647–4653.
- [6] M. K. Ackerman and G. S. Chirikjian, “A probabilistic solution to the AX = XB problem: Sensor calibration without correspondence,” in Geometric Science of Information. Springer, 2013, pp. 693–701.
- [7] A. Petrovskaya and O. Khatib, “Global localization of objects via touch,” IEEE Transactions on Robotics, vol. 27, no. 3, pp. 569–585, 2011.
- [8] R. M. Murray, Z. Li, and S. S. Sastry, A Mathematical Introduction to Robotic Manipulation. CRC Press, 1994.
