An Equivariant Observer Design for Visual Localisation and Mapping

Pieter van Goor; Robert Mahony; Tarek Hamel; Jochen Trumpf

arXiv:1904.02452·cs.RO·June 16, 2020


TL;DR

This paper introduces a novel equivariant observer framework for visual SLAM that unifies pose and landmark estimation, ensuring stability and robustness in non-linear visual localization tasks.

Contribution

It formulates visual SLAM as a continuous-time equivariant observer on a symmetry group, providing a new stability-guaranteed approach for pose and landmark estimation.

Findings

1. The observer error system is almost globally asymptotically stable.

2. The error system is exponentially stable in the large.

3. Decoupled Riccati gains for each landmark keep the computational complexity at $\mathcal{O}(n)$.

Abstract

This paper builds on recent work on Simultaneous Localisation and Mapping (SLAM) in the non-linear observer community, by framing the visual localisation and mapping problem as a continuous-time equivariant observer design problem on the symmetry group of a kinematic system. The state-space is a quotient of the robot pose expressed on SE(3) and multiple copies of real projective space, used to represent both points in space and bearings in a single unified framework. An observer with decoupled Riccati-gains for each landmark is derived and we show that its error system is almost globally asymptotically stable and exponentially stable in-the-large.

Equations

$$\Omega^{\times} := \begin{pmatrix} 0 & -\Omega_{3} & \Omega_{2} \\ \Omega_{3} & 0 & -\Omega_{1} \\ -\Omega_{2} & \Omega_{1} & 0 \end{pmatrix} \in \mathfrak{so}(3).$$

$$P = \begin{pmatrix} R_{P} & x_{P} \\ 0 & 1 \end{pmatrix}.$$

$$U = \begin{pmatrix} \Omega_{U} & V_{U} \\ 0 & 0 \end{pmatrix}.$$

$$\Pi_{y} := I_{3} - \frac{y y^{\top}}{|y|^{2}}.$$

$$\Pi_{y} = -\frac{y^{\times} y^{\times}}{|y|^{2}},$$

$$\overline{\Pi}_{\bar{y}} := I_{4} - \frac{\bar{y}\bar{y}^{\top}}{|\bar{y}|^{2}}.$$

$$[x] := \{a x \mid a \in \mathbb{R}\setminus\{0\}\}.$$

$$A[x] := [Ax]$$

$$\overline{\Pi}_{[x]} := \overline{\Pi}_{x}.$$

$$\overline{\Pi}_{ax} = I_{4} - \frac{(ax)(ax)^{\top}}{|ax|^{2}} = I_{4} - \frac{a^{2}}{a^{2}}\frac{x x^{\top}}{|x|^{2}} = \overline{\Pi}_{x}.$$

$$\overline{p} := \begin{pmatrix} p \\ 1 \end{pmatrix}$$

$$\mathring{\overline{b}} := \begin{pmatrix} b \\ 0 \end{pmatrix}$$

$$\alpha(p), \qquad \alpha(b), \qquad \gamma([x])$$

$$\beta(x) := \begin{cases} x \in \mathbb{R}^{3}, & \text{if } x \in \mathbb{R}^{3}, \\ [x] \in \mathbb{R}\mathbb{P}^{2}, & \text{if } x \in \mathrm{S}^{2}. \end{cases}$$

$$\mathbb{R}^{3} \sqcup \mathrm{S}^{2}$$

$$\mathcal{T}_{n}(3) = \mathbf{SE}(3) \times \mathbb{R}\mathbb{P}^{3} \times \dots \times \mathbb{R}\mathbb{P}^{3},$$

$$(P, \eta_{1}, ..., \eta_{n}).$$

$$\lfloor P, \eta_{i} \rfloor := \{(S^{-1}P, S^{-1}\eta_{i}) \mid S \in \mathbf{SE}(3)\}.$$

$$\mathcal{M}_{n}(3) = \{\lfloor P, \eta_{i} \rfloor \mid (P, \eta_{i}) \in \mathcal{T}_{n}(3)\},$$

$$(S^{-1}P, S^{-1}\eta_{i}) \mapsto (S^{-1}P)^{-1}S^{-1}\eta_{i} = P^{-1}SS^{-1}\eta_{i} = P^{-1}\eta_{i}.$$

$$f : ((P, \eta_{i}), U) \mapsto (PU, 0).$$

$$K^{-1}\begin{pmatrix} K & \mathbf{0}_{3\times 1} \end{pmatrix}\eta^{\prime}_{i}$$

$$\mathcal{N}_{n}(3) := \mathbb{R}\mathbb{P}^{2} \times \dots \times \mathbb{R}\mathbb{P}^{2}.$$

$$h : (P, \eta_{i}) \mapsto \begin{pmatrix} I_{3} & 0 \end{pmatrix} P^{-1}\eta_{i}.$$

$$\mathbf{SOT}(n) = \left\{ \begin{pmatrix} R & 0 \\ 0 & a \end{pmatrix} \;\middle|\; R \in \mathbf{SO}(n),\ a \in \mathbb{R}\setminus\{0\} \right\},$$

$$\begin{pmatrix} R & 0 \\ 0 & a \end{pmatrix}\left[\begin{pmatrix} p \\ 1 \end{pmatrix}\right] = \left[\begin{pmatrix} Rp \\ a \end{pmatrix}\right] = \left[\begin{pmatrix} \tfrac{1}{a}Rp \\ 1 \end{pmatrix}\right], \qquad \begin{pmatrix} R & 0 \\ 0 & a \end{pmatrix}\left[\begin{pmatrix} b \\ 0 \end{pmatrix}\right] = \left[\begin{pmatrix} Rb \\ 0 \end{pmatrix}\right].$$

$$\mathbb{R}\mathbb{P}^{3}_{p}, \qquad \mathbb{R}\mathbb{P}^{3}_{b}$$

Full text

A Geometric Observer Design for Visual Localisation and Mapping

Pieter van Goor

Department of Electrical, Energy and Materials Engineering

Australian National University

ACT, 2601, Australia

Robert Mahony

Department of Electrical, Energy and Materials Engineering

Australian National University

ACT, 2601, Australia

Tarek Hamel

I3S (University Côte d’Azur, CNRS, Sophia Antipolis)

and Institut Universitaire de France

Jochen Trumpf

Department of Electrical, Energy and Materials Engineering

Australian National University

ACT, 2601, Australia


1 Introduction

Simultaneous Localisation and Mapping (SLAM) is a well-known problem in mobile robotics and has been an active area of research for the last 30 years [9]. Visual localisation and mapping refers to the particular case of the SLAM problem where the only exteroceptive sensors available are cameras. The visual localisation and mapping problem, and particularly the case where only a single monocular camera is available, continues to be of substantial interest due to the low cost and low weight, as well as the ubiquity of single camera systems [9]. While visual localisation and mapping is an established research topic with a rich history [7], it remains an active research topic, especially in the area of low-cost light-weight embedded systems [8]. State-of-the-art filters and observers approach the SLAM problem through linearisation, and do not deal well with poor initial estimation or choice of linearisation point [7]. Additionally, these methods suffer from high computational complexity and poor scalability [9, 19].

Both the SLAM and visual localisation and mapping problems have attracted interest recently in the non-linear observer community. Approaches to these problems have emerged from earlier work on attitude estimation [17, 5] and pose estimation [2, 20, 14]. Barrau and Bonnabel [3] exploited a novel Lie group to design an invariant Kalman filter for the SLAM problem. Parallel work by Mahony et al. [18] developed the same Lie group and proposed a quotient manifold structure for the state-space of the SLAM problem. Work by Zlotnik et al. [21] derives a geometrically motivated observer for the SLAM problem that includes estimation of bias in linear and angular velocity inputs. For the visual localisation and mapping problem, where only bearing measurements are available, Lourenco et al. [15, 16] proposed an observer with a globally exponentially stable error system using depths of landmarks as separate components of the observer. Grabe et al. [10] derived a non-linear observer for the case where a significant number of the bearings measured are of coplanar landmarks by using the instantaneous homography constraint. Bjorne et al. [4] use an attitude heading reference system (AHRS) to determine the orientation of the robot, and then solve the SLAM problem using a linear Kalman filter. A similar approach to the visual localisation and mapping case is undertaken in [6]. Hamel et al. have also introduced a Riccati observer [12] for the case where the orientation of the robot is known.

In this paper we present a novel non-linear geometric observer for the visual localisation and mapping problem. The approach extends the SLAM manifold presented in [18] to include bearings (such as magnetometer or gravity measurements) and landmark points in the same formulation by exploiting the structure of the real-projective space $\mathbb{R}\mathbb{P}^{3}$ and homogeneous coordinates for bound and free vectors. The proposed $\mathbb{R}\mathbb{P}^{3}$ state-space also allows modelling of visual features as a simple linear projection of $\mathbb{R}\mathbb{P}^{3}$ onto $\mathbb{R}\mathbb{P}^{2}$ . A novel Lie group termed the $\mathbf{VSLAM}_{n}(3)$ group is introduced and shown to be a symmetry on the measurement function of the visual localisation and mapping problem. The proposed observer uses decoupled gain matrices for each landmark point that satisfy a simple Riccati equation. As a consequence of decoupling the Riccati observer for each landmark, the computational complexity of our approach is only $\mathcal{O}(n)$ . Finally, the innovation on the pose of the robot is determined through finding the minimum of a novel cost function on the tangent space of $\mathbb{R}\mathbb{P}^{3}$ , and is based on the static environment assumption common in SLAM algorithms. The resulting observer is shown to have an error system that is almost globally asymptotically stable (the basin of attraction excludes a set of measure zero) and exponentially stable in-the-large (exponentially stable on any compact set contained in the basin of attraction).

This paper consists of five sections alongside the introduction and conclusion. Section 2 introduces key notation and identities, and provides an in-depth explanation of the application of $\mathbb{R}\mathbb{P}^{3}$ to representing points and bearings in 3D space. In Section 3, we formulate the kinematics, state-space and output of the visual localisation and mapping system, and in Section 4 we introduce the new Lie group $\mathbf{VSLAM}_{n}(3)$ that acts on the state-space. In Section 5 we derive a non-linear observer on the Lie group, and in Section 6 we provide the results of a simulation. The simulation results are designed to verify the theory developed throughout the paper, not to provide a comprehensive evaluation of performance.

2 Preliminaries

2.1 Notation

The special orthogonal group and special Euclidean group are denoted $\mathbf{SO}(3)$ and $\mathbf{SE}(3)$ respectively, with Lie algebras $\mathfrak{so}(3)$ and $\mathfrak{se}(3)$. For any $\Omega=(\Omega_{1},\Omega_{2},\Omega_{3})\in\mathbb{R}^{3}$, the corresponding skew-symmetric matrix is denoted by

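$$\Omega^{\times} := \begin{pmatrix} 0 & -\Omega_{3} & \Omega_{2} \\ \Omega_{3} & 0 & -\Omega_{1} \\ -\Omega_{2} & \Omega_{1} & 0 \end{pmatrix} \in \mathfrak{so}(3).$$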

This matrix has the property that, for any $v\in\mathbb{R}^{3}$ , $\Omega^{\times}v=\Omega\times v$ where $\Omega\times v$ is the vector (cross) product between $\Omega$ and $v$ .

Consider a matrix $P\in\mathbf{SE}(3)$ . The notations $R_{P}\in\mathbf{SO}(3)$ and $x_{P}\in\mathbb{R}^{3}$ are used to represent the rotation and translation components of $P$ respectively, and $P$ may be written as

$$P = \begin{pmatrix} R_{P} & x_{P} \\ 0 & 1 \end{pmatrix}.$$

Likewise, for a matrix $U\in\mathfrak{se}(3)$ , the notations $\Omega_{U}\in\mathfrak{so}(3)$ and $V_{U}\in\mathbb{R}^{3}$ represent the rotational and translational velocity components of $U$ respectively, and $U$ may be written as

$$U = \begin{pmatrix} \Omega_{U} & V_{U} \\ 0 & 0 \end{pmatrix}.$$

For any $y\in\mathbb{R}^{3}\setminus\{0\}$ the projector $\Pi_{y}$ is given by

$$\Pi_{y} := I_{3} - \frac{y y^{\top}}{|y|^{2}}.$$

The operator $\Pi_{y}$ projects vectors onto the subspace of $\mathbb{R}^{3}$ orthogonal to $y$ . The projector and the skew-symmetric matrix are related by

$$\Pi_{y} = -\frac{y^{\times} y^{\times}}{|y|^{2}}, \tag{1}$$

for any $y\in\mathbb{R}^{3}\setminus\{0\}$ . For any $\bar{y}\in\mathbb{R}^{4}\setminus\{0\}$ the projector is similarly defined as

$$\overline{\Pi}_{\bar{y}} := I_{4} - \frac{\bar{y}\bar{y}^{\top}}{|\bar{y}|^{2}}.$$
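The notation of this subsection can be checked numerically. The following is a minimal NumPy sketch, not code from the paper: the helper names `skew` and `projector` are illustrative assumptions, and the random test vectors are arbitrary. It checks that $\Omega^{\times}v=\Omega\times v$, that the projector agrees with $-y^{\times}y^{\times}/|y|^{2}$, and that projected vectors are orthogonal to $y$.

```python
import numpy as np

def skew(omega):
    """Return the 3x3 skew-symmetric matrix with skew(omega) @ v == cross(omega, v)."""
    o1, o2, o3 = omega
    return np.array([[0.0, -o3, o2],
                     [o3, 0.0, -o1],
                     [-o2, o1, 0.0]])

def projector(y):
    """Return I - y y^T / |y|^2 for a non-zero vector y (works in R^3 and R^4)."""
    y = np.asarray(y, dtype=float)
    return np.eye(y.size) - np.outer(y, y) / (y @ y)

# Arbitrary test vectors (hypothetical data, used only for the checks below).
rng = np.random.default_rng(0)
omega, v, y = rng.normal(size=(3, 3))

cross_ok = np.allclose(skew(omega) @ v, np.cross(omega, v))
identity_ok = np.allclose(projector(y), -skew(y) @ skew(y) / (y @ y))
orthogonal_ok = np.isclose(y @ (projector(y) @ v), 0.0)
```

The same `projector` helper covers the $\mathbb{R}^{4}$ case because it only uses the outer product and the squared norm.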

2.2 Real Projective Space

For $x\in\mathbb{R}^{4}\setminus\{0\}$ , define the set of equivalence classes

$$[x] := \{a x \mid a \in \mathbb{R}\setminus\{0\}\}.$$

Given two elements $x,y\in\mathbb{R}^{4}\setminus\{0\}$ , the notation $x\simeq y$ indicates $x=ay$ for some $a\in\mathbb{R}\setminus\{0\}$ . The 3-dimensional real-projective space $\mathbb{R}\mathbb{P}^{3}=\{[x]\ \vline\ x\in\mathbb{R}^{4}\setminus\{0\}\}$ is a smooth quotient manifold [1]. For any full rank matrix $A\in\mathbb{R}^{4\times 4}$ , the operation

$$A[x] := [Ax] \tag{2}$$

is well-defined.

Let $x\in\mathbb{R}^{4}\setminus\{0\}$, and define a horizontal space $H_{x}=\{v\in\mathbb{R}^{4}\mid v^{\top}x=0\}$. Define an equivalence relationship $(x,v)\equiv(ax,av)$ for $a\in\mathbb{R}\setminus\{0\}$ between elements of $H_{x}$ and $H_{ax}$. A tangent vector $v_{[x]}\in T_{[x]}\mathbb{R}\mathbb{P}^{3}$ is the equivalence class $[x,v]=\{(ax,av)\mid a\in\mathbb{R}\setminus\{0\}\}$ of a pair $(x,v)$ with $v\in H_{x}$.

For any $[x]\in\mathbb{R}\mathbb{P}^{3}$ , define the projector

$$\overline{\Pi}_{[x]} := \overline{\Pi}_{x}.$$

To see this is well-defined, let $a\in\mathbb{R}$ be a non-zero scalar, and check

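$$\overline{\Pi}_{ax} = I_{4} - \frac{(ax)(ax)^{\top}}{|ax|^{2}} = I_{4} - \frac{a^{2}}{a^{2}}\frac{x x^{\top}}{|x|^{2}} = \overline{\Pi}_{x}.$$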

Analogously, the projector $\Pi_{[y]}:=\Pi_{y}$ is well-defined for any $[y]\in\mathbb{R}\mathbb{P}^{2}$.
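As a quick illustration of the well-definedness argument, the sketch below (NumPy, with the hypothetical helper name `projector_rp3` and arbitrary test values) evaluates the projector on several representatives $ax$ of the same class and confirms that the result does not depend on $a$. It also shows that applying the projector to any vector produces an element of the horizontal space $H_{x}$.

```python
import numpy as np

def projector_rp3(x):
    """Projector I_4 - x x^T / |x|^2 evaluated on a representative x of [x]."""
    x = np.asarray(x, dtype=float)
    return np.eye(4) - np.outer(x, x) / (x @ x)

# One representative of a class [x] in RP^3 (values chosen arbitrarily).
x = np.array([1.0, -2.0, 0.5, 3.0])

# The projector is identical for every non-zero rescaling of the representative.
same_for_all_scales = all(
    np.allclose(projector_rp3(a * x), projector_rp3(x)) for a in (-4.0, 0.1, 7.5)
)

# Projecting an arbitrary vector yields a horizontal vector, i.e. v^T x = 0.
v = projector_rp3(x) @ np.array([0.3, 1.0, -1.0, 2.0])
horizontal = np.isclose(v @ x, 0.0)
```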

Let $p\in\mathbb{R}^{3}$ be a vector representing the position of a point in space. Define the homogeneous coordinates

$$\overline{p} := \begin{pmatrix} p \\ 1 \end{pmatrix}$$

as an embedding $\mathbb{R}^{3}\hookrightarrow\mathbb{R}^{4}$ and refer to such elements $\overline{p}$ as bound vectors, each with foot at the origin of the reference frame and tip at the point of $\mathbb{R}^{3}$ it represents. Let $b\in\mathrm{S}^{2}=\{b\in\mathbb{R}^{3}\;|\;|b|=1\}$ be a vector representing a bearing or direction and define homogeneous coordinates

$$\mathring{\overline{b}} := \begin{pmatrix} b \\ 0 \end{pmatrix}$$

as an embedding $\mathrm{S}^{2}\hookrightarrow\mathbb{R}^{4}$. We term $\mathring{\overline{b}}$ a free vector. Using these embeddings it is possible to define a map $\alpha:\mathbb{R}^{3}\sqcup\mathrm{S}^{2}\rightarrow\mathbb{R}\mathbb{P}^{3}$

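$$\alpha(p) := \left[\begin{pmatrix} p \\ 1 \end{pmatrix}\right], \qquad \alpha(b) := \left[\begin{pmatrix} b \\ 0 \end{pmatrix}\right].$$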

A point-type element of $\mathbb{R}\mathbb{P}^{3}$ is any element in the subset $\{[x]\ |\ x_{4}\neq 0\}$ . A bearing-type element of $\mathbb{R}\mathbb{P}^{3}$ is any element in the subset $\{[x]\ |\ x_{4}=0\}$ . A full inverse of $\alpha$ is not uniquely defined due to the sign ambiguity of elements of $\mathbb{R}\mathbb{P}^{3}$ . However, it is possible to define a unique map $\gamma:\mathbb{R}\mathbb{P}^{3}\rightarrow\mathbb{R}^{3}\sqcup\mathbb{R}\mathbb{P}^{2}$ by

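$$\gamma([x]) := \begin{cases} x_{1:3}/x_{4} \in \mathbb{R}^{3}, & \text{if } x_{4} \neq 0, \\ [x_{1:3}] \in \mathbb{R}\mathbb{P}^{2}, & \text{if } x_{4} = 0, \end{cases}$$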

where $x_{1:3}\in\mathbb{R}^{3}$ denotes the first three elements of $x$ and $[x_{1:3}]=\{ax_{1:3}\;|\;a\in\mathbb{R}\setminus\{0\}\}$ , analogous to the $\mathbb{R}^{4}$ definition. Define a projection $\beta:\mathbb{R}^{3}\sqcup\mathrm{S}^{2}\to\mathbb{R}^{3}\sqcup\mathbb{R}\mathbb{P}^{2}$ by

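$$\beta(x) := \begin{cases} x \in \mathbb{R}^{3}, & \text{if } x \in \mathbb{R}^{3}, \\ [x] \in \mathbb{R}\mathbb{P}^{2}, & \text{if } x \in \mathrm{S}^{2}. \end{cases}$$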

The following commutative diagram holds

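$$\mathbb{R}^{3}\sqcup\mathrm{S}^{2} \xrightarrow{\ \alpha\ } \mathbb{R}\mathbb{P}^{3} \xrightarrow{\ \gamma\ } \mathbb{R}^{3}\sqcup\mathbb{R}\mathbb{P}^{2}, \qquad \gamma\circ\alpha=\beta.$$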

The map $\gamma$ is smooth under restriction to either point-type elements or bearing-type elements of $\mathbb{R}\mathbb{P}^{3}$ . Although $\gamma$ is unable to reconstruct the full direction vector $b$ from a bearing-type $\mathbb{R}\mathbb{P}^{3}$ element, the unsigned direction $[b]$ is sufficient for the observer construction that we undertake in the sequel.
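The embeddings and the map $\gamma$ can be sketched in a few lines of NumPy. This is an illustration under assumptions rather than the authors' code: the function names are hypothetical, a class $[x]$ is stored as any one representative vector, point-type elements are dehomogenised as $x_{1:3}/x_{4}$, and an unsigned direction is stored as a unit vector whose first non-zero entry is positive so that $b$ and $-b$ give the same array.

```python
import numpy as np

def alpha_point(p):
    """Representative (p, 1) of the point-type element for p in R^3 (bound vector)."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def alpha_bearing(b):
    """Representative (b, 0) of the bearing-type element for b in S^2 (free vector)."""
    return np.append(np.asarray(b, dtype=float), 0.0)

def gamma(x, tol=1e-12):
    """Map a representative of [x] in RP^3 to ('point', p) or ('bearing', unsigned direction)."""
    x = np.asarray(x, dtype=float)
    if abs(x[3]) > tol:
        # Point-type: dividing by the last entry removes the arbitrary scale.
        return "point", x[:3] / x[3]
    d = x[:3] / np.linalg.norm(x[:3])
    # Bearing-type: pick the sign convention so that d and -d map to the same array.
    first_nonzero = d[np.flatnonzero(np.abs(d) > tol)[0]]
    return "bearing", d * np.sign(first_nonzero)

p = np.array([2.0, -1.0, 4.0])
b = np.array([0.0, 0.6, -0.8])

kind_p, p_back = gamma(-3.0 * alpha_point(p))   # any representative recovers p
kind_b, d1 = gamma(alpha_bearing(b))
_, d2 = gamma(alpha_bearing(-b))                # b and -b lie in one class
```

Recovering `p` from a rescaled representative and obtaining the same unsigned direction from `b` and `-b` mirrors the statement that $\gamma\circ\alpha$ agrees with $\beta$.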

3 Problem Formulation

3.1 VSLAM Total Space

The formulation of the total space for the VSLAM problem is an extension of the formulation in [18] to include not only points in 3D space but also bearings through their $\mathbb{R}\mathbb{P}^{3}$ representations.

Raw coordinates for the VSLAM problem can be defined by fixing an arbitrary reference frame $\{0\}$ . Let $P\in\mathbf{SE}(3)$ and $\eta_{i}\in\mathbb{R}\mathbb{P}^{3}$ represent the robot pose and landmark coordinates respectively, defined with respect to $\{0\}$ . Note that each $\eta_{i}\in\mathbb{R}\mathbb{P}^{3}$ is either point-type or bearing-type depending on whether its last entry is zero. The total space of the VSLAM problem is the product space

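$$\mathcal{T}_{n}(3) = \mathbf{SE}(3) \times \mathbb{R}\mathbb{P}^{3} \times \dots \times \mathbb{R}\mathbb{P}^{3},$$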

with elements

$$(P, \eta_{1}, ..., \eta_{n}).$$

The shorthand $(P,\eta_{i})\equiv(P,\eta_{1},...,\eta_{n})$ is used in the sequel.

Given $(P,\eta_{i})\in\mathcal{T}_{n}(3)$ , recalling (2) define

$$\lfloor P, \eta_{i} \rfloor := \{(S^{-1}P, S^{-1}\eta_{i}) \mid S \in \mathbf{SE}(3)\}.$$

Given two elements $(P,\eta_{i}),(Q,\theta_{i})\in\mathcal{T}_{n}(3)$ , the notation $(P,\eta_{i})\simeq(Q,\theta_{i})$ means that $(P,\eta_{i})=(S^{-1}Q,S^{-1}\theta_{i})$ for some $S\in\mathbf{SE}(3)$ . The SLAM manifold is the set

$$\mathcal{M}_{n}(3) = \{\lfloor P, \eta_{i} \rfloor \mid (P, \eta_{i}) \in \mathcal{T}_{n}(3)\},$$

with quotient manifold structure [18].

An expression is well-defined on the SLAM manifold $\mathcal{M}_{n}(3)$ if it is invariant to the action of a rigid-body transformation of the reference frame. An important example is $(P,\eta_{i})\mapsto P^{-1}\eta_{i}$ . Given any $S\in\mathbf{SE}(3)$ , one has

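$$(S^{-1}P, S^{-1}\eta_{i}) \mapsto (S^{-1}P)^{-1}S^{-1}\eta_{i} = P^{-1}SS^{-1}\eta_{i} = P^{-1}\eta_{i}.$$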
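The invariance of $P^{-1}\eta_{i}$ under a change of reference frame can be confirmed numerically. The sketch below is illustrative only: `skew` and `random_se3` are hypothetical helpers, the rotation is generated with Rodrigues' formula, and the landmark is an arbitrary point-type representative.

```python
import numpy as np

def skew(w):
    """3x3 skew-symmetric matrix of w."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def random_se3(rng):
    """Random rigid-body transform as a 4x4 matrix; rotation via Rodrigues' formula."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(-np.pi, np.pi)
    K = skew(axis)
    R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = rng.normal(size=3)
    return T

rng = np.random.default_rng(1)
P, S = random_se3(rng), random_se3(rng)
eta = np.array([1.0, 2.0, -0.5, 1.0])   # representative of a point-type landmark

S_inv = np.linalg.inv(S)
body_original = np.linalg.inv(P) @ eta
body_moved = np.linalg.inv(S_inv @ P) @ (S_inv @ eta)

# Body-frame coordinates are unchanged when the reference frame is moved by S.
invariant = np.allclose(body_original, body_moved)
```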

3.2 VSLAM Kinematics

The assumption will be made that the robot is moving through a static environment. Consider the velocity input space $\mathbb{V}=\mathfrak{se}(3)$ . The kinematics of the VSLAM system are given by the function

$$f : ((P, \eta_{i}), U) \mapsto (PU, 0).$$

3.3 System Output

The physical measurements taken by our robot in the VSLAM system are the bearings of landmarks. Let $\eta^{\prime}_{i}=P^{-1}\eta_{i}$ be the body-fixed frame coordinates of a landmark $\eta_{i}\neq\alpha(x_{P})$. Using the basic pinhole camera model as described in [13] with invertible $3\times 3$ camera matrix $K$, the measurement of $\eta^{\prime}_{i}$ taken by the camera is $\begin{pmatrix}K&\mathbf{0}_{3\times 1}\end{pmatrix}\eta^{\prime}_{i}$. Assuming the camera is calibrated, so that the matrix $K$ is known, it is easy to recover the element

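$$K^{-1}\begin{pmatrix} K & \mathbf{0}_{3\times 1} \end{pmatrix}\eta^{\prime}_{i} = \begin{pmatrix} I_{3} & \mathbf{0}_{3\times 1} \end{pmatrix}\eta^{\prime}_{i},$$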

although the scale of this element is arbitrary and cannot be known. If $\eta_{i}$ is a bearing-type element, then $\theta_{i,4}=0$ and no information is lost through the camera projection. However, if $\eta_{i}$ is a point-type element, then the scale of the vector is not recoverable. In this formulation the sign of the landmark measurement (representing whether the landmark is in front of or behind the camera) is ambiguous, but this is sufficient for the observer design undertaken in Section 5. The choice of bearing-type or point-type for a particular landmark $\eta_{i}$ is a modelling choice based on the requirements for the resulting map of the environment.

The output space of the VSLAM system is defined as

$$\mathcal{N}_{n}(3) := \mathbb{R}\mathbb{P}^{2} \times \dots \times \mathbb{R}\mathbb{P}^{2}.$$

The output function of the VSLAM system is defined as

$$h : (P, \eta_{i}) \mapsto \begin{pmatrix} I_{3} & 0 \end{pmatrix} P^{-1}\eta_{i}.$$

The output function transforms each $\eta_{i}$ into body-fixed frame coordinates, and projects the result into $\mathbb{R}\mathbb{P}^{2}$, representing bearing-type or point-type landmark measurements with a calibrated pinhole camera.
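To make the loss of scale concrete, here is a small NumPy sketch of the output function for a single landmark. The names `output` and `same_class`, the example pose and the landmark values are assumptions made for illustration. Two point-type landmarks on the same ray through the camera centre produce the same element of $\mathbb{R}\mathbb{P}^{2}$, while a bearing-type landmark is unaffected by the translation of the pose.

```python
import numpy as np

def output(P, eta):
    """Representative of (I_3 0) P^{-1} eta, i.e. the first three body-frame entries."""
    body = np.linalg.inv(P) @ eta
    return body[:3]

def same_class(u, v, tol=1e-9):
    """True when u and v are parallel, so they represent one element of RP^2."""
    return np.linalg.norm(np.cross(u, v)) <= tol * np.linalg.norm(u) * np.linalg.norm(v)

# Example pose: rotation about the z-axis plus a translation (arbitrary values).
c, s = np.cos(0.3), np.sin(0.3)
P = np.array([[c, -s, 0.0, 1.0],
              [s, c, 0.0, 2.0],
              [0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.0, 1.0]])

ray_body = np.array([0.2, -0.1, 1.0])          # viewing direction in the body frame
near = P @ np.append(2.0 * ray_body, 1.0)      # point-type landmark close to the camera
far = P @ np.append(9.0 * ray_body, 1.0)       # point-type landmark further along the ray
depth_lost = same_class(output(P, near), output(P, far))

bearing = np.array([0.0, 0.0, 1.0, 0.0])       # bearing-type landmark (b, 0)
bearing_output = output(P, bearing)            # equals R_P^T b; translation drops out
```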

4 Symmetry of the VSLAM Problem

4.1 Symmetry of the Total Space

We introduce a group we term Scaled Orthogonal Transformations $\mathbf{SOT}(n)$ , a subgroup of the group of similarity transforms on $\mathbb{R}^{n}$ .

Lemma 4.1.

For any $n\in\mathbb{N}$ , the set

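$$\mathbf{SOT}(n) = \left\{ \begin{pmatrix} R & 0 \\ 0 & a \end{pmatrix} \;\middle|\; R \in \mathbf{SO}(n),\ a \in \mathbb{R}\setminus\{0\} \right\},$$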

with matrix multiplication is a subgroup of $\mathbf{SIM}(n)$ .

Proof.

Taking matrix multiplication as the group operation, it is clear that $\mathbf{SOT}(n)$ is the direct product $\mathbf{SO}(n)\times\mathbb{R}_{*}$, where $\mathbb{R}_{*}$ is the Lie group formed by assigning multiplication as the operation on $\mathbb{R}\setminus\{0\}$. It is straightforward to verify that $\mathbf{SOT}(n)$ is a subgroup of $\mathbf{SIM}(n)$ by considering the action $x\mapsto\frac{1}{a}Rx$ for $x\in\mathbb{R}^{n}$. ∎

The action of $\mathbf{SOT}(3)$ on landmarks is a rotation combined with a scaling for point-type landmarks. Recalling (2) and taking advantage of the equivalence class structure of $\mathbb{R}\mathbb{P}^{3}$ ,

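$$\begin{pmatrix} R & 0 \\ 0 & a \end{pmatrix}\left[\begin{pmatrix} p \\ 1 \end{pmatrix}\right] = \left[\begin{pmatrix} Rp \\ a \end{pmatrix}\right] = \left[\begin{pmatrix} \tfrac{1}{a}Rp \\ 1 \end{pmatrix}\right], \qquad \begin{pmatrix} R & 0 \\ 0 & a \end{pmatrix}\left[\begin{pmatrix} b \\ 0 \end{pmatrix}\right] = \left[\begin{pmatrix} Rb \\ 0 \end{pmatrix}\right].$$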

There are exactly three orbits of $\mathbf{SOT}(3)$ acting on $\mathbb{R}\mathbb{P}^{3}$ , defined by

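$$\mathbb{R}\mathbb{P}^{3}_{p} := \{[x] \mid x_{4}\neq 0,\ x_{1:3}\neq 0\}, \qquad \mathbb{R}\mathbb{P}^{3}_{b} := \{[x] \mid x_{4}=0\}, \qquad \mathbb{R}\mathbb{P}^{3}_{0} := \{[x] \mid x_{1:3}=0\},$$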

where $x_{4}$ refers to the fourth coordinate of $x$ .
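The action of $\mathbf{SOT}(3)$ on the two landmark types can be sketched as follows. This is a NumPy illustration with assumed helper names (`sot`, `dehomogenise`) and arbitrary example values. A point is rotated and scaled by $1/a$, a bearing is only rotated, the origin of the body frame is left fixed, and the product of two group elements is again block diagonal with a rotation block and a non-zero scalar.

```python
import numpy as np

def sot(R, a):
    """Assemble the 4x4 block-diagonal matrix diag(R, a) with R in SO(3) and a != 0."""
    Q = np.zeros((4, 4))
    Q[:3, :3] = R
    Q[3, 3] = a
    return Q

def dehomogenise(x):
    """Return x_{1:3} / x_4 for a point-type representative."""
    return x[:3] / x[3]

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta), np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
a = -2.5
Q = sot(R, a)

p = np.array([1.0, 2.0, 3.0])
moved_point = dehomogenise(Q @ np.append(p, 1.0))     # should equal R p / a

b = np.array([0.0, 0.6, 0.8])
moved_bearing = Q @ np.append(b, 0.0)                 # should equal (R b, 0)

origin = dehomogenise(Q @ np.array([0.0, 0.0, 0.0, 1.0]))  # the origin is a fixed point

product = Q @ sot(R.T, 4.0)                           # diag(R R^T, a * 4)
```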

The symmetry group $\mathbf{VSLAM}_{n}(3)$ for the VSLAM problem with $n$ landmarks in 3 dimensions is defined as a Lie group

$$\mathbf{VSLAM}_{n}(3) := \mathbf{SE}(3) \times \mathbf{SOT}(3) \times \dots \times \mathbf{SOT}(3),$$

with product Lie group structure. The associated Lie algebra is denoted $\mathfrak{vslam}_{n}(3)$ .

Lemma 4.2.

The mapping $\Upsilon:\mathbf{VSLAM}_{n}(3)\times\mathcal{T}_{n}(3)\to\mathcal{T}_{n}(3)$ defined by

[TABLE]

where the right-hand expression depends on definition (2), is a right group action of $\mathbf{VSLAM}_{n}(3)$ on $\mathcal{T}_{n}(3)$ .

Proof.

Trivially, $\Upsilon((I_{4},I_{4}),(P,\eta_{i}))=(P,\eta_{i})$ for any $(P,\eta_{i})\in\mathcal{T}_{n}(3)$ . Let $(A_{1},Q_{i,1}),(A_{2},Q_{i,2})\in\mathbf{VSLAM}_{n}(3)$ and $(P,\eta_{i})$ be arbitrary. Then

[TABLE]

This demonstrates that $\Upsilon$ is a right action as required. ∎

Recall the orbits of $\mathbf{SOT}(3)$ described in (4.1). Given a configuration $(P^{\circ},\eta^{\circ}_{i})\in\mathcal{T}_{n}(3)$, let $(P,\eta_{i})=\Upsilon((A,Q_{i}),(P^{\circ},\eta^{\circ}_{i}))$ for some $(A,Q_{i})\in\mathbf{VSLAM}_{n}(3)$. Observe that if ${P^{\circ}}^{-1}\eta^{\circ}_{j}\in\mathbb{R}\mathbb{P}^{3}_{0}$ for some $j$, then $P^{-1}\eta_{j}\in\mathbb{R}\mathbb{P}^{3}_{0}$ also, independent of the particular element $(A,Q_{i})$. To overcome this, in the remainder of the paper it is assumed that there is never a $j$ such that ${P^{\circ}}^{-1}\eta^{\circ}_{j}\in\mathbb{R}\mathbb{P}^{3}_{0}$. This assumption is reasonable, in that it is equivalent to assuming there are no landmarks coinciding precisely with the origin of the robot. Additionally, it is assumed that the type of each landmark (point or bearing) is known, and the landmarks are enumerated such that $i=1,...,n_{p}$ and $i=n_{p}+1,...,n_{p}+n_{b}=n$ index the point- and bearing-type landmarks respectively. The reduced total space is defined as

[TABLE]

and only elements $(P,\eta_{i})\in\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)$ are considered from here on.

4.2 Lift of the VSLAM Kinematics

In order to consider the system on the $\mathbf{VSLAM}_{n}(3)$ group, the kinematics from the state space must be lifted onto the group. The following lemma provides the lift function.

Lemma 4.3.

The function $\lambda:\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)\times\mathbb{V}\to\mathfrak{vslam}_{n}(3)$, defined by

[TABLE]

where $W:\mathfrak{se}(3)\times(\mathbb{R}\mathbb{P}^{3}_{b}\cup\mathbb{R}\mathbb{P}^{3}_{p})\to\mathfrak{sot}(3)$ is given by

[TABLE]

is a velocity lift of the kinematics (3.2) onto $\mathbf{VSLAM}_{n}(3)$ with respect to the group action (10).

Proof.

To show that $\lambda$ is a velocity lift, it is required that

[TABLE]

Equivalently, it is required to show that

[TABLE]

where $W_{i}:=W(U,P^{-1}\eta_{i})$ .

First, it is necessary to show that $W$ is well-defined whenever $q\neq\bf{0}$ . To see this, let $a\in\mathbb{R}$ be any non-zero scalar, and observe that

[TABLE]

Recalling the expression for $f$ provided in (3.2), it is clear that the first terms on both sides of (11) are equal. Let

[TABLE]

In order to aid in the readability of the following equations, $q_{i}$ and $r_{i}$ in (13) are chosen such that $|q_{i}|=1$ . However, it is important to note this choice is arbitrary as shown in (4.2). To show (11), consider that

[TABLE]

using the identity (1). This further reduces to

[TABLE]

where the last step follows from (13) and the choice of $|q_{i}|=1$ . From here, (11) clearly resolves to

[TABLE]

as required. This completes the proof that $\lambda$ is a velocity lift. ∎

The kinematics of the true state $\xi=(P,\eta_{i})\in\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)$ of the VSLAM system are given by

[TABLE]

Choose a reference configuration $\xi^{\circ}=(P^{\circ},\eta^{\circ}_{i})\in\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)$. By construction, the trajectories of the lifted system kinematics

[TABLE]

project to trajectories of the VSLAM kinematics (14) via $\xi(t)=\Upsilon(X(t),\xi^{\circ})$ .

5 Observer Design

5.1 Observer Kinematics

Define the observer state to lie on the VSLAM group, $\hat{X}=(\hat{A},\hat{Q}_{i})\in\mathbf{VSLAM}_{n}(3)$ , with kinematics given by

[TABLE]

where $\Delta_{\hat{X}}=(\Delta_{\hat{A}},\Delta_{\hat{Q}_{i}})\in\mathfrak{vslam}_{n}(3)$ is an innovation term. The estimated state $\hat{\xi}=(\hat{P},\hat{\eta}_{i})\in\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)$ is given by

[TABLE]

Additional notation is helpful in simplifying the expressions that follow in the observer design. Define

[TABLE]

All expressions above are well-defined for equivalence classes in the SLAM manifold.

5.2 Landmark Observer

Theorem 5.1.

Let $\xi=(P,\eta_{i})\in\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)$ be the true state of the system, evolving with the kinematics (14). Let $\xi^{\circ}\in\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)$ be arbitrary up to the requirement that, for all $i$, $\eta^{\circ}_{i}$ and $\eta_{i}$ are members of the same orbit of $\mathbb{R}\mathbb{P}^{3}$ under the action of $\mathbf{SOT}(3)$. Define $\hat{X}=(\hat{A},\hat{Q}_{i})\in\mathbf{VSLAM}_{n}(3)$ to be the observer state with kinematics defined by (5.1), and define $\hat{\xi}=(\hat{P},\hat{\eta}_{i})$ as in (16).

Now, for $i=1,...,n_{p}$ , define $\Sigma_{i}\in\mathbb{R}^{3\times 3}$ by

[TABLE]

where $k_{G},k_{H}>0$ are constants, and assume that there exist $\delta>0$ and $\mu>0$ such that

[TABLE]

for any time $t>0$ and for any $i=1,...,n_{p}$ . For $i=n_{p}+1,...,n_{p}+n_{b}$ , define

[TABLE]

Then, for every landmark $i=1,...,n_{p}+n_{b}$ , define $\Delta_{\hat{Q}_{i}}$ as

[TABLE]

where $y_{i}$ and $\hat{y}_{i}$ are given by (17). Let the innovation term $\Delta_{\hat{A}}$ be given by the least-squares solution to

[TABLE]

Then the estimated state coordinates $\hat{\xi}$ converge to the true coordinates $\xi$ almost-globally asymptotically and exponentially in the large (for any compact set in the basin of attraction of the equilibrium, the value of the Lyapunov function converges exponentially to zero), up to equivalence on the SLAM manifold $\mathcal{M}_{n}(3)$.

Proof.

To verify that $\Delta_{\hat{A}}$ is well-defined note that the cost in (5.1) is invariant to scale in the data $\hat{\theta}_{i}\mapsto a_{i}\hat{\theta}_{i}$ for $a_{i}\in\mathbb{R}\setminus\{0\}$ . A Lyapunov analysis proves the desired result.

For $i=1,...,n_{p}$ , recalling (5), define the error coordinates and candidate storage function as

[TABLE]

respectively. The condition (19) ensures that $\Sigma_{i}$ is well-conditioned, and remains bounded and positive-definite for all time $t\geq 0$ [11]. Therefore the candidate storage function $l_{i}$ is positive definite. It remains to show that $l_{i}$ is monotonically decreasing. The kinematics of $e_{i}$ are

[TABLE]

Differentiating the candidate storage function, one has

[TABLE]

where $\sigma_{m,i}$ and $\sigma_{M,i}$ denote the infimum of the smallest and the supremum of the largest eigenvalues of $\Sigma_{i}$ over time, respectively. Since $k_{H}>0$ is chosen as a constant, and $\Sigma_{i}$ remains well-conditioned and bounded, the equilibrium $e_{i}=0$ is exponentially stable. Equivalently, this provides that $\hat{P}^{-1}\hat{\eta}_{i}\to P^{-1}\eta_{i}$ globally exponentially.

For $i=n_{p}+1,...,n_{p}+n_{b}$ , define the candidate storage function

[TABLE]

Observe that $l_{i}$ is well-defined as a function of $\mathbb{R}\mathbb{P}^{2}$ elements, since the expression is invariant to multiplication of $y_{i}$ or $\hat{y}_{i}$ by any non-zero scalar. Clearly $l_{i}$ is positive definite. The kinematics of the bearing $y_{i}\in\mathbb{R}\mathbb{P}^{2}$ are given by

[TABLE]

This is well-defined as an element of the tangent space $T_{y_{i}}\mathbb{R}\mathbb{P}^{2}$ since any scaling of $y_{i}$ results in the same scaling of the expression for $\dot{y}_{i}$ . Since $\dot{y}_{i}^{\top}y_{i}=0$ , the dynamics of the norm of any chosen representative of $y_{i}$ are given by $\frac{\mathrm{d}}{\mathrm{d}t}|y_{i}|=0$ . Analogously, recalling (20) and (5.1), the kinematics of $\hat{y}_{i}\in\mathbb{R}\mathbb{P}^{2}$ are given by

[TABLE]

and hence the dynamics of the norm of any representative of $\hat{y}_{i}$ are given by $\frac{\mathrm{d}}{\mathrm{d}t}|\hat{y}_{i}|=0$. As a consequence of this and the scale invariance of (23), we may choose $|y_{i}|=|\hat{y}_{i}|=1$ for readability without loss of generality. Differentiating the candidate storage function leads to

[TABLE]

which is negative definite as long as the initial directions $y_{i}(0)$ and $\hat{y}_{i}(0)$ are not orthogonal. There are two situations in which $\dot{l}_{i}=0$ . The first one corresponds to the stable case where $l_{i}=0$ ( $\hat{y}_{i}$ and $y_{i}$ are parallel) while the second one corresponds to the unstable case for which $l_{i}=1$ ( $\hat{y}_{i}$ and $y_{i}$ are orthogonal). To prove the exponential stability in the large, suppose that $0<l_{i}\leq\epsilon<1$ for some fixed $\epsilon$ . Then,

[TABLE]

Observe that, unless $l_{i}=1$ , such an $\epsilon$ can always be found. Therefore, $l_{i}\to 0$ almost-globally asymptotically, and exponentially in the large. Since the measurement function $h$ is invertible on bearing-type elements, this provides the desired result that $\hat{P}^{-1}\hat{\eta}_{i}\to P^{-1}\eta_{i}$ almost-globally asymptotically and exponentially in the large.

Define the whole-of-system Lyapunov function

[TABLE]

From the analysis of each individual $l_{i}$, it is clear that $\mathcal{L}\to 0$ almost-globally asymptotically and exponentially in the large. The convergence of $\mathcal{L}$ implies that

[TABLE]

almost-globally asymptotically and exponentially in the large as well. This completes the proof. ∎

6 Simulation Results

To verify the observer derived in Theorem 5.1, we conducted a simulation of a vehicle equipped with a single monocular camera, observing 4 point-type landmarks and 2 bearing-type landmarks as it moves through space. The vehicle moves in a circular trajectory at a fixed height of 3 m. The body-fixed velocity $U$ is fixed to be constant, with $\Omega_{U}=(0,0,-0.5)^{\top}$ rad/s and $V_{U}=(1.5,0,0)^{\top}$ m/s. For simplicity, the camera frame is assumed to coincide with the body-fixed frame of the vehicle, which avoids the need for a separate computation to transform the body-fixed velocity into the camera frame. Let the true state be $(P,\eta_{i})\in\mathcal{T}^{\circ}_{n_{p},n_{b}}(3)$. The reference configuration is chosen as $\xi^{\circ}=(I_{4},\eta^{\circ}_{i})$, where

[TABLE]

where the $\epsilon_{i}$ terms represent errors in the initial measurements. The observer is defined on $\mathbf{VSLAM}_{n}(3)$ , with kinematics given by (5.1) and innovation terms given by Theorem 5.1. The initial conditions and gains for the observer are chosen as

[TABLE]

The simulation was carried out by implementing the continuous time system with Euler integration using a time step of $dt=0.02$ s.
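The true vehicle motion in this setup can be reproduced with a few lines of NumPy. The sketch below covers only the pose kinematics $\dot{P}=PU$, not the observer, and it makes several assumptions that are not stated in the text: the start pose (identity rotation at a height of 3 m), the 5 s duration, and the helper names. With the stated inputs the analytic turning radius is $|V_{U}|/|\Omega_{U}|=1.5/0.5=3$ m. An explicit Euler step does not stay exactly on $\mathbf{SE}(3)$, so the simulated radius drifts outward slowly, by roughly one percent over these 5 s.

```python
import numpy as np

def skew(w):
    """3x3 skew-symmetric matrix of w."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def twist(omega, v):
    """4x4 se(3) matrix with rotational block omega^x and translational column v."""
    U = np.zeros((4, 4))
    U[:3, :3] = skew(omega)
    U[:3, 3] = v
    return U

# Velocity input and time step as given in the text.
U = twist(np.array([0.0, 0.0, -0.5]), np.array([1.5, 0.0, 0.0]))
dt = 0.02

# Assumed start pose: identity rotation, position (0, 0, 3).
P = np.eye(4)
P[2, 3] = 3.0

positions = [P[:3, 3].copy()]
for _ in range(250):                 # 250 steps = 5 s of simulated time
    P = P + dt * (P @ U)             # explicit Euler step of dP/dt = P U
    positions.append(P[:3, 3].copy())
positions = np.array(positions)

# Analytic circle centre for this start pose and input: 3 m to the right of the vehicle.
centre = np.array([0.0, -3.0, 3.0])
radii = np.linalg.norm(positions - centre, axis=1)
heights = positions[:, 2]
```

Replacing the Euler update with an exact exponential step would keep the pose on $\mathbf{SE}(3)$; the Euler form is used here because it is the integration scheme the text describes.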

Figure 1a shows the evolution of $\log_{10}(\mathcal{L})$ , where $\mathcal{L}$ is the Lyapunov function of the simulated system as defined in (24). This clearly shows exponential convergence of the observer error dynamics. Figure 1b shows the evolution of the trajectory of the simulated system. Since the estimated state only converges to the true state up to equivalence on the SLAM manifold $\mathcal{M}_{n}(3)$ , it is necessary to assign total space coordinates to the estimate to aid the comparison. In Figure 1b the choice of total space coordinates for the estimated state is made so that the final robot pose is aligned with that of the true state. This shows that the landmarks have correctly converged to the true landmarks up to the SLAM manifold equivalence.

7 Conclusion

This paper presents an observer design posed on a novel symmetry group for the visual SLAM problem. The total space and SLAM manifold conceptualised in [18] have been extended to include free vectors. The development of the symmetry group $\mathbf{VSLAM}_{n}(3)$ has allowed both point-type and bearing-type landmarks to be treated in a unified framework. Riccati observers were incorporated for each of the point-type landmarks, and grant the user refined control over their convergence. The almost-global convergence of the proposed observer on both point-type and bearing-type landmarks is a contrast to many state-of-the-art Extended Kalman Filter systems, which suffer from linearisation errors. While research into the development of non-linear observers for the SLAM problem is only recent, the observer for visual SLAM presented in this paper demonstrates some of the key advantages the approach can offer.

Acknowledgment

This research was supported by the Australian Research Council through the “Australian Centre of Excellence for Robotic Vision” CE140100016.

Bibliography

[1] P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton, NJ, USA, January 2008.
[2] G. Baldwin, R. Mahony, and J. Trumpf. A nonlinear observer for 6 DOF pose estimation from inertial and bearing measurements. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 2237–2242, 2009.
[3] Axel Barrau and Silvere Bonnabel. An EKF-SLAM algorithm with consistency properties, 2016. arXiv:1510.06263.
[4] E. Bjorne, T. A. Johansen, and E. F. Brekke. Redesign and analysis of globally asymptotically stable bearing only SLAM. In 2017 20th International Conference on Information Fusion (Fusion), pages 1–8, July 2017.
[5] S. Bonnabel, P. Martin, and P. Rouchon. Symmetry-preserving observers. IEEE Transactions on Automatic Control, 53(11):2514–2526, 2008.
[6] F. Le Bras, T. Hamel, R. Mahony, and C. Samson. Observers for position and velocity bias estimation from single or multiple direction outputs. In T.I. Fossen, K.Y. Pettersen, and H. Nijmeijer, editors, Sensing and Control for Autonomous Vehicles, chapter 1. Lecture Notes in Control and Information Sciences 474, Springer, 2017.
[7] Cesar Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, José Neira, Ian D. Reid, and John J. Leonard. Past, present, and future of simultaneous localization and mapping: Towards the robust-perception age. IEEE Transactions on Robotics, 32(6):1309–1332, December 2016.
[8] J. Delmerico and D. Scaramuzza. A benchmark comparison of monocular visual-inertial odometry algorithms for flying robots. In IEEE International Conference on Robotics and Automation (ICRA), 2018.
