Consistent Inversion of Noisy Non-Abelian X-Ray Transforms

Fran\c{c}ois Monard; Richard Nickl; Gabriel P. Paternain

arXiv:1905.00860·math.AP·June 2, 2020

TL;DR

This paper introduces a Bayesian approach using Gaussian processes to invert noisy non-Abelian X-ray transforms on simple surfaces, achieving convergence rates and stability estimates for recovering matrix fields.

Contribution

It develops a novel statistical algorithm for the inverse problem of non-Abelian X-ray transforms, with proven convergence rates and a new stability estimate.

Findings

01

Convergence rate of the statistical error is algebraic in 1/N.

02

Error approaches 1/√N for smooth matrix fields.

03

Stability estimate for the inverse map is established.

Abstract

For $M$ a simple surface, the non-linear statistical inverse problem of recovering a matrix field $\Phi : M \to \mathfrak{so}(n)$ from discrete, noisy measurements of the $SO(n)$-valued scattering data $C_{\Phi}$ of a solution of a matrix ODE is considered ($n \geq 2$). Injectivity of the map $\Phi \mapsto C_{\Phi}$ was established by [Paternain, Salo, Uhlmann; Geom. Funct. Anal. 2012]. A statistical algorithm for the solution of this inverse problem based on Gaussian process priors is proposed, and it is shown how it can be implemented by infinite-dimensional MCMC methods. It is further shown that as the number $N$ of measurements of point-evaluations of $C_{\Phi}$ increases, the statistical error in the recovery of $\Phi$ converges to zero in $L^{2}(M)$-distance at a rate that is algebraic in $1/N$, and approaches $1/\sqrt{N}$ for smooth matrix fields $\Phi$. The proof relies, among other…

Equations

$$\partial_{+}SM=\{(x,v)\in TM:\ x\in\partial M,\ g_{x}(v,v)=1,\ \langle v,\nu_{x}\rangle_{g}\leq 0\},$$

$$\dot{U}+\Phi(\gamma(t))U=0,\qquad U(T)=\text{id}.$$

$$C_{\Phi}:\partial_{+}SM\to{\mathbb{C}}^{n\times n},$$

$$SO(n)\subset SU(n)\subset U(n).$$

$$\|\bar{\Phi}(D_{N})-\Phi_{0}\|_{L^{2}(M)}\to 0.$$

$$\Phi=\begin{pmatrix}0&-B_{3}&B_{2}\\ B_{3}&0&-B_{1}\\ -B_{2}&B_{1}&0\end{pmatrix}$$

$$I=I_{0}A\,\tfrac{1}{2}(1+\cos\varphi),$$

$$I^{\prime}=I_{0}A\,\tfrac{1}{2}(1-\cos\varphi),$$

$$\cos\varphi=\frac{I-I^{\prime}}{I+I^{\prime}}$$

$$\Phi:M\to\mathfrak{g}$$

$$C_{\Phi}:\partial_{+}SM\to G.$$

$$(X_{i},V_{i})_{i=1}^{N}\sim^{i.i.d.}\lambda\quad\text{on }\partial_{+}SM.$$

$$(\varepsilon_{i,j,k}:1\leq j,k\leq n)_{i=1}^{N}\ \text{ be i.i.d. }N(0,\sigma^{2}),\quad\sigma>0,$$

$$Y_{i}=(Y_{i,j,k}),\qquad Y_{i,j,k}=C_{\Phi}((X_{i},V_{i}))_{j,k}+\varepsilon_{i,j,k},\qquad i=1,\dots,N;\ 1\leq j,k\leq n.$$

$$D_{N}=\{Y_{1},\dots,Y_{N},(X_{1},V_{1}),\dots,(X_{N},V_{N})\}$$

$$SM=\{(x,v)\in TM:\ g_{x}(v,v)=1\}.$$

$$\partial_{\pm}SM:=\{(x,v)\in\partial SM:\ \pm\langle v,\nu_{x}\rangle_{g}\leq 0\}.$$

$$\lambda\equiv\frac{1}{\text{Area}(\partial_{+}SM)}\,d\Sigma^{2}\big|_{\partial_{+}SM}.$$

$$\|U\|_{L^{2}}^{2}:=\int_{N}|U|_{F}^{2}\,d\text{Vol}_{h},\qquad\|U\|_{L^{\infty}}:=\sup_{y\in N}|U(y)|_{F}.$$

$$\|U\|_{C^{\beta}}=\sum_{|{\boldsymbol{\alpha}}|\leq\lfloor\beta\rfloor}\sup_{y\in N}|T^{{\boldsymbol{\alpha}}}U(y)|_{F}+\sum_{|{\boldsymbol{\alpha}}|=\lfloor\beta\rfloor}\sup_{x\neq y\in N}\frac{|T^{{\boldsymbol{\alpha}}}U(x)-T^{{\boldsymbol{\alpha}}}U(y)|_{F}}{d_{h}(x,y)^{\beta-\lfloor\beta\rfloor}},$$

$$\|U\|_{H^{s}}^{2}=\sum_{|{\boldsymbol{\alpha}}|\leq s}\|T^{{\boldsymbol{\alpha}}}U\|_{L^{2}}^{2},$$

$$\|\Phi-\Psi\|_{L^{2}(M)}\leq c(\Phi,\Psi)\,\|C_{\Phi}C_{\Psi}^{-1}-\text{id}\|_{H^{1}(\partial_{+}SM)},$$

$$c(\Phi,\Psi)=C_{1}\big(1+(\|\Phi\|_{C^{1}}\vee\|\Psi\|_{C^{1}})\big)\,e^{C_{2}(\|\Phi\|_{C^{1}}\vee\|\Psi\|_{C^{1}})},$$

$$C_{\Phi}C_{\Psi}^{-1}=\text{id}+I_{\Theta(\Phi,\Psi)}(\Phi-\Psi)$$

$$\|\Phi-\Psi\|_{L^{2}(M)}\leq c(\Phi,\Psi)\,\|I_{\Theta(\Phi,\Psi)}(\Psi-\Phi)\|_{H^{1}(\partial_{+}SM)}.$$

$$\|C_{\Phi}-C_{\Psi}\|_{H^{k}(\partial_{+}SM,{\mathbb{C}}^{n\times n})}$$

$$\|C_{\Phi}-C_{\Psi}\|_{C^{k}(\partial_{+}SM,{\mathbb{C}}^{n\times n})}$$

$$\|\Phi-\Psi\|_{L^{2}(M)}\leq C^{\prime}c(\Phi,\Psi)\,(1+\|\Psi\|_{C^{1}})\,\|C_{\Phi}-C_{\Psi}\|_{H^{1}(\partial_{+}SM)},$$

$$\bar{n}=\frac{n(n-1)}{2}=\dim({\mathfrak{s}}{\mathfrak{o}}(n)).$$

$$\Phi(x)=\begin{pmatrix}0&-B_{3}(x)&B_{2}(x)\\ B_{3}(x)&0&-B_{1}(x)\\ -B_{2}(x)&B_{1}(x)&0\end{pmatrix},\qquad x\in M.$$

$$(Y_{i},(X_{i},V_{i}))_{i=1}^{N}\,|\,\Phi\sim P_{\Phi}^{N}\quad\text{on }(\mathbb{R}^{n\times n}\times\partial_{+}SM)^{N}$$

Full text

Department of Mathematics, University of California, Santa Cruz, CA 95064; Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge CB3 0WB, UK; Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge CB3 0WB, UK

Consistent Inversion of Noisy Non-Abelian X-Ray Transforms

François Monard

Richard Nickl

Gabriel P. Paternain

Abstract

For $M$ a simple surface, the non-linear statistical inverse problem of recovering a matrix field $\Phi:M\to{\mathfrak{s}}{\mathfrak{o}}(n)$ from discrete, noisy measurements of the $SO(n)$ -valued scattering data $C_{\Phi}$ of a solution of a matrix ODE is considered ( $n\geq 2$ ). Injectivity of the map $\Phi\mapsto C_{\Phi}$ was established by [Paternain, Salo, Uhlmann; Geom. Funct. Anal. 2012, [35]].

A statistical algorithm for the solution of this inverse problem based on Gaussian process priors is proposed, and it is shown how it can be implemented by infinite-dimensional MCMC methods. It is further shown that as the number $N$ of measurements of point-evaluations of $C_{\Phi}$ increases, the statistical error in the recovery of $\Phi$ converges to zero in $L^{2}(M)$ -distance at a rate that is algebraic in $1/N$ , and approaches $1/\sqrt{N}$ for smooth matrix fields $\Phi$ . The proof relies, among other things, on a new stability estimate for the inverse map $C_{\Phi}\to\Phi$ .

Key applications of our results are discussed in the case $n=3$ to polarimetric neutron tomography, see [Desai et al., Nature Sc. Rep. 2018, [12]] and [Hilger et al., Nature Comm. 2018, [23]].

1 Introduction
1.1 Non-Abelian $X$ -ray transforms
1.2 Polarimetric neutron tomography (PNT)
1.3 The statistical observation scheme
1.4 Some geometric background and basic notation
2 Theoretical results for the deterministic inverse problem
3 Bayesian inversion of non-Abelian $X$ -ray transforms
3.1 Main results
3.2 Remarks and discussion
4 Implementation of the algorithm
4.1 Numerical domain and forward operator
4.2 Statistical estimation through MCMC
5 Proofs
5.1 Geometric preliminaries
5.2 Forward estimates - proof of Theorem 2.2
5.3 Stability estimate - proof of Theorem 2.1
5.4 Consistency of the posterior mean: proof of Theorem 3.2

1 Introduction

1.1 Non-Abelian $X$ -ray transforms

Our object of study is the non-abelian $X$ -ray transform, a mapping from a matrix-valued field $\Phi$ defined on a Riemannian surface with boundary $(M,g,\partial M)$ , to its scattering data $C_{\Phi}$ , defined at the influx boundary $\partial_{+}SM$ of $M$ , given by

$$\partial_{+}SM=\{(x,v)\in TM:\ x\in\partial M,\ g_{x}(v,v)=1,\ \langle v,\nu_{x}\rangle_{g}\leq 0\},$$

where $TM$ is the tangent bundle of $M$ , and $\nu_{x}$ denotes the outward unit normal at $x\in\partial M$ .

We will assume that the surface $M$ is simple in the sense that it is (topologically) a disk, has no conjugate points, and has a strictly convex boundary. Strictly convex domains in the plane (and small perturbations of them) are examples of simple surfaces. In this context, all unit-speed geodesics (defined locally and dynamically through the equation $\nabla_{\dot{\gamma}}\dot{\gamma}=0$ with $\nabla$ the Levi-Civita connection, and satisfying $g_{\gamma(t)}(\dot{\gamma}(t),\dot{\gamma}(t))=1$ for all $t$ where $\gamma(t)$ is defined) in $M$ exit $M$ in finite time. This fact allows us to identify $\partial_{+}SM$ with the space of geodesics on $M$, by associating to any $(x,v)\in\partial_{+}SM$ the unique geodesic $\gamma$ passing through $(x,v)$.

Let $\Phi:M\to{\mathbb{C}}^{n\times n}$ be a smooth map. Given a unit-speed geodesic $\gamma:[0,T]\to M$ with endpoints $\gamma(0),\gamma(T)\in\partial M$, we may define the scattering data of $\Phi$ on $\gamma$ to be $C_{\Phi}(\gamma):=U(0)$, where $U:[0,T]\to{\mathbb{C}}^{n\times n}$ satisfies the linear system of ODEs

$$\dot{U}+\Phi(\gamma(t))U=0,\qquad U(T)=\text{id}.$$

This problem, posed backward in time by convention, is well-posed and leads to a unique definition of $U(0)$, which contains cumulated information about $\Phi$ along the geodesic $\gamma$. Note that when $\Phi$ is scalar, we obtain $\log U(0)=\int_{0}^{T}\Phi(\gamma(t))\ dt$, which is the classical X-ray/Radon transform of $\Phi$ along the curve $\gamma$. The collection of all such data makes up the scattering data (or non-Abelian X-ray transform) of $\Phi$, viewed here as a map

$$C_{\Phi}:\partial_{+}SM\to{\mathbb{C}}^{n\times n},$$

and we are concerned with the problem of recovering $\Phi$ from $C_{\Phi}$ . Inverting Abelian and non-Abelian X-ray transforms are examples of inverse problems in integral geometry, an active field permeating several tomographic imaging methods, see e.g. the recent topical review [24].
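The definition of the scattering data can be made concrete with a small numerical sketch. It assumes the flat unit disk (so geodesics are chords), $n=3$, and an ${\mathfrak{s}}{\mathfrak{o}}(3)$-valued field built from a vector field $B=(B_{1},B_{2},B_{3})$ in the pattern used for polarimetric neutron tomography in Section 1.2. The names `hat`, `expm_so3` and `scattering_data`, and the midpoint-exponential time stepping, are illustrative choices made here and are not taken from the paper's implementation in Section 4.

```python
import numpy as np

def hat(b):
    """Map B = (B1, B2, B3) to the skew-symmetric matrix Phi in so(3)."""
    b1, b2, b3 = b
    return np.array([[0.0, -b3, b2],
                     [b3, 0.0, -b1],
                     [-b2, b1, 0.0]])

def expm_so3(a):
    """Exponential of a 3x3 skew-symmetric matrix via Rodrigues' formula."""
    theta = np.sqrt(a[2, 1] ** 2 + a[0, 2] ** 2 + a[1, 0] ** 2)
    if theta < 1e-12:
        return np.eye(3) + a  # first-order expansion near zero
    return (np.eye(3) + (np.sin(theta) / theta) * a
            + ((1.0 - np.cos(theta)) / theta ** 2) * (a @ a))

def scattering_data(field, x, v, steps=400):
    """Approximate C_Phi(x, v) = U(0) for dU/dt + Phi(gamma(t)) U = 0, U(T) = id.

    field: callable p -> (B1, B2, B3) at a point p of the unit disk (assumption).
    x: boundary point with |x| = 1; v: unit vector pointing into the disk.
    """
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    T = -2.0 * float(x @ v)  # exit time of the chord t -> x + t v
    h = T / steps
    U = np.eye(3)
    # Freezing Phi on each sub-interval gives U(t - h) = exp(h Phi) U(t), hence
    # U(0) = exp(h Phi_1) exp(h Phi_2) ... exp(h Phi_m), midpoints ordered in t.
    for k in range(steps):
        t_mid = (k + 0.5) * h
        U = U @ expm_so3(h * hat(field(x + t_mid * v)))
    return U

# Constant field B = (0, 0, b) along a diameter (T = 2): a rotation about the z-axis.
C = scattering_data(lambda p: (0.0, 0.0, 0.7), (1.0, 0.0), (-1.0, 0.0))
```

Because every factor in the product is an exact rotation, the output stays in $SO(3)$ up to rounding error, whatever the step size. For a constant field the factors commute and the product reduces to a single rotation by the angle $bT$, which gives a simple sanity check.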

The problem of inverting the non-linear mapping $\Phi\mapsto C_{\Phi}$ in this generality has been recently solved in [36]. Previous injectivity results were obtained, either by adding curvature conditions on the manifold, or by fixing a Lie group $G$ (realised as matrices, for simplicity) and its Lie algebra $\mathfrak{g}$ , in turn asking whether a $\mathfrak{g}$ -valued field $\Phi$ can be recovered from its $G$ -valued scattering data $C_{\Phi}$ . In this paper, we will mainly use the Lie groups $SO(n)=\{U\in\mathbb{R}^{n\times n},\ U^{T}U=\text{id},\;\det U=1\}$ , $U(n)=\{U\in{\mathbb{C}}^{n\times n},\ U^{*}U=\text{id}\}$ and $SU(n)=U(n)\cap\{\det=1\}$ , and their Lie algebras ${\mathfrak{s}}{\mathfrak{o}}(n)=\{A\in\mathbb{R}^{n\times n},\ A^{T}+A=0\}$ , $\mathfrak{u}(n)=\{A\in{\mathbb{C}}^{n\times n},\ A^{*}+A=0\}$ and $\mathfrak{su}(n)=\mathfrak{u}(n)\cap\{\text{tr}=0\}$ . Above, ’ $T$ ’, ’ $*$ ’, ’ $\det$ ’ and ’tr’ refer to matrix ’transpose’, ’conjugate transpose’, ’determinant’ and ’trace’, respectively. Note the inclusions

$$SO(n)\subset SU(n)\subset U(n). \tag{1}$$

The state of the art on this question can be written as follows:

Theorem 1.1.

Let $(M,g)$ be a simple surface. The map $\Phi\mapsto C_{\Phi}$ is injective in the following cases:

(a) $G=U(n)$ [35];

(b) $G=GL(n,\mathbb{C})$ [36].

The proof of (b) consists of a reduction to the unitary case in (a) via a factorization theorem in Loop Groups. Earlier injectivity results have been obtained by several authors, cf. [15, 33, 34] and references therein, particularly when $(M,g)$ is a domain in the Euclidean plane.

The absence of concrete reconstruction formulas for the inverse map $C_{\Phi}\to\Phi$ when $n\geq 2$, and the challenge of dealing with physical experiments such as those arising in polarimetric neutron tomography (see Section 1.2), where $N$ discrete and noisy measurements $D_{N}\sim P_{\Phi}^{N}$ of $C_{\Phi}\in SO(3)$ are made (see Section 1.3 for details), motivate the main contribution of this article, which is to present a statistical algorithm $\bar{\Phi}(D_{N})$ that allows one to recover $\Phi$. The implementation of $\bar{\Phi}(D_{N})$ is detailed in Section 4, and our main theoretical result is the statistical analogue of the injectivity result Theorem 1.1, namely the frequentist consistency of reconstruction in the large sample limit, which somewhat informally can be stated as follows:

Theorem 1.2.

Suppose the data $D_{N}$ is generated from the probability distribution $P_{\Phi_{0}}^{N}$, where $\Phi_{0}:M\to\mathfrak{so}(n)$ is any smooth matrix field. Then we have that, as sample size $N\to\infty$, and in $P^{N}_{\Phi_{0}}$-probability,

$$\|\bar{\Phi}(D_{N})-\Phi_{0}\|_{L^{2}(M)}\to 0.$$

See Theorem 3.2 in Section 3 for a fully rigorous statement of this result, which in fact requires significantly weaker hypotheses on $\Phi_{0}$ , and also specifies an explicit ‘algebraic’ rate of convergence $N^{-\eta}$ in the last limit.

The proof of the previous theorem relies on ideas from Bayesian nonparametric statistics [44, 17] and on new ‘quantitative versions’ of the injectivity result in Theorem 1.1 which are of independent interest and stated in Section 2.

1.2 Polarimetric neutron tomography (PNT)

The basic problem in PNT consists in finding a magnetic field from spin measurements of neutrons [26, 11, 12, 23]. In this case the explicit relation is

$$\Phi=\begin{pmatrix}0&-B_{3}&B_{2}\\ B_{3}&0&-B_{1}\\ -B_{2}&B_{1}&0\end{pmatrix},$$

where $B=(B_{1},B_{2},B_{3})$ is the magnetic field. In the case of PNT one assumes that the underlying surface $M$ is just the disc in the plane (by slicing with 2D discs one can solve the 3D problem).

The details of the experiment of polarimetric neutron tomography may be found, e.g., in [12]. Here we give a description that is suitable for our purposes. The data produced by the experiment is the orthogonal matrix $C^{-1}_{\Phi}(x,v)=C^{T}_{\Phi}(x,v)\in SO(3)$, where $C_{\Phi}(x,v)$ is the scattering data described above. The significance of this in terms of spin is as follows: if a neutron travelling along the ray determined by $(x,v)$ enters the magnetic field with a spin $s_{in}\in\mathbb{S}^{2}$ ($\mathbb{S}^{2}$ denotes the Euclidean unit sphere in $\mathbb{R}^{3}$), it exits the field with spin $s_{out}=C^{-1}_{\Phi}(x,v)s_{in}\in\mathbb{S}^{2}$ (for an ensemble of polarized neutrons in a magnetic field it can be shown that they behave like a particle with a classical magnetic moment). The magnetic field $B$ is defined in 3D space, but the experiment makes measurements on a 2D plane and produces a global reconstruction by slicing. The geometry of the experiment is thus a 2D parallel beam geometry which is easily converted into fan-beam geometry as considered above. The question is then how to manipulate the spin to produce the orthogonal matrix. This is done with an ingenious sequence of spin flippers and rotators placed before and after the magnetic field being measured. The material containing the magnetic field can also be rotated so as to produce parallel beams from different angles. After the spin has been manipulated it goes through an analyser; this device is essentially a spin filter that only lets those neutrons with vertically aligned spin go through. The neutron count is then measured with a detector that produces an intensity reading. The spin of the entering beam is perfectly aligned with the spin of the analyser, so that the intensity measurement is actually a measurement of the angle of rotation of the spin due to the magnetic field. The key relation is given by [26, Equation 1]

$$I=I_{0}A\,\tfrac{1}{2}(1+\cos\varphi),$$

where $A$ is the attenuation of the medium, $I_{0}$ is the intensity of the incoming beam and $\varphi$ is the angle by which the spin has rotated.

The use of the spin flipper allows the measurement of

$$I^{\prime}=I_{0}A\,\tfrac{1}{2}(1-\cos\varphi),$$

and from this one deduces that

$$\cos\varphi=\frac{I-I^{\prime}}{I+I^{\prime}}, \tag{2}$$

which then becomes an entry of our matrix $C^{T}_{\Phi}(x,v)$ . By rotating by $\pi/2$ and flipping (rotation by $\pi$ ) one can thus produce the entire orthogonal matrix as data. In other words, if $\{e_{1},e_{2},e_{3}\}$ is the canonical basis of 3-space, $\cos\varphi$ gives $C^{T}_{\Phi}(x,v)e_{i}\cdotp e_{j}$ for all $i,j$ and hence all the entries. In some situations, where the attenuation of the medium is known, the use of spin flippers is not necessary and can be calibrated out. Assuming an additive Gaussian noise in the intensities $I$ , equation (2) approximately produces an additive Gaussian noise in the entries of the matrix $C_{\Phi}$ which is precisely the noise model we adopt below.
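As a small worked example of the intensity relations above, the following sketch recovers $\cos\varphi$ from a pair of intensity readings taken without and with the spin flipper. The function name and the synthetic values of $I_{0}$, $A$ and $\varphi$ are hypothetical and serve only as an illustration.

```python
import math

def cos_rotation_angle(intensity, intensity_flipped):
    """Return cos(phi) = (I - I') / (I + I') from the two intensity readings."""
    total = intensity + intensity_flipped
    if total <= 0:
        raise ValueError("the total intensity I + I' must be positive")
    return (intensity - intensity_flipped) / total

# Synthetic readings generated from the two intensity formulas (values are made up).
i0, attenuation, phi = 1000.0, 0.4, 0.9
i_plain = i0 * attenuation * 0.5 * (1.0 + math.cos(phi))
i_flip = i0 * attenuation * 0.5 * (1.0 - math.cos(phi))
recovered = cos_rotation_angle(i_plain, i_flip)
```

The ratio cancels the common factor $I_{0}A$, which is why the flipped measurement removes the need to know the attenuation of the medium.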

As in the articles [12, 13] our approach reconstructs 3D magnetic fields of arbitrary direction and distribution. This provides a method able to investigate samples without imposing any a priori knowledge of the magnetic field orientation, and requires understanding of the full non-linear inverse problem. The recent preprint [13] introduces a modified Newton-Kantorovich type algorithm for the solution of the non-linear problem, a Newton-type algorithm where the inversion of the Jacobian at each iteration only uses the differential of the map $\Phi\mapsto C_{\Phi}$ at the base point $\Phi_{0}\equiv 0$ .

As pointed out in [13], the algorithm appears to work well for small enough fields (or large enough velocities of neutrons), but may fail due to “phase wrapping” when the field is large enough. Our approach does not exhibit this problem.

1.3 The statistical observation scheme

Consider a simple surface $M$ as above with influx boundary $\partial_{+}SM$ , and a matrix valued map

$$\Phi:M\to\mathfrak{g}$$

and scattering data

$$C_{\Phi}:\partial_{+}SM\to G.$$

Here we take $G=SO(n)$ for some $n\geq 2$, with corresponding Lie algebra $\mathfrak{g}={\mathfrak{s}}{\mathfrak{o}}(n)$, the set of skew-symmetric matrices. Recall that in the key application to PNT from the previous subsection, $M$ is the flat disk and $n=3$. We could take $G=SU(n)$ and $\mathfrak{g}=\mathfrak{su}(n)$ just as well, but for the sake of conciseness prefer to avoid a complex-valued statistical noise model in what follows.

To describe the statistical observation setting, let $\lambda$ be the uniform distribution (volume element) on $\partial_{+}SM$ (see (5) below for a precise definition), and consider ‘design’ random variables

$$(X_{i},V_{i})_{i=1}^{N}\sim^{i.i.d.}\lambda\quad\text{on }\partial_{+}SM.$$

These draws represent a randomised choice of the geodesics for which experiments are performed – they have to be ‘equally spaced’ throughout ‘geodesic space’ $\partial_{+}SM$ in a statistical sense. For each resulting measurement of $C_{\Phi}((X_{i},V_{i}))$ the statistical observational error arising in the experiment is modelled by independent Gaussian matrix noise. More precisely let

$$(\varepsilon_{i,j,k}:1\leq j,k\leq n)_{i=1}^{N}\ \text{ be i.i.d. }N(0,\sigma^{2}),\quad\sigma>0,$$

random variables that are independent of the $(X_{i},V_{i})$ ’s, and let $\mathcal{E}_{i}=(\varepsilon_{i,j,k})$ be the random $n\times n$ noise matrix which adds a Gaussian noise variable in each matrix entry to $C_{\Phi}((X_{i},V_{i}))$ . Our observations then consist of the sequence of $N$ random $n\times n$ matrices

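$$Y_{i}=(Y_{i,j,k}),\qquad Y_{i,j,k}=C_{\Phi}((X_{i},V_{i}))_{j,k}+\varepsilon_{i,j,k},\qquad i=1,\dots,N;\ 1\leq j,k\leq n. \tag{3}$$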

The variables $Y_{i,j,k}$ are all independent, and even i.i.d. for $j,k$ fixed. Conditionally on $(X_{i},V_{i})=(x_{i},v_{i})$ they are multivariate normal random variables with diagonal covariance and (vectorised) mean $C_{\Phi}(x_{i},v_{i})_{j,k}$. Note that while $C_{\Phi}(x,v)$ takes values in $SO(n)$, the $Y_{i}$ are not in $SO(n)$ (or even $U(n)$) as we have not constrained $\mathcal{E}_{i}$ at all – this is in line with the physical experiments for PNT described in Section 1.2 where statistical errors arise from noisy measurements of each matrix entry of $C_{\Phi}(x,v)$. For the theory we will assume that the noise variance $\sigma^{2}>0$ is fixed and known – in practice it can be replaced by the estimated sample variance of the $Y_{i,j,k}$’s.

To fix notation: The joint law of the random variables $(Y_{i},(X_{i},V_{i}))_{i=1}^{N}$ in (3) on $(\mathbb{R}^{n\times n}\times\partial_{+}SM)^{N}$ will be denoted by $P_{\Phi}^{N}=\times_{i=1}^{N}P_{\Phi}^{i}$ , where we note $P_{\Phi}^{i}=P_{\Phi}^{1}$ for all $i$ . We also write $P^{N}_{\varepsilon}$ for the law of the $(\mathcal{E}_{i})_{i=1}^{N}$ ’s, $\lambda^{N}$ for the law of the $(X_{i},V_{i})_{i=1}^{N}$ and

$$D_{N}=\{Y_{1},\dots,Y_{N},(X_{1},V_{1}),\dots,(X_{N},V_{N})\}$$

for the full data vector. The corresponding expectation operators are obtained by replacing ‘ $P$ ’ by ‘ $E$ ’ in the preceding expressions. The dependence on $\sigma^{2}$ will be suppressed in the notation.
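The observation scheme of this subsection can be simulated in a few lines. The sketch below assumes the flat unit disk and $n=3$, and, to stay self-contained, uses a toy forward map: for a constant field $B=(0,0,b)$ the scattering data along a chord of length $T$ is the rotation about the third axis by the angle $bT$. The names `sample_design`, `toy_forward` and `simulate`, and the value of `b`, are hypothetical.

```python
import numpy as np

def sample_design(n_obs, rng):
    """Draw (X_i, V_i) i.i.d. from the uniform law on the influx boundary of the unit disk."""
    s = rng.uniform(0.0, 2.0 * np.pi, n_obs)                  # arclength on the boundary circle
    alpha = rng.uniform(-np.pi / 2.0, np.pi / 2.0, n_obs)     # angle from the inward normal
    x = np.stack([np.cos(s), np.sin(s)], axis=1)
    inward = -x                                               # inward unit normal at x
    tangent = np.stack([-np.sin(s), np.cos(s)], axis=1)
    v = np.cos(alpha)[:, None] * inward + np.sin(alpha)[:, None] * tangent
    return x, v

def toy_forward(x, v, b=0.7):
    """C_Phi(x, v) for the constant field B = (0, 0, b): rotation by b * (chord length)."""
    angle = b * (-2.0 * np.sum(x * v, axis=1))                # chord length is -2 <x, v>
    c, s = np.cos(angle), np.sin(angle)
    out = np.zeros((len(angle), 3, 3))
    out[:, 0, 0], out[:, 0, 1] = c, -s
    out[:, 1, 0], out[:, 1, 1] = s, c
    out[:, 2, 2] = 1.0
    return out

def simulate(n_obs, sigma, seed=0):
    """Return (Y, X, V): scattering matrices with i.i.d. N(0, sigma^2) noise in every entry."""
    rng = np.random.default_rng(seed)
    x, v = sample_design(n_obs, rng)
    y = toy_forward(x, v) + sigma * rng.standard_normal((n_obs, 3, 3))
    return y, x, v

y, x, v = simulate(n_obs=200, sigma=0.05)
```

Sampling the boundary position and the angle independently and uniformly matches the normalised area form $ds\,dv$ on $\partial_{+}SM$ for the disk. The noisy matrices in `y` are in general no longer orthogonal, as noted in the text.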

1.4 Some geometric background and basic notation

We conclude this section by introducing some more basic notation that will be used throughout.

Our background geometry is a simple surface with boundary $(M,g,\partial M)$ . By ’simple’, we mean (i) $M$ is non-trapping (in the sense that every maximal geodesic in $M$ has finite length), (ii) $M$ has no conjugate points and (iii) $\partial M$ is strictly convex (i.e. $\partial M$ has positive definite second fundamental form). We denote by $SM$ the unit tangent bundle of $M$ , namely

$$SM=\{(x,v)\in TM:\ g_{x}(v,v)=1\}.$$

Its boundary $\partial SM:=\{(x,v)\in SM:\;x\in\partial M\}$ can be split into ’influx’ and ’outflux’ boundary, depending on whether the tangent vector points inside or outside, namely we define, with $\nu_{x}$ the outer unit normal at $x\in\partial M$,

$$\partial_{\pm}SM:=\{(x,v)\in\partial SM:\ \pm\langle v,\nu_{x}\rangle_{g}\leq 0\}.$$

The manifolds $M$ , $\partial M$ , $SM$ and $\partial_{+}SM$ all carry natural volume elements, allowing us to define $L^{2}$ spaces below. Specifically, the Riemannian metric $g$ induces an area form $dx$ on $M$ and restricts to a metric on $\partial M$ . The unit sphere bundle $SM$ carries the volume element $d\Sigma^{3}=dx\,dv$ where $dv$ is the length element in the unit circle $S_{x}\subset T_{x}M$ . Finally the boundary $\partial SM$ of $SM$ carries the area form $d\Sigma^{2}=ds\ dv$ where $dv$ is as above and $ds$ is the arclength (w.r.t. the metric $g$ ) along the boundary. Its restriction to $\partial_{+}SM$ will be denoted by

$$\lambda\equiv\frac{1}{\text{Area}(\partial_{+}SM)}\,d\Sigma^{2}\big|_{\partial_{+}SM}. \tag{5}$$

The spaces ${\mathbb{C}}^{n}$ and ${\mathbb{C}}^{n\times n}$ will be equipped with the canonical Hermitian inner product $\langle\cdot,\cdot\rangle$ and induced norm $|\cdot|$ . For elements in ${\mathbb{C}}^{n\times n}$ , this corresponds to the Frobenius norm $\left|A\right|_{F}^{2}:=\text{tr}(A^{*}A)=\sum_{i,j=1}^{n}|A_{i,j}|^{2}$ , which is $U(n)$ -invariant in the sense that for any $U\in U(n)$ and $A$ arbitrary, $\left|AU\right|_{F}=\left|UA\right|_{F}=\left|A\right|_{F}$ .
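The $U(n)$-invariance of the Frobenius norm is easy to check numerically. The sketch below draws a random complex matrix and a random unitary matrix (obtained from a QR factorisation, an illustrative choice) and compares the norms; all names are hypothetical.

```python
import numpy as np

def frobenius(m):
    """Frobenius norm computed as sqrt(tr(A^* A))."""
    return np.sqrt(np.trace(m.conj().T @ m).real)

rng = np.random.default_rng(0)
n = 4
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# The Q factor of a QR factorisation of a complex matrix is unitary.
u, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

norm_a = frobenius(a)
norm_au = frobenius(a @ u)   # right multiplication by U
norm_ua = frobenius(u @ a)   # left multiplication by U
```

The three values agree up to rounding error, since $\text{tr}((UA)^{*}UA)=\text{tr}(A^{*}U^{*}UA)=\text{tr}(A^{*}A)$, and similarly for $AU$ by cyclicity of the trace.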

Given $(N,h)$ a $d$ -dimensional Riemannian manifold (either $M$ , $\partial M$ , $SM$ , $\partial_{+}SM$ , or $\partial SM$ as explained above), one may adapt the usual function spaces to ${\mathbb{C}}^{n}$ - or ${\mathbb{C}}^{n\times n}$ -valued functions as follows: $L^{2}(N,{\mathbb{C}}^{n\times n})$ , $L^{\infty}(N,{\mathbb{C}}^{n\times n})$ with norms

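$$\|U\|_{L^{2}}^{2}:=\int_{N}|U|_{F}^{2}\,d\text{Vol}_{h},\qquad\|U\|_{L^{\infty}}:=\sup_{y\in N}|U(y)|_{F}.$$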

One may differentiate functions using partial derivatives $\{\partial_{y_{j}}\}_{j=1}^{d}$ in coordinate charts, or equivalently, using $\{T_{j}\}_{j=1}^{d}$ a global basis of smooth vector fields on $N$ which pairwise commutes (it will be useful to adopt the latter viewpoint in later sections). Given a $d$ -index ${\boldsymbol{\alpha}}=(\alpha_{1},\dots,\alpha_{d})$ , one may define $|{\boldsymbol{\alpha}}|=\alpha_{1}+\dots+\alpha_{d}$ and $T^{{\boldsymbol{\alpha}}}=T_{1}^{\alpha_{1}}\cdots T_{d}^{\alpha_{d}}$ . The metric $h$ equips $N$ with a distance function $d_{h}(x,y)$ , and for $\beta\geq 0$ , we can thus define Hölder spaces $C^{\beta}(N,{\mathbb{C}}^{n\times n})$ with norm

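$$\|U\|_{C^{\beta}}=\sum_{|{\boldsymbol{\alpha}}|\leq\lfloor\beta\rfloor}\sup_{y\in N}|T^{{\boldsymbol{\alpha}}}U(y)|_{F}+\sum_{|{\boldsymbol{\alpha}}|=\lfloor\beta\rfloor}\sup_{x\neq y\in N}\frac{|T^{{\boldsymbol{\alpha}}}U(x)-T^{{\boldsymbol{\alpha}}}U(y)|_{F}}{d_{h}(x,y)^{\beta-\lfloor\beta\rfloor}},$$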

with the second term removed when $\beta$ is an integer. We will also use $L^{2}$ -based Sobolev spaces $H^{s}(N,{\mathbb{C}}^{n\times n})$ with norm

$$\|U\|_{H^{s}}^{2}=\sum_{|{\boldsymbol{\alpha}}|\leq s}\|T^{{\boldsymbol{\alpha}}}U\|_{L^{2}}^{2},$$

for $s\in\mathbb{N}$ , and defined by interpolation otherwise (see, e.g., [42, Ch. 4]).

As above, when clear from the context, the domain and/or codomain will be dropped from the notation. In the following sections, spaces of functions with codomain $SO(n)$ , $SU(n)$ or their Lie algebras will make use of the same topology of the corresponding spaces of ${\mathbb{C}}^{n\times n}$ -valued functions. The $c$ -subscript attached to a space of maps defined on $M$ denotes the linear subspace of those maps that vanish identically outside of a compact subset of the interior $M^{int}$ of $M$ .

2 Theoretical results for the deterministic inverse problem

When discrete measurements of the forward data $C_{\Phi}$ are corrupted by statistical noise, the injectivity result Theorem 1.1 is not useful to reconstruct $\Phi$ from the observations, and we will discuss in the next section how to develop statistical methods that consistently solve this statistical inverse problem. The proofs that substantiate these methods are based on quantitative versions of Theorem 1.1 – stability estimates – as well as continuity properties of the forward map, and we describe in this section the analytical results we obtain.

The results to follow hold when the codomain of the matrix fields is the largest of the three compact Lie groups introduced before Theorem 1.1, namely $U(n)$ (with Lie algebra ${\mathfrak{u}}(n)$ ), see Eq. (1).

Theorem 2.1.

Let $(M,g)$ be a simple surface. Given two matrix fields $\Phi$ and $\Psi$ in $C^{1}(M,{\mathfrak{u}}(n))$ there exists a constant $c(\Phi,\Psi)$ such that

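$$\|\Phi-\Psi\|_{L^{2}(M)}\leq c(\Phi,\Psi)\,\|C_{\Phi}C_{\Psi}^{-1}-\text{id}\|_{H^{1}(\partial_{+}SM)},$$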

where $c(\Phi,\Psi)$ is a continuous function of $\|\Phi\|_{C^{1}}\vee\|\Psi\|_{C^{1}}$ , explicitly

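$$c(\Phi,\Psi)=C_{1}\big(1+(\|\Phi\|_{C^{1}}\vee\|\Psi\|_{C^{1}})\big)\,e^{C_{2}(\|\Phi\|_{C^{1}}\vee\|\Psi\|_{C^{1}})}, \tag{6}$$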

and where the constants $C_{1},C_{2}$ only depend on $(M,g)$ .

The proof of Theorem 2.1 initially follows the approach for obtaining $L^{2}\to H^{1}$ stability estimates for the geodesic X-ray transform $I$ as presented in [40, Theorem 3.4.3]. Our starting point is the pseudo-linearisation formula

$$C_{\Phi}C_{\Psi}^{-1}=\text{id}+I_{\Theta(\Phi,\Psi)}(\Phi-\Psi)$$

where $I_{\Theta(\Phi,\Psi)}$ is a geodesic X-ray transform with suitable weights, see Lemma 5.5. To prove Theorem 2.1 it suffices to show that

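$$\|\Phi-\Psi\|_{L^{2}(M)}\leq c(\Phi,\Psi)\,\|I_{\Theta(\Phi,\Psi)}(\Psi-\Phi)\|_{H^{1}(\partial_{+}SM)}.$$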

To this end, we use the energy identity (Pestov Identity) developed in [35] for matrix weights arising for connections and matrix fields. The presence of the weights produces additional terms in the identity that need to be controlled to obtain the estimate above and this is where most of the work lies. The main idea for controlling them comes from [35] where a connection with the right curvature is artificially introduced to control these terms. The connection is later removed by using (scalar) holomorphic integrating factors whose existence is guaranteed by the microlocal properties of the normal operator associated to the geodesic X-ray transform acting on functions. Taming these integrating factors has a cost which is reflected in the constant $c(\Phi,\Psi)$ given in (6).

For the proof of Theorem 3.2 below we also require ‘forward’ estimates in Sobolev and Hölder scales. These are less sophisticated in nature than the stability estimate above, and hold under less restrictive assumptions. Recall that $(M,g)$ is said to be non-trapping if there is no geodesic with infinite length (any simple manifold is non-trapping).

Theorem 2.2.

Let $(M,g)$ be a non-trapping surface with strictly convex boundary. For any integer $k\geq 0$ and for every $\Phi,\Psi\in C^{k}(M,\mathfrak{u}(n))$ , the following continuity estimates hold:

[TABLE]

where by $\lesssim$ we mean that the inequality holds with some constant that only depends on $M$ , $g$ and $k$ .

In fact in the proof of Theorem 3.2 we shall use instead of Theorem 2.1 the following corollary of the previous two results:

Corollary 2.3.

Under the same hypotheses as in Theorem 2.1, and with $c(\Phi,\Psi)$ as in (6), we have

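$$\|\Phi-\Psi\|_{L^{2}(M)}\leq C^{\prime}c(\Phi,\Psi)\,(1+\|\Psi\|_{C^{1}})\,\|C_{\Phi}-C_{\Psi}\|_{H^{1}(\partial_{+}SM)},$$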

where $C^{\prime}$ is independent of $\Phi$ or $\Psi$ .

3 Bayesian inversion of non-Abelian $X$ -ray transforms

3.1 Main results

The main goal of this section is to introduce a method to infer the matrix field $\Phi\in C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ from discrete observations $D_{N}$ of the scattering data $C_{\Phi}$ described in Section 1.3. We follow the general paradigm of Bayesian inverse problems advocated by A. Stuart [41, 10] which is also related to the paradigm of Bayesian numerical analysis [14, 3] in the noiseless case ( $\sigma=0$ ). The idea is to start from a Gaussian process prior $\Pi$ for the parameter $\Phi$ and to use Bayes’ theorem to infer the best posterior guess for $\Phi$ given data $D_{N}$ .

We will state a theorem that shows that the posterior mean fields $\bar{\Phi}_{N}=E^{\Pi}[\Phi|D_{N}]$ corresponding to a flexible class of Lie-algebra valued Gaussian process priors $\Pi$ for $\Phi$ consistently recover the ‘true’ $\Phi_{0}$ in the frequentist large sample limit as $N\to\infty$ , when noisy experiments have been performed under $P^{N}_{\Phi_{0}}$ in the model (3). In fact we will provide a stochastic convergence rate to zero of the recovery error that is algebraic in inverse sample size $1/N$ .

The proof of Theorem 3.2 below provides a template to establish rigorous statistical guarantees for the Bayesian approach to other non-linear inverse problems as well. See Section 5.4 and Remark 3.6 for more discussion.

We emphasise that obtaining probabilistic consistency under $P_{\Phi_{0}}^{N}$ entails approximate uniformity of the design $(X_{i},V_{i})$ and rules out ‘adversarial’ designs. Fixed (non-random) design $(x_{i},v_{i})$ that is sufficiently ‘equally spaced’ throughout $\partial_{+}SM$ could be considered as well in the theory that follows, either via appealing to asymptotic statistical equivalence results in nonparametric regression [37] or by tracking the numerical discretisation error explicitly through all the proofs that follow. For the purposes of the present paper we opt for the random design setting as it allows for a cleaner, unified probabilistic treatment of the measurement process.

To introduce the Bayesian approach more concisely, consider a prior $\Pi$ for a vector field $(B_{1},\dots,B_{\bar{n}})$ by prescribing a Borel probability measure on the space $\times_{j=1}^{\bar{n}}C(M)$ where

$$\bar{n}=\frac{n(n-1)}{2}=\dim({\mathfrak{s}}{\mathfrak{o}}(n)).$$

The natural isomorphism between $\times_{j=1}^{\bar{n}}C(M)$ and the space $C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ of continuous functions from $M$ to ${\mathfrak{s}}{\mathfrak{o}}(n)$ in turn generates a prior $\Pi$ for $\Phi$ by forming a ${\mathfrak{s}}{\mathfrak{o}}(n)$ -valued field from the $B_{i}$ ’s. For instance in the case $n=3$ so that also $\bar{n}=3$ , relevant in PNT, we construct $\Pi$ from

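$$\Phi(x)=\begin{pmatrix}0&-B_{3}(x)&B_{2}(x)\\ B_{3}(x)&0&-B_{1}(x)\\ -B_{2}(x)&B_{1}(x)&0\end{pmatrix},\qquad x\in M. \tag{10}$$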
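For $n=3$ the identification between triples $(B_{1},B_{2},B_{3})$ and ${\mathfrak{s}}{\mathfrak{o}}(3)$-valued fields can be sketched pointwise as follows. The function names are hypothetical; the entry pattern is the skew-symmetric one used in Section 1.2 for the magnetic field.

```python
import numpy as np

def lie_algebra_dim(n):
    """Dimension n(n-1)/2 of so(n), i.e. the number of scalar fields B_j needed."""
    return n * (n - 1) // 2

def field_to_so3(b1, b2, b3):
    """Assemble the skew-symmetric matrix Phi from the coordinates (B1, B2, B3) at one point."""
    return np.array([[0.0, -b3, b2],
                     [b3, 0.0, -b1],
                     [-b2, b1, 0.0]])

b = np.array([0.3, -1.2, 0.5])      # values of (B1, B2, B3) at some point x (made up)
phi = field_to_so3(*b)
w = np.array([1.0, 2.0, 3.0])
rotated = phi @ w                   # equals the cross product B x w
```

With this convention $\Phi(x)w=B(x)\times w$ for every $w\in\mathbb{R}^{3}$, so the matrix ODE for the scattering data describes a vector precessing about the local field direction.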

Then we make the Bayesian model assumption that

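$$(Y_{i},(X_{i},V_{i}))_{i=1}^{N}\,|\,\Phi\sim P_{\Phi}^{N}\quad\text{on }(\mathbb{R}^{n\times n}\times\partial_{+}SM)^{N}$$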

which by Bayes’ rule generates a conditional posterior distribution of $\Phi|(Y_{i},(X_{i},V_{i}))_{i=1}^{N}$ on $C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ – it will be denoted by $\Pi(\cdot|(Y_{i},(X_{i},V_{i}))_{i=1}^{N})\equiv\Pi(\cdot|D_{N})$ . The posterior distribution arises from a dominated family of probability measures (see (58) below) and is hence given by

[TABLE]

for any Borel set $A$ in $C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ . Here

[TABLE]

is, up to additive constants, the log-likelihood function of the observations.
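Under the Gaussian noise model (3), each entry $Y_{i,j,k}$ is, conditionally on the design, normal with mean $C_{\Phi}((X_{i},V_{i}))_{j,k}$ and variance $\sigma^{2}$, so up to additive constants the log-likelihood is $-\frac{1}{2\sigma^{2}}$ times the sum of squared Frobenius residuals. A minimal sketch of this computation follows; the function name and the argument `forward_values` (an array holding the matrices $C_{\Phi}((X_{i},V_{i}))$ for a candidate $\Phi$) are hypothetical.

```python
import numpy as np

def log_likelihood(y, forward_values, sigma):
    """Gaussian log-likelihood up to an additive constant.

    y:              array of shape (N, n, n) with the observed matrices Y_i.
    forward_values: array of shape (N, n, n) with C_Phi(X_i, V_i) for a candidate Phi.
    sigma:          noise standard deviation (assumed known and positive).
    """
    residual = np.asarray(y, dtype=float) - np.asarray(forward_values, dtype=float)
    # Sum over i of the squared Frobenius norms |Y_i - C_Phi(X_i, V_i)|_F^2.
    return -0.5 * np.sum(residual ** 2) / sigma ** 2

# Tiny example with two 3x3 observations.
y_obs = np.zeros((2, 3, 3))
candidate = np.ones((2, 3, 3))
value = log_likelihood(y_obs, candidate, sigma=1.0)
```

Evaluating this function requires one forward solve per design point, which is the dominant cost in any sampling scheme that compares candidate fields through their likelihood.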

While what precedes was not specific to the choice of a particular prior, the main theorem to follow will hold for priors arising from certain ${\mathfrak{s}}{\mathfrak{o}}(n)$ -valued Gaussian processes. These will be constructed from a Gaussian base prior $\Pi^{\prime}$ from which the coordinates $B_{j}$ of $\times_{j=1}^{\bar{n}}C(M)$ will be drawn independently. In fact we will require draws from $\Pi^{\prime}$ to have $\beta$ -Hölder continuous sample paths on $M$ almost surely. We refer, e.g., to [19, Sections 2.1 and 2.6] for the basic definitions of Gaussian measures and processes and their reproducing kernel Hilbert spaces (RKHS).

*Condition 3.1**.*

For $\beta>0$ and $\alpha>\beta+1$ , let $\Pi^{\prime}$ be a centred Gaussian Borel probability measure on the Banach space $C(M)$ that is supported in a separable (measurable) linear subspace of $C^{\beta}(M)$ , and assume its RKHS $(\mathcal{H},\|\cdot\|_{\mathcal{H}})$ is continuously imbedded into the Sobolev space $H^{\alpha}(M)$ .

See Remark 3.4 for concrete examples and constructions of such Gaussian process priors with ‘maximal choice’ $\mathcal{H}=H^{\alpha}(M)$ and arbitrary $\alpha>\beta+1$ .

Now given a random draw $f^{\prime}\sim\Pi^{\prime}$ we define a new random function

[TABLE]

and denote its law in $C(M)$ by $\Pi_{B}=\Pi_{B,N}$ . Then let $B_{1},\dots,B_{\bar{n}}$ be random functions on $M$ drawn as i.i.d. copies from $\Pi_{B}$ , and let the prior $\Pi=\times_{j=1}^{\bar{n}}\Pi_{B}$ for $\Phi$ be the resulting centred Gaussian product probability measure in the space $C(M,{\mathfrak{s}}{\mathfrak{o}}(n))\simeq\times_{j=1}^{\bar{n}}C(M)$ (see (10) for $n=3$ ). Shrinking the prior towards the origin in a $N$ -dependent way as in (13) is crucial in our proofs, see Remark 3.5 for discussion.

The following theorem gives a bound for the convergence rate of the posterior mean

$$\bar{\Phi}_{N}=E^{\Pi}[\Phi|(Y_{i},(X_{i},V_{i}))_{i=1}^{N}]\qquad(14)$$

towards the true field $\Phi_{0}$ in $L^{2}(M)$ -loss, under the law $P_{\Phi_{0}}^{N}$ of the observations. Note that this mean (expected value) is understood in the usual sense of Bochner integrals and hence $\bar{\Phi}$ takes values in $C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ – for fixed data vector $Y_{i},(X_{i},V_{i})$ and since for $C_{\Phi}\in SO(n)$ the norms $\|C_{\Phi}\|_{L^{\infty}}$ are bounded by a fixed constant, this expected value exists almost surely by (11) and a basic application of Fernique’s theorem (see [19, Exercise 2.1.5]). Let us say $\Phi\in\mathcal{H}$ if all matrix entries of $\Phi$ are contained in $\mathcal{H}$ .

Theorem 3.2.

Suppose the Gaussian prior $\Pi$ for $\Phi$ arises as after (13) with base prior $\Pi^{\prime}$ satisfying Condition 3.1 for $\alpha>\beta+1,\beta>2$ . Let $\bar{\Phi}_{N}$ be the mean (14) of the posterior distribution $\Pi(\cdot|(Y_{i},(X_{i},V_{i}))_{i=1}^{N})$ arising from observations (3). Assume $\Phi_{0}\in C^{\alpha}(M,{\mathfrak{s}}{\mathfrak{o}}(n))\cap\mathcal{H}$ . Then we have, for some $\eta>0$

[TABLE]

The proof is given in Section 5.4. We note that the constraint $\beta>2$ (and hence $\alpha>3$ ) could be relaxed to $\beta>1$ (and hence $\alpha>2$ ) at the expense of more technical proofs (see Remark 5.20). We further remark that in the proof we establish in particular that the random posterior measure $\Pi(\cdot|(Y_{i},(X_{i},V_{i}))_{i=1}^{N})$ on $C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ concentrates with probability approaching one in a $N^{-\eta}$ -diameter $L^{2}(M)$ -ball centred at $\Phi_{0}$ , see Theorem 5.19.

3.2 Remarks and discussion

*Remark 3.3**.*

[The exponent $\eta$ .] In the proof (see (81)) we show that

[TABLE]

is permitted in the previous theorem. If $\Phi_{0}\in C^{\infty}(M)=\cap_{\alpha>0}H^{\alpha}(M)$ and if we take priors $\Pi$ which verify Condition 3.1 for large enough $\alpha,\beta$ and $\mathcal{H}=H^{\alpha}(M)$ (possible by Remark 3.4), then we can make $\eta$ as close to $1/2$ as desired, and it is easy to show that $\eta=1/2$ cannot be improved upon by any algorithm. So at least for smooth $\Phi_{0}$ the recovery guarantee from Theorem 3.2 is (near-) optimal. In the ‘low regularity case’ where $\alpha$ is not large, our bound for $\eta$ may not be optimal. A conjecture for the optimal value for $\eta$ can be obtained from the much simpler linear and Abelian case ( $n=1$ ) corresponding to the classical Radon transform, which is treated in [32, Example 2.5], where the exponent $\eta=\alpha/(2\alpha+3)$ is attained, which can be shown to be optimal in this special case.

*Remark 3.4**.*

[Construction of Gaussian priors.] We describe here some Gaussian process priors verifying Condition 3.1 with $\mathcal{H}=H^{\alpha}(M)$ .

As a first basic example consider the case where $M$ equals the unit disk $D=\{(x_{1},x_{2})\in\mathbb{R}^{2}:x_{1}^{2}+x_{2}^{2}\leq 1\}$ in $\mathbb{R}^{2}$ with ‘flat’ (Euclidean) geometry, relevant in PNT. For arbitrary $\alpha>0$ we can then take for $\Pi^{\prime}$ the restriction to $D$ of a stationary Gaussian process on $\mathbb{R}^{2}$ with appropriate (Whittle-) Matérn covariance function $k_{\alpha}$ (see [17, p.313] and Section 4 below). This gives a Gaussian prior on $C(D)$ with RKHS $\mathcal{H}$ equal to the space of restrictions to $D$ of elements of $H^{\alpha}(\mathbb{R}^{2})$ (using Ex.2.6.5 in [19]). This space is well known (e.g., [42], Ch.4) to coincide with $H^{\alpha}(D)$ , and the sample paths of this process lie in the separable subspace $C^{\beta_{0}}(D)$ of $C^{\beta}(D)$ for any $\beta<\beta_{0}<\alpha-1$ , see [17, p.575f] for a proof.

The preceding construction works for any smooth bounded domain $D$ in $\mathbb{R}^{2}$ . In particular a simple surface $M$ is diffeomorphic to a disc and the Sobolev spaces $H^{\alpha}(D)$ and $H^{\alpha}(M)$ coincide with equivalent norms – the Matérn prior can thus be used even when $M$ equals $D$ equipped with a different Riemannian metric. Alternatively one can embed $M$ isometrically into a larger closed compact (boundary-less) manifold $S$ and use the orthonormal basis of eigenfunctions $\{e_{k}\}$ of the Laplace-Beltrami operator on $S$ to generate Gaussian random series $f_{S}(x)=\sum_{k}\sigma_{k}g_{k}e_{k}(x)$ , $g_{k}\sim^{i.i.d.}N(0,1),~{}~{}x\in S,$ which after restriction to $M$ and for suitable choice of $\sigma_{k}>0$ , generate Gaussian priors $\Pi$ with any prescribed Sobolev space $H^{\alpha}(M)$ as RKHS.

*Remark 3.5**.*

[Rescaled Gaussian Priors.] While the use of Gaussian process techniques [4, 16, 27] in the proof of Theorem 3.2 is inspired by previous work in [44, 43] and also [18] for ‘direct’ problems, the inverse setting poses several challenges, particularly in the non-linear case. In our proofs we show how these challenges can be overcome by shrinking common Gaussian process priors towards the origin as in (13) – the shrinkage enforces the necessary additional ‘a-priori’ regularisation of the posterior distribution to permit the use of our stability estimates. While similar re-scaled priors have been shown to work in some ‘direct’ settings before (they appear as special cases of the rescaled priors studied in [43], see their Theorem 3.2), in our setting they play a crucial role: Without re-scaling the exponential growth in the $C^{1}$ -norms of $\Phi$ of the constant (6) would render our stability estimate useless in the proofs.

*Remark 3.6**.*

[Related literature on Bayesian non-linear inverse problems.] The study of statistical guarantees for the Bayesian approach to non-linear inverse problems has seen a recent surge of interest. In the references [45, 31, 30] non-linear inverse problems of elliptic and parabolic type are studied. The results therein however only hold for specific ‘uniformly bounded wavelet’ type priors – while these are useful to develop a first theoretical understanding of Bayesian inversion algorithms, they posit very strong a priori assumptions on the parameter of interest and the efficient computability of the resulting posterior distribution is also unclear.

The recent reference [32] obtains convergence rate results for optimisation based MAP-estimates (see Section 4.2 for a brief discussion of those) in a general class of non-linear inverse problems. For non-linear forward maps as the ones relevant here, these MAP-estimates can be difficult to compute, and at any rate may behave quite differently from the posterior mean: The algorithm $E^{\Pi}[\Phi|(Y_{i},(X_{i},V_{i}))_{i=1}^{N}]$ studied here is a Bochner integral with respect to an infinite-dimensional and non-Gaussian posterior distribution and variational ideas from optimisation cannot be used directly in its analysis. In the proof of Theorem 3.2 we develop new techniques that allow one to prove convergence rates for such algorithms – see Section 5.4 for a discussion of the key ideas which are relevant in other settings, too. Indeed, the very recent references [1, 20] have already succeeded in adapting our proof template to other nonlinear inverse problems. For instance [1] study statistical versions of a conceptually related boundary value problem arising with electrical impedance tomography (‘Calderón problems’). Our results imply that statistical inversion of non-Abelian $X$ -ray transforms (for ‘smooth parameters’ $\Phi$ ) admits better (i.e., polynomial) convergence rates than the necessarily logarithmic (in inverse noise level) recovery guarantees derived in [1] for the Calderón problem (with smooth conductivities).

*Remark 3.7**.*

[Towards Uncertainty Quantification.] Theorem 3.2 also serves as a starting point to prove more refined Bernstein-von Mises theorems that entail that the posterior distribution is approximated in a suitable infinite-dimensional space by a canonical Gaussian measure (cf. [5, 6]). For a non-linear elliptic inverse problem a first result of this kind was recently proved in [30], and for the linearisation of the non-linear problem considered here, such results were obtained in [29]. In principle, joining the ideas of [30, 29] with the techniques of the present paper, one can conjecture that Bernstein-von Mises theorems should also hold true for the case of non-Abelian $X$ -ray transforms – this is the subject of ongoing research.

4 Implementation of the algorithm

In this section, we present some numerical reconstructions of an $\mathfrak{su}(2)$ -valued matrix field $\Phi$ from its noisy scattering data $C_{\Phi}\in SU(2)$ . In this case, $\Phi$ is generated by three real-valued components $B_{1},B_{2},B_{3}$ , through the relation $\Phi=B_{1}\ \sigma_{1}+B_{2}\ \sigma_{2}+B_{3}\ \sigma_{3}$ , where we have defined the basis of $\mathfrak{su}(2)$

[TABLE]

with structure equations $[\sigma_{1},\sigma_{2}]=\sigma_{3}$ , $[\sigma_{2},\sigma_{3}]=\sigma_{1}$ and $[\sigma_{3},\sigma_{1}]=\sigma_{2}$ . The approach presented easily adapts to any ${\mathfrak{s}}{\mathfrak{o}}(n)$ -, $\mathfrak{su}(n)$ - or ${\mathfrak{u}}(n)$ -valued field (including the ${\mathfrak{s}}{\mathfrak{o}}(3)$ -valued case of polarimetric neutron tomography, a close cousin of the present case), with some minor Lie group specific modifications to be made for an accurate computation of forward data.

4.1 Numerical domain and forward operator

The computational domain is an unstructured triangular mesh discretising the unit disk $M=\{x^{2}+y^{2}\leq 1\}$ made of $N_{v}$ vertices, and functions on it are piecewise linear, uniquely determined by their values at the vertices. In particular $\Phi$ is regarded as an element of $\mathbb{R}^{3N_{v}}$ .

The metric is isotropic, written as $g=e^{2\bar{\lambda}(x,y)}\text{id}$ , with scalar function $\bar{\lambda}$ given by

[TABLE]

Such an example can be seen to be non-trapping, have no conjugate points and a strictly convex boundary, i.e. $(M,g)$ is simple. The case of Euclidean geometry would correspond to $\bar{\lambda}\equiv 0$ . Geodesic (data) space, modelled as $\partial_{+}SM$ , is parameterised in fan-beam coordinates $(\beta,\alpha)\in(0,2\pi)\times(-\pi/2,\pi/2)$ (with uniform probability measure $d\lambda=d\beta\ d\alpha/(2\pi^{2})$ ).

Below we will draw $N$ geodesics uniformly at random, characterised by $N$ initial conditions $(\alpha_{i},\beta_{i})\in\partial_{+}SM$ , $1\leq i\leq N$ , and our statistical algorithm will require numerical evaluation of the forward data $C_{\Phi}(\alpha_{i},\beta_{i})$ which we now describe: Out of each data point $(\alpha_{i},\beta_{i})$ , we first compute a geodesic using a forward scheme with stepsize $h$ to solve a discretisation of the system

[TABLE]

with initial condition $x(0)=\cos\beta_{i}$ , $y(0)=\sin\beta_{i}$ and $\theta(0)=\beta_{i}+\pi+\alpha_{i}$ , until the geodesic exits the domain. This produces a discretised geodesic

[TABLE]
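The sampling of initial conditions and the geodesic computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses the standard unit-speed geodesic system for a conformally Euclidean metric $g=e^{2\lambda}\text{id}$ , namely $\dot{x}=e^{-\lambda}\cos\theta$ , $\dot{y}=e^{-\lambda}\sin\theta$ , $\dot{\theta}=e^{-\lambda}(-\partial_{x}\lambda\sin\theta+\partial_{y}\lambda\cos\theta)$ , which we assume is the system discretised in the text, together with a plain forward Euler step. The names `sample_fan_beam` and `trace_geodesic`, and the callables `lam` and `grad_lam` supplying $\bar{\lambda}$ and its gradient, are hypothetical.

```python
import numpy as np

def sample_fan_beam(n, rng):
    """Draw n fan-beam coordinates uniformly on (0, 2*pi) x (-pi/2, pi/2)."""
    beta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    alpha = rng.uniform(-0.5 * np.pi, 0.5 * np.pi, size=n)
    return alpha, beta

def trace_geodesic(alpha, beta, lam, grad_lam, h=0.01, max_steps=100_000):
    """Forward-Euler geodesic in the unit disk for the metric e^{2*lam} * id.

    Starts at (cos beta, sin beta) with direction angle beta + pi + alpha and
    stops at the last point inside the disk. Returns rows (x, y, theta).
    """
    x, y = np.cos(beta), np.sin(beta)
    theta = beta + np.pi + alpha
    path = [(x, y, theta)]
    for _ in range(max_steps):
        scale = np.exp(-lam(x, y))      # Euclidean speed of a unit-speed geodesic
        lx, ly = grad_lam(x, y)
        x_new = x + h * scale * np.cos(theta)
        y_new = y + h * scale * np.sin(theta)
        theta_new = theta + h * scale * (-lx * np.sin(theta) + ly * np.cos(theta))
        if x_new**2 + y_new**2 > 1.0:   # the next point would leave the domain
            break
        x, y, theta = x_new, y_new, theta_new
        path.append((x, y, theta))
    return np.array(path)

# Euclidean sanity check (lambda = 0): geodesics are chords of length 2*cos(alpha).
path = trace_geodesic(0.3, 1.0, lambda x, y: 0.0, lambda x, y: (0.0, 0.0), h=0.01)
```

In the Euclidean case the direction angle stays constant and the traced length agrees with the chord length up to the stepsize, which gives a quick check of the sign conventions.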

Once such a geodesic is computed, we must then discretise the matrix ODE

[TABLE]

(The problem here is forward in time unlike that given in the introduction, though since $\Phi$ is ${\mathfrak{u}}(n)$ -valued, this amounts to computing the conjugate transpose of $C_{\Phi}$ , which leads to the same problem.)

To discretise the above ODE, we denote $U^{(i,j)}:=U(\gamma_{i}(t_{j}),\dot{\gamma}_{i}(t_{j}))$ and implement the scheme

[TABLE]

where we have defined $\Phi^{(i,j-1)}=\Phi(x_{i}(t_{j-1}),y_{i}(t_{j-1}))$ . In fact the code implements a predictor-corrector variant of this scheme for improved accuracy on the computation of the exponentials.

The use of matrix exponentials in (15) (compared to standard forward-marching schemes) ensures that the matrix solution $U$ numerically remains in $SU(2)$ , and the computation of these exponentials can be done via an explicit formula, namely: for $A=a\ \sigma_{1}+b\ \sigma_{2}+c\ \sigma_{3}$ and denoting $|a|:=\sqrt{a^{2}+b^{2}+c^{2}}$ , we have for $l\in\mathbb{R}$

[TABLE]

(Note that the formula above would need to be adapted if a Lie algebra $\mathfrak{g}$ different from $\mathfrak{su}(2)$ is of interest.) The evaluation of $\Phi^{(i,j-1)}$ is done by barycentric combination of the values of $\Phi$ at the three vertices of the triangle containing $(x_{i}(t_{j-1}),y_{i}(t_{j-1}))$ .
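The closed-form exponential and the resulting marching scheme can be sketched in a few lines. The block below is an illustration under assumptions: it takes $\sigma_{j}=-\tfrac{i}{2}\tau_{j}$ with $\tau_{j}$ the Pauli matrices, which is one basis satisfying the structure equations stated above (the basis displayed in the text may differ by a relabelling), and it uses the first-order update $U^{(i,j)}=\exp(-h\Phi^{(i,j-1)})U^{(i,j-1)}$ , omitting the predictor-corrector refinement. With this basis $A^{2}=-\tfrac{|a|^{2}}{4}\text{id}$ , so that $\exp(lA)=\cos(l|a|/2)\,\text{id}+\tfrac{2}{|a|}\sin(l|a|/2)\,A$ . The function names are hypothetical.

```python
import numpy as np

# Assumed basis of su(2): -i/2 times the Pauli matrices, so that
# [s1, s2] = s3, [s2, s3] = s1 and [s3, s1] = s2.
SIGMA = -0.5j * np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def exp_su2(coeffs, l=1.0):
    """exp(l * A) for A = a*s1 + b*s2 + c*s3, using A @ A = -(|a|^2 / 4) * id."""
    coeffs = np.asarray(coeffs, dtype=float)
    norm = np.linalg.norm(coeffs)
    A = np.tensordot(coeffs, SIGMA, axes=1)
    if norm < 1e-14:
        return np.eye(2, dtype=complex) + l * A   # first-order fallback near A = 0
    half = 0.5 * l * norm
    return np.cos(half) * np.eye(2) + (2.0 / norm) * np.sin(half) * A

def scattering_along_path(B_values, h):
    """March U_j = exp(-h * Phi_{j-1}) @ U_{j-1} from U_0 = id and return the final U.

    B_values has shape (J, 3): the components (B1, B2, B3) of Phi sampled at
    the first J points of a discretised geodesic.
    """
    U = np.eye(2, dtype=complex)
    for b in B_values:
        U = exp_su2(b, -h) @ U
    return U

# Example: random components along a path of 50 steps; the result stays in SU(2).
U = scattering_along_path(np.random.default_rng(0).normal(size=(50, 3)), 0.01)
```

Because every factor is an exact element of $SU(2)$ , the product is unitary with unit determinant up to rounding error, whatever the stepsize.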

After implementing (15), the scattering data $C_{\Phi}(\gamma_{i})$ is nothing but $U^{(i,J_{i})}$ (in fact, the other values $U^{(i,j)}$ for $j<J_{i}$ are not kept in memory after computation). The magnetic field $\Phi$ we will use in the experiments below as well as its noiseless scattering data $C_{\Phi}$ are visualised in Fig. 2.

As we will use Markov chain Monte Carlo (MCMC) in the following section, let us mention that once the mesh is fixed, some computations are done prior to the MCMC, namely, all geodesics as well as the triangle indices and barycentric weights along them.
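The barycentric evaluation of a piecewise linear field, and the reason its weights can be precomputed, can be illustrated as follows. This is a minimal sketch with hypothetical function names; locating the triangle that contains a given point is assumed to be done elsewhere.

```python
import numpy as np

def barycentric_weights(p, tri):
    """Barycentric coordinates of the 2D point p in the triangle with vertex rows tri."""
    a, b, c = np.asarray(tri, dtype=float)
    T = np.column_stack((b - a, c - a))   # 2x2 matrix whose columns are edge vectors
    w1, w2 = np.linalg.solve(T, np.asarray(p, dtype=float) - a)
    return np.array([1.0 - w1 - w2, w1, w2])

def interpolate(weights, vertex_ids, nodal_values):
    """Piecewise linear value: weighted combination of the values at the three vertices."""
    return weights @ nodal_values[vertex_ids]

# The weights depend only on the mesh and the geodesic point, not on the field,
# so they can be stored once and reused for every field visited by the chain.
w = barycentric_weights((0.25, 0.25), [[0, 0], [1, 0], [0, 1]])
value = interpolate(w, np.array([0, 1, 2]), np.array([1.0, 3.0, 4.0]))
```

Here the nodal values are those of the linear function $1+2x+3y$ at the three vertices, and the interpolated value reproduces it exactly at the query point.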

4.2 Statistical estimation through MCMC

Given data as in (3), a common approach to inverse problems would be to compute a Tikhonov regulariser which minimises a penalised least squares fit functional (with, e.g., Sobolev-norm penalty)

[TABLE]

over the space of all matrix fields $\Phi:M\to\mathfrak{g}$ where $\mathfrak{g}$ is the Lie algebra describing the constraint on the co-domain of $\Phi$ . The map $Q_{N}$ is not convex, and efficient computation of the global minimiser may be challenging. One approach would be to use a gradient based iterative scheme [25] but the algorithmic stability of these (or other variational) methods is unclear in the setting considered here.

The optimiser of the functional (16) can be shown to correspond to a posterior mode, or ‘maximum a posteriori estimate (MAP)’, of a Gaussian process prior $\Pi$ on $C(M,\mathfrak{g})$ with RKHS equal to $H^{\alpha}$ (see [9] for a general result of this kind). Instead of computing that maximiser, one may compute other posterior characteristics such as the posterior mean (average) $E^{\Pi}[\Phi|D_{N}]=E^{\Pi}[\Phi|(Y_{i},(X_{i},V_{i}))_{i=1}^{N}]$ , which in our non-linear setting is different from the MAP estimate.

For Gaussian priors, MCMC algorithms such as the preconditioned Crank-Nicolson (pCN) method (see [7]) are available to sample from the posterior distribution. To introduce the algorithm, note that as in (12), the log-likelihood function given the data $(Y_{i},(X_{i},V_{i}))_{i=1}^{N}$ equals, up to additive constants,

[TABLE]

One then approximates the posterior mean $E^{\Pi}[\Phi|(Y_{i},(X_{i},V_{i}))_{i=1}^{N}]$ by a Monte Carlo average $\widehat{\Phi}=\frac{1}{N_{s}}\sum_{n=0}^{N_{s}}\Phi_{n}$ of a Markov chain $(\Phi_{n})$ of length $N_{s}$ as follows:

Let $\Pi$ be a Gaussian prior for $\Phi$ ; initialise $\Phi_{n}=0$ for $n=0$ , then repeat:

1. Draw $\Psi\sim\Pi$ and for $\delta>0$ define the proposal $p_{\Phi_{n}}:=\sqrt{1-2\delta}\ \Phi_{n}+\sqrt{2\delta}\ \Psi$ .

2. Set

[TABLE]

The algorithm is terminated at $n=N_{s}$ and requires evaluation of $\ell(\Phi_{n})$ and thus of the scattering data $C_{\Phi_{n}}(X_{i},V_{i})$ for every $\Phi_{n}$ and $(X_{i},V_{i})$ . For $\mathfrak{g}=\mathfrak{su}(2)$ relevant in the simulations that follow, this can be done as described in Section 4.1.
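The iteration above can be sketched generically as follows. This is an illustration under assumptions rather than the code used for the experiments: the accept/reject step is taken to be the standard pCN rule, which accepts the proposal with probability $\min\{1,\exp(\ell(p_{\Phi_{n}})-\ell(\Phi_{n}))\}$ and otherwise keeps $\Phi_{n}$ , and the field is represented by a plain vector of length `dim`. The names `pcn_posterior_mean`, `loglik` and `sample_prior` are hypothetical.

```python
import numpy as np

def pcn_posterior_mean(loglik, sample_prior, dim, delta, n_steps, rng):
    """Average of a pCN chain started at 0; also returns the acceptance rate.

    loglik(phi) -> float is the log-likelihood (up to additive constants);
    sample_prior(rng) draws from the centred Gaussian prior (array of length dim).
    """
    phi = np.zeros(dim)
    ll = loglik(phi)
    running_sum = phi.copy()        # the initial state is part of the average
    accepted = 0
    for _ in range(n_steps):
        psi = sample_prior(rng)
        proposal = np.sqrt(1.0 - 2.0 * delta) * phi + np.sqrt(2.0 * delta) * psi
        ll_prop = loglik(proposal)
        # Accept with probability min(1, exp(ll_prop - ll)); the prior does not
        # enter the ratio because the proposal is reversible for the prior.
        if np.log(rng.random()) < ll_prop - ll:
            phi, ll = proposal, ll_prop
            accepted += 1
        running_sum += phi
    return running_sum / (n_steps + 1), accepted / n_steps

# Toy usage: with a constant log-likelihood every proposal is accepted.
rng = np.random.default_rng(0)
mean, rate = pcn_posterior_mean(lambda p: 0.0, lambda r: r.normal(size=4), 4, 0.1, 200, rng)
```

In the setting of this section, `loglik` would evaluate the forward data for all $N$ geodesics, which is the dominant cost of each iteration; the returned acceptance rate is the quantity one monitors when tuning $\delta$ .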

The invariant measure of the Markov chain $\{\Phi_{n}\}$ equals the posterior distribution $\Pi(\cdot|D_{N})$ , and under certain conditions that are compatible with our setting, [22] derived dimension-free spectral gaps which imply that the distribution of $\Phi_{n}$ mixes rapidly towards $\Pi(\cdot|D_{N})$ . The approximation of $E^{\Pi}[\Phi|D_{N}]$ by $\widehat{\Phi}=\frac{1}{N_{s}}\sum_{n=0}^{N_{s}}\Phi_{n}$ can thus be expected to compare to the one of the standard central limit theorem, with corresponding non-asymptotic error guarantees, see Section 4 in [22].

To perform numerical simulations, we discretise $\Phi=\sum_{i=1}^{3}B_{i}\sigma_{i}:M\to\mathfrak{su}(2)$ as in Section 4.1 and for each $B_{i}$ choose an independent Matérn prior (cf. Remark 3.4) with parameters $(\nu,\ell)$ , which on functions on the mesh (i.e., vectors in $\mathbb{R}^{N_{v}}$ ) uses the covariance matrix $C_{i,j}=k_{\nu,\ell}(|x_{i}-x_{j}|)$ for $1\leq i,j\leq N_{v}$ , with positive definite kernel

[TABLE]

with $K_{\nu}$ the modified Bessel function of the second kind. The constant $\nu$ controls the Sobolev regularity while $\ell$ controls the characteristic lengthscale of the samples.
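For half-integer $\nu$ the Matérn kernel reduces to an elementary expression, which allows a self-contained sketch of the covariance matrix and of prior sampling. The block below treats only the special case $\nu=3/2$ , where the standard Matérn kernel is $k(r)=(1+\sqrt{3}r/\ell)\exp(-\sqrt{3}r/\ell)$ ; the value of $\nu$ , the unit-variance normalisation and the small diagonal `jitter` are illustrative assumptions, and general $\nu$ would require the Bessel function $K_{\nu}$ . The function names are hypothetical.

```python
import numpy as np

def matern32_cov(points, ell):
    """Covariance matrix C_ij = k(|x_i - x_j|) for the Matern kernel with nu = 3/2."""
    pts = np.asarray(points, dtype=float)
    r = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)  # pairwise distances
    s = np.sqrt(3.0) * r / ell
    return (1.0 + s) * np.exp(-s)

def sample_gaussian_field(cov, rng, jitter=1e-10):
    """One draw from N(0, cov) via a Cholesky factor (jitter aids numerical stability)."""
    L = np.linalg.cholesky(cov + jitter * np.eye(len(cov)))
    return L @ rng.standard_normal(len(cov))

# Example: 20 stand-in mesh vertices inside the unit disk and one prior draw on them.
rng = np.random.default_rng(1)
pts = rng.uniform(-0.7, 0.7, size=(20, 2))
C = matern32_cov(pts, ell=0.5)
draw = sample_gaussian_field(C, rng)
```

Since the mesh is fixed, the Cholesky factor can be computed once and reused for every prior draw needed by the sampler.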

We draw $N$ geodesics at random according to the uniform law for $(\alpha,\beta)$ (some samples on $\partial_{+}SM$ of size $N=200,400,800$ are visualised in Fig. 4), and then generate synthetic data $(Y_{i},(X_{i},V_{i}))_{i=1}^{N}$ as explained in Section 4.1 for the magnetic field $\Phi_{0}$ displayed in Fig. 2, adding Gaussian noise $N(0,\sigma^{2})$ to each matrix entry of $C_{\Phi_{0}}$ .

We then implement the pCN algorithm to approximately compute the posterior mean $\bar{\Phi}_{N}=E^{\Pi}[\Phi|(Y_{i},(X_{i},V_{i}))_{i=1}^{N}]$ from Theorem 3.2. The stepsize $\delta$ is adjusted so that after ‘burn-in’, the acceptance rate of proposals stabilises around $25\%$ . Once the chain is computed we visualise $\widehat{\Phi}=\frac{1}{N_{s}}\sum_{n=0}^{N_{s}}\Phi_{n}$ – examples of outcomes corresponding to increasing data set sizes are given in Fig. 5, illustrating the improvement in ‘reconstructions’ as the number $N$ of measurement points increases.

5 Proofs

5.1 Geometric preliminaries

Let $(M,g)$ be a compact oriented two dimensional Riemannian manifold with smooth boundary $\partial M$ . As before $SM$ will denote the unit circle bundle which is a compact 3-manifold with boundary given by

$$\partial SM=\{(x,v)\in SM:\ x\in\partial M\}.$$

We let $X$ be the geodesic vector field, i.e. the infinitesimal generator of the geodesic flow of $M$ . Since $M$ is assumed oriented there is a circle action on the fibers of $SM$ with infinitesimal generator $V$ called the vertical vector field. It is possible to complete the pair $X,V$ to a global frame of $T(SM)$ by considering the vector field $X_{\perp}:=[X,V]$ . There are two additional structure equations given by $X=[V,X_{\perp}]$ and $[X,X_{\perp}]=-\kappa V$ where $\kappa$ is the Gaussian curvature of the surface. Using this frame we can define a Riemannian metric on $SM$ by declaring $\{X,X_{\perp},V\}$ to be an orthonormal basis and the volume form of this metric will be denoted by $d\Sigma^{3}$ . The fact that $\{X,X_{\perp},V\}$ are orthonormal together with the commutator formulas implies that the Lie derivative of $d\Sigma^{3}$ along the three vector fields vanishes.

Given functions $u,v:SM\to{\mathbb{C}}^{n}$ we consider the inner product

[TABLE]

Upon defining $\mu(x,v):=-g_{x}(v,\nu_{x})$ for $(x,v)\in\partial SM$ , the following formula (known as Santaló’s formula) holds for any $f\in L^{1}(SM)$ :

[TABLE]

where $\varphi_{t}$ is the geodesic flow.

We now discuss the manifold $\partial_{+}SM$ and its geometry. One may define a natural frame on $\partial_{+}SM$ , given by

[TABLE]

( $T$ represents horizontal differentiation along the tangent direction). It is easily seen that $[V,T]=0$ and that these two vector fields are orthonormal for the metric on $\partial SM$ induced by the metric defined on $SM$ . In particular $(T,V)$ is an orthonormal frame for $\partial_{+}SM$ and we may define $H^{s}(\partial_{+}SM;\cdot)$ with respect to that frame. We now prove a useful lemma that will simplify later calculations.

Lemma 5.1.

Let $(M,g)$ be a non-trapping surface with strictly convex boundary. Then the vector field $X$ can be completed into a global, pairwise commuting frame $\{X,P_{T},P_{V}\}$ of $T(SM)$ . This frame is smooth on $SM\backslash S\partial M$ , continuous on $SM$ and satisfies $P_{T}|_{\partial_{+}SM}=T$ and $P_{V}|_{\partial_{+}SM}=V$ .

Proof of Lemma 5.1.

For $(x,v)\in\partial_{+}SM\backslash S\partial M$ and $t\in(0,\tau(x,v))$ , we define two vector fields on $SM^{int}$

[TABLE]

Since the map $(x,v,t)\mapsto\varphi_{t}(x,v)$ is smooth and injective for $(x,v)\in\partial_{+}SM\backslash S\partial M$ and $t\in(0,\tau(x,v))$ , this defines global, smooth sections of $T(SM^{int})$ such that $X,P_{T},P_{V}$ pairwise commute. Via direct computation of the differential of the flow (see e.g. [28, Sec. 4.2]), one may obtain the following expressions on $SM^{int}$

[TABLE]

where $\boldsymbol{a},\boldsymbol{b}\colon SM\to\mathbb{R}$ satisfy

[TABLE]

and where for $h:\partial_{+}SM\to{\mathbb{C}}$ , one defines $h_{\psi}:SM\to{\mathbb{C}}$ through the relation

[TABLE]

One further notices that the definition of $P_{V},P_{T}$ extends by continuity to $\partial(SM)$ , with the appropriate restrictions claimed in the statement of the lemma. ∎

5.2 Forward estimates - proof of Theorem 2.2

In this section, we derive various continuity estimates for the forward map $\Phi\mapsto C_{\Phi}$ . Recall that if the boundary $\partial M$ is strictly convex, by [39, Lemma 4.1.2 p113] there is a constant $C_{0}(M,g)>0$ such that

[TABLE]

We start with the following basic estimates.

Lemma 5.2 (Work-horse lemma).

Let $(M,g)$ be a non-trapping surface with strictly convex boundary and $\Phi\in C(M,\mathfrak{u}(n))$ . Suppose $F\in C(SM,{\mathbb{C}}^{n\times n})$ and consider the unique continuous solution $G:SM\to{\mathbb{C}}^{n\times n}$ to $XG+\Phi G=F$ on $SM$ with $G|_{\partial_{-}SM}=0$ . Then there exists a constant $C_{1}(M,g)$ such that

[TABLE]

The constant $C_{1}$ can be chosen as $C_{1}=\max(\tau_{\infty},\sqrt{C_{0}})$ , with $\tau_{\infty}$ the diameter of $M$ and $C_{0}$ the constant given in (20).

Proof.

It is easy to check that

[TABLE]

where $U_{\Phi}$ is the unique solution $U$ to $XU+\Phi U=0$ on $SM$ with $U|_{\partial_{+}SM}=\text{id}$ . Taking Frobenius norm, using $U(n)$ -invariance and the fact that $U_{\Phi}$ is unitary, we get

[TABLE]

Upon bounding the right-hand side crudely by $\tau_{\infty}\|F\|_{L^{\infty}}$ , this immediately implies (21). On to the $L^{2}$ estimates, applying Cauchy-Schwarz yields for all $(x,v)\in SM$

[TABLE]

where $\gamma_{x,v}$ is the maximal geodesic passing through $(x,v)$ . Now fix $(x,v)\in\partial_{+}SM$ and integrate the inequality above along the geodesic flow $\varphi_{t}(x,v)$ to arrive at

[TABLE]

Multiplying both sides by $\mu$ , integrating w.r.t. $d\Sigma^{2}$ and using Santaló’s formula yields (22).

For the estimate on $L^{2}(\partial_{+}SM)$ , looking at (24) for $(x,v)\in\partial_{+}SM$ and using (20), we arrive at

[TABLE]

Integrating w.r.t. $d\Sigma^{2}$ and using Santaló’s formula (18) on the right hand side immediately gives (23). Lemma 5.2 is proved. ∎

We now prove the main result on forward estimates, Theorem 2.2. We shall follow the model proof of [39, Theorem 4.2.1] which shows that the standard X-ray transform $I$ maps $H^{s}$ to $H^{s}$ . We do this in two stages: first we explain in Sec. 5.2.1 the proof in the simpler case in which the matrix fields have support contained in the interior of $M$ and then we explain in Sec. 5.2.2 how to derive the general case.

5.2.1 Proof of Theorem 2.2 assuming $\Phi$ and $\Psi$ with support in the interior of $M$

As a preliminary identity, given two skew-hermitian matrix fields $\Phi$ and $\Psi$ , consider the two $U(n)$ -valued solutions $U_{\Phi},U_{\Psi}$ such that $XU_{\Phi}+\Phi U_{\Phi}=0$ with boundary condition $U_{\Phi}|_{\partial_{-}SM}=\text{id}$ (and similarly for $\Psi$ ). It is immediate to find that the relation

[TABLE]

holds pointwise on $SM$ , and that $(U_{\Phi}-U_{\Psi})|_{\partial_{-}SM}=0$ . Using that $(U_{\Phi}-U_{\Psi})|_{\partial_{+}SM}=C_{\Phi}-C_{\Psi}$ with estimate (21) yields

[TABLE]

Similarly, combining the observation with (23) yields (7), and we can also obtain, using (22),

[TABLE]

To prove the $C^{1}$ continuity estimate, consider the function $W:=P_{V}(U_{\Phi}-U_{\Psi})$ , such that $W|_{\partial_{+}SM}=V(C_{\Phi}-C_{\Psi})$ and for brevity set $P=P_{V}$ . The following identity is immediate:

[TABLE]

In addition, since $\Phi$ and $\Psi$ are compactly supported in $M^{\text{int}}$ , the functions $U_{\Phi}$ , $U_{\Psi}$ equal the identity matrix in a neighbourhood of $\partial_{-}SM$ and in particular, $W|_{\partial_{-}SM}=0$ .

Using estimates (21)-(22)-(23) and $U(n)$ -invariance of Frobenius norms gives:

[TABLE]

We also have $X(PU_{\Psi})+\Psi PU_{\Psi}=-(P\Psi)U_{\Psi}$ with $PU_{\Psi}|_{\partial_{-}SM}=0$ , so by (21), we get $\lVert PU_{\Psi}\rVert_{L^{\infty}}\leq C_{1}\lVert P\Psi\rVert_{L^{\infty}}$ . Combining this fact with (25), we arrive at

[TABLE]

and a similar bound for $\lVert P_{V}(U_{\Phi}-U_{\Psi})\rVert_{L^{2}}$ . Obtaining a similar estimate for $T(C_{\Phi}-C_{\Psi})$ , we arrive at

[TABLE]

Similar arguments using sup norms everywhere yield

[TABLE]

To proceed to higher-order derivatives, if $\boldsymbol{P}^{{\boldsymbol{\alpha}}}=P_{V}^{\alpha_{1}}P_{T}^{\alpha_{2}}$ is a derivative of order $|\alpha|$ , setting $W=\boldsymbol{P}^{{\boldsymbol{\alpha}}}(U_{\Phi}-U_{\Psi})$ , we have $W|_{\partial_{+}SM}=V^{\alpha_{1}}T^{\alpha_{2}}(C_{\Phi}-C_{\Psi})$ , $W|_{\partial_{-}SM}=0$ and

[TABLE]

where the right-hand-side involves derivatives of $\Phi$ of order at most $|{\boldsymbol{\alpha}}|$ , and derivatives of $U_{\Phi}-U_{\Psi}$ of order at most $|{\boldsymbol{\alpha}}|-1$ . Combining the estimates of Lemma 5.2 and an induction on $k$ (whose formulation also involves control on $\lVert P_{V}^{\alpha_{1}}P_{T}^{\alpha_{2}}(U_{\Phi}-U_{\Psi})\rVert_{L^{2}(SM)}$ for all $\alpha_{1}+\alpha_{2}\leq k$ , and where the commuting frame $\{X,P_{V},P_{T}\}$ avoids the proliferation of terms due to non-trivial commutators) proves the theorem for higher-order derivatives. ∎

5.2.2 Proof of Theorem 2.2 for $\Phi$ and $\Psi$ supported up to $\partial M$

Consider a compact non-trapping surface $(M,g)$ with strictly convex boundary and let $\Phi\in C(M,{\mathbb{C}}^{n\times n})$ be a matrix-valued field. We shall call $R_{\Phi}\in C(SM,GL(n,{\mathbb{C}}))$ an integrating factor for $\Phi$ if $R_{\Phi}$ is differentiable along the geodesic vector field $X$ and $XR_{\Phi}+\Phi R_{\Phi}=0$ . Let $U_{\Phi}$ denote the unique integrating factor with $U_{\Phi}|_{\partial_{-}SM}=\text{id}$ . Recall that $C_{\Phi}=U_{\Phi}|_{\partial_{+}SM}$ . First note that the work of the previous section also proves for every $k\geq 0$ that if $\Phi$ and $\Psi$ are $C^{k}$ matrix fields compactly supported inside of $M^{int}$ , we also have

[TABLE]

Let $\alpha:\partial SM\to\partial SM$ denote the scattering relation of the metric (i.e. the map that takes initial conditions of a geodesic at the moment of entry to final conditions at the moment of exit). If $R_{\Phi}$ denotes any other integrating factor for $\Phi$ , then it must have the form $U_{\Phi}F^{\sharp}$ , where $F^{\sharp}$ is the first integral (i.e. $XF^{\sharp}=0$ ) determined by $F\in C(\partial_{+}SM,GL(n,{\mathbb{C}}))$ . Thus $R_{\Phi}=U_{\Phi}F^{\sharp}$ and from this we deduce

[TABLE]

In particular, given two continuous matrix fields $\Phi$ , $\Psi$ , Equation (27) implies the identity on $\partial_{+}SM$ :

[TABLE]

To complete the proof of Theorem 2.2 for $\Phi,\Psi$ supported up to the boundary, we then need to construct integrating factors with good regularity on $SM$ (i.e., at $\partial_{0}SM$ included) and which behave continuously in terms of $\Phi$ and $\Psi$ . To this end, we consider $(M,g)$ isometrically embedded in a closed manifold $(S,g)$ . The Seeley extension theorem asserts that for any $k\geq 0$ there is a continuous extension map

[TABLE]

(It also works for $C^{\infty}$ .) We consider a slightly larger compact manifold with boundary $\widetilde{M}\subset S$ engulfing $M$ so that $(\widetilde{M},g)$ stays non-trapping and with strictly convex boundary. We fix once and for all a smooth cut off function $\chi$ so that it has compact support in $\widetilde{M}^{int}$ and it equals $1$ near $M$ . Thus given $\Phi\in C^{k}(M,\mathfrak{u}(n))$ ,

[TABLE]

and since $E_{k}$ is continuous,

[TABLE]

Now by virtue of the work in Subsection 5.2.1 applied to $\widetilde{\Phi}$ on $\widetilde{M}$ , we can deduce estimates of the form

[TABLE]

We then take as smooth integrating factors $R_{\Phi}:=U_{\widetilde{\Phi}}|_{SM}$ and $R_{\Psi}:=U_{\widetilde{\Psi}}|_{SM}$ . Combining (29) and (30) we derive

[TABLE]

Combining (29) and (26) applied to $U_{\widetilde{\Phi}}$ and $U_{\widetilde{\Psi}}$ , we obtain

[TABLE]

and similarly for $\|R_{\Phi}^{-1}-R_{\Psi}^{-1}\|_{C^{k}}$ , and for $H^{k}$ norms. Then the proof for Theorem 2.2 for $\Phi,\Psi$ supported up to the boundary consists in applying the product rule to (28) and using estimates (31) and (32).

5.3 Stability estimate - proof of Theorem 2.1

5.3.1 Setting, main results and proofs of Theorem 2.1 and Corollary 2.3

Before considering the non-linear inverse problem, we must establish a stability estimate for a linear inverse problem, that of reconstructing a function $f\in C^{\infty}(M,{\mathbb{C}}^{n})$ from its attenuated X-ray transform, where the attenuation is matrix-valued. Namely, given a smooth skew-hermitian matrix field $\Phi$ on $M$ , we define $I_{\Phi}f:=u^{f}|_{\partial_{+}SM}$ , where $u=u^{f}:SM\to{\mathbb{C}}^{n}$ is the unique solution to the problem

[TABLE]

The injectivity of such a transform was proved in [35], and we now provide a stability estimate for it.

Theorem 5.3.

Let $(M,g)$ be a simple Riemannian surface with boundary and $\Phi$ a smooth, skew-hermitian matrix field in $M$ . Then for any $f\in C^{\infty}(M)$ , we have the following stability estimate

[TABLE]

*Remark 5.4** (Dependence of $C_{1},C_{2}$ ).*

The constants $C_{1},C_{2}$ only depend on the geometry of $(M,g)$ . The constant $C_{1}$ blows up like $(\beta-1)^{-1}$ , where $\beta$ is the terminator constant of $(M,g)$ . This is one of the ways that this stability estimate ceases to hold as one approaches non-simplicity. The other main quantity appearing in $C_{1},C_{2}$ is $w_{\infty}$ , the sup norm of the integrating factor defined below. The behavior of such a quantity, while finite on any simple surface, remains to be better understood.

On to the non-linear stability estimate, injectivity of the operator $\Phi\to C_{\Phi}$ restricted to ${\mathfrak{u}}(n)$ -valued fields was initially proved in [35], and Theorem 2.1 upgrades this result with a stability estimate. While the remaining sections will focus on the proof of Theorem 5.3, we now explain how this result implies Theorem 2.1. The main additional ingredient needed is a pseudo-linearization identity, relating scattering data to attenuated X-ray transforms:

Lemma 5.5 (Pseudo-linearization).

Let $(M,g)$ be a non-trapping surface with strictly convex boundary. For any $\Phi,\Psi\in C(M,{\mathbb{C}}^{n\times n})$ , the following relation holds

[TABLE]

where $I_{\Theta(\Phi,\Psi)}\colon L^{2}(M,{\mathbb{C}}^{n\times n})\to L^{2}(\partial_{+}SM,{\mathbb{C}}^{n\times n})$ is an attenuated X-ray transform with matrix field $\Theta(\Phi,\Psi)$ , an endomorphism of ${\mathbb{C}}^{n\times n}$ with pointwise action

$$\Theta(\Phi,\Psi)\cdot U=\Phi U-U\Psi,\qquad U\in{\mathbb{C}}^{n\times n}.$$

Proof of Lemma 5.5.

With $U_{\Phi},U_{\Psi}$ the fundamental solutions of $XU_{\Phi}+\Phi U_{\Phi}=0$ with $U_{\Phi}|_{\partial_{-}SM}=\text{id}$ and $U_{\Phi}|_{\partial_{+}SM}=C_{\Phi}$ (similarly for $\Psi$ ), denote $W:=U_{\Phi}U_{\Psi}^{-1}-\text{id}$ . A direct computation shows that

$$XW+\Phi W-W\Psi=-(\Phi-\Psi),\qquad W|_{\partial_{-}SM}=0,$$

and thus by the definition of the attenuated X-ray transform, $W|_{\partial_{+}SM}=I_{\Theta(\Phi,\Psi)}(\Phi-\Psi)$ . Since we also have by construction $W|_{\partial_{+}SM}=C_{\Phi}C_{\Psi}^{-1}-\text{id}$ , identity (34) follows. ∎
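The pseudo-linearization identity can be checked numerically along a single geodesic. The following is a minimal sketch, not part of the paper's argument: it assumes the ODE convention $\dot{U}+\Phi(\gamma(t))U=0$, $U(T)=\text{id}$, with the scattering data read off at $t=0$, and it uses hypothetical ${\mathfrak{s}}{\mathfrak{o}}(3)$ -valued fields `Phi`, `Psi` and travel time `T = 1.5` chosen purely for illustration. It integrates the three matrix ODEs with a hand-written RK4 scheme and compares $C_{\Phi}C_{\Psi}^{-1}-\text{id}$ with the value at $t=0$ of the solution $W$ of $\dot{W}+\Phi W-W\Psi=-(\Phi-\Psi)$, $W(T)=0$.

```python
import math

DIM = 3
IDENTITY = [[1.0 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]
ZERO = [[0.0] * DIM for _ in range(DIM)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def lincomb(*terms):
    """Return sum of c * M over the given (c, M) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(DIM)]
            for i in range(DIM)]

def frob(A):
    return math.sqrt(sum(x * x for row in A for x in row))

def skew(a, b, c):
    return [[0.0, a, b], [-a, 0.0, c], [-b, -c, 0.0]]

# Hypothetical skew-symmetric fields restricted to one geodesic (illustration only).
def Phi(t):
    return skew(math.sin(t), 0.5 * t, math.cos(2.0 * t))

def Psi(t):
    return skew(0.3, math.cos(t), t * t)

def rk4_backward(rhs, y_end, T, steps):
    """Integrate y' = rhs(t, y) from t = T down to t = 0 (classical RK4)."""
    h = -T / steps
    t, y = T, y_end
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, lincomb((1.0, y), (h / 2, k1)))
        k3 = rhs(t + h / 2, lincomb((1.0, y), (h / 2, k2)))
        k4 = rhs(t + h, lincomb((1.0, y), (h, k3)))
        y = lincomb((1.0, y), (h / 6, k1), (h / 3, k2), (h / 3, k3), (h / 6, k4))
        t += h
    return y

T, STEPS = 1.5, 1000

def scattering(field):
    # U' = -field(t) U with U(T) = id; the scattering data is U(0).
    return rk4_backward(lambda t, U: lincomb((-1.0, matmul(field(t), U))),
                        IDENTITY, T, STEPS)

def w_rhs(t, W):
    # W' = -Phi W + W Psi - (Phi - Psi), the attenuated transport equation.
    P, Q = Phi(t), Psi(t)
    return lincomb((-1.0, matmul(P, W)), (1.0, matmul(W, Q)), (-1.0, P), (1.0, Q))

C_Phi, C_Psi = scattering(Phi), scattering(Psi)
C_Psi_inv = [list(col) for col in zip(*C_Psi)]  # transpose = inverse for SO(3)
orth_defect = frob(lincomb((1.0, matmul(C_Psi, C_Psi_inv)), (-1.0, IDENTITY)))
lhs = lincomb((1.0, matmul(C_Phi, C_Psi_inv)), (-1.0, IDENTITY))
attenuated = rk4_backward(w_rhs, ZERO, T, STEPS)
gap = frob(lincomb((1.0, lhs), (-1.0, attenuated)))
print(gap, orth_defect)
```

Both printed numbers should be at the level of the discretization error, which reflects that the identity is exact and that skew-symmetric fields produce orthogonal scattering data.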

Proof of Theorem 2.1.

Appealing to the pseudo-linearization (34), one may notice that if $\Phi$ , $\Psi$ are skew-hermitian, then the field $\Theta(\Phi,\Psi)$ is skew-hermitian when viewed as an endomorphism of ${\mathbb{C}}^{n\times n}$ . Moreover, since the entries of $\Theta(\Phi,\Psi)$ are linear in the entries of $\Phi$ and $\Psi$ , we directly have that

[TABLE]

with $C$ a universal constant. Then relation (34), together with Theorem 5.3 immediately implies

[TABLE]

This shows Theorem 2.1 when $\Phi,\Psi\in C^{\infty}(M,{\mathfrak{u}}(n))$ . Since all quantities involved above do not depend on derivatives of $\Phi,\Psi$ of order higher than $1$ , and $C_{1},C_{2}$ are independent of $\Phi,\Psi$ , approximating $\Phi,\Psi\in C^{1}(M,{\mathfrak{u}}(n))$ by sequences in $C^{\infty}(M,{\mathfrak{u}}(n))$ (and using Theorem 2.2) will yield the same stability estimate for $C^{1}$ matrix fields. ∎

We also cover the proof of Corollary 2.3, based on the previous result and the forward estimate Theorem 2.2.

Proof of Corollary 2.3.

It is enough to show that

[TABLE]

To show this, we write at the pointwise level:

[TABLE]

hence $\|C_{\Phi}-C_{\Psi}\|_{L^{2}}=\|C_{\Phi}C_{\Psi}^{-1}-\text{id}\|_{L^{2}}$ . To control first derivatives, take $P=V$ or $P=T$ ; we have

[TABLE]

using triangle inequality and submultiplicativity. Squaring, taking the sup norm of $\left|PC_{\Psi}\right|_{F}$ and integrating on $\partial_{+}SM$ , we obtain

[TABLE]

Combining the estimates for $P=V$ and $P=T$ we arrive at

[TABLE]

Now using the forward estimate (8) with $k=1$ and $\Phi\equiv 0$ (thus $C_{\Phi}=\text{id}$ ), we deduce that

[TABLE]

This yields the estimate $\|C_{\Phi}C_{\Psi}^{-1}-\text{id}\|_{H^{1}}^{2}\lesssim(1+\|\Psi\|_{C^{1}}^{2})\lVert C_{\Phi}-C_{\Psi}\rVert^{2}_{H^{1}}$ , and taking square roots yields (35) (using that $\sqrt{1+x^{2}}/(1+x)$ is uniformly bounded for $x\in[0,\infty)$ ). ∎
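For the reader's convenience, the two elementary facts used in this proof can be written out as follows. This is a sketch in the notation of the proof, assuming only that $C_{\Psi}$ takes unitary values, so that right multiplication by $C_{\Psi}$ preserves the Frobenius norm.

```latex
% Pointwise on \partial_+ SM, for unitary C_\Psi (C_\Psi^{-1} = C_\Psi^*):
\[
  |C_\Phi - C_\Psi|_F
  = \bigl|(C_\Phi C_\Psi^{-1} - \mathrm{id})\,C_\Psi\bigr|_F
  = |C_\Phi C_\Psi^{-1} - \mathrm{id}|_F .
\]
% Uniform bound used when taking square roots: for x >= 0,
\[
  1 + x^2 \le (1+x)^2
  \quad\Longrightarrow\quad
  \frac{\sqrt{1+x^2}}{1+x} \le 1 .
\]
```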

5.3.2 Proof of Theorem 5.3 - Main outline

As in [35], the main method of proof involves an energy identity (or Pestov identity), based on integrations by parts on $SM$ . To do this, let us recall that with the inner product $(u,v)$ defined in (17), and upon also denoting

[TABLE]

the following integration by parts formulas hold for $u,v\in C^{\infty}(SM,{\mathbb{C}}^{n})$ :

[TABLE]

We will also use extensively the harmonic decomposition on the fibers of $SM$ . Namely, the space $L^{2}(SM,{\mathbb{C}}^{n})$ decomposes orthogonally as a direct sum

$$L^{2}(SM,{\mathbb{C}}^{n})=\bigoplus_{k\in\mathbb{Z}}H_{k},$$

where $H_{k}$ is the eigenspace of $-iV$ corresponding to the eigenvalue $k$ . A function $u\in L^{2}(SM,{\mathbb{C}}^{n})$ has a Fourier series expansion

$$u=\sum_{k=-\infty}^{\infty}u_{k},$$

where $u_{k}\in H_{k}$ . Let $\Omega_{k}=C^{\infty}(SM,{\mathbb{C}}^{n})\cap H_{k}$ . Of special interest are the operators

$$\eta_{\pm}:=\frac{1}{2}\left(X\pm iX_{\perp}\right),\tag{37}$$

with the property that $\eta_{\pm}(\Omega_{k})\subset\Omega_{k\pm 1}$ for all $k\in\mathbb{Z}$ . For more details on the operators $\eta_{\pm}$ and the Fourier expansion we refer to [21] where these tools were first introduced.

Definition 5.6.

A function $u:SM\to{\mathbb{C}}^{n}$ is said to be holomorphic if $u_{k}=0$ for all $k<0$ . Similarly, $u$ is said to be antiholomorphic if $u_{k}=0$ for all $k>0$ .

To control the terms involving the matrix field, one must introduce an artificial connection as we will see below. This first requires that we derive a Pestov identity for X-ray transforms with connection $A$ and matrix field $\Phi$ (the matrix field $\Phi$ is also referred to as a 'Higgs' field in the literature). Namely, given a skew-hermitian pair $(A,\Phi)$ on the bundle $M\times{\mathbb{C}}^{n}$ and $f\in C^{\infty}(M,{\mathbb{C}}^{n})$ , we define $I_{A,\Phi}f=u|_{\partial_{+}SM}$ , where $u$ is the unique solution to the problem

$$Xu+Au+\Phi u=-f\quad\text{in }SM,\qquad u|_{\partial_{-}SM}=0.$$

While previous Pestov identities have been derived in [35], the present one accounts for nonzero boundary terms, and in particular reflects more precisely how the stability constant degrades as $(M,g)$ approaches non-simplicity. This is captured by the concept of terminator constant $\beta_{\text{Ter}}$ : given a simple surface $(M,g)$ , there exists a number $\beta_{\text{Ter}}>1$ such that for any $\beta\in(1,\beta_{\text{Ter}}]$ , there exists a smooth function $r=r_{\beta}:SM\to{\mathbb{R}}$ , solution to the Riccati type equation $Xr+r^{2}+\beta\kappa=0$ .
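To illustrate how the Riccati equation constrains $\beta$ , consider the following simplified model computation. It is a sketch under strong assumptions that are not made in the text: constant Gaussian curvature $\kappa>0$ , and the equation restricted to a single unit-speed geodesic segment of length $L$ (so smooth dependence of $r$ on the point of $SM$ is not addressed).

```latex
% Assumptions (illustration only): \kappa > 0 constant, t \in [0, L] arclength
% along one geodesic, and a := \sqrt{\beta\kappa}.
% Writing r = \dot{y}/y turns \dot{r} + r^2 + a^2 = 0 into \ddot{y} + a^2 y = 0,
% so every solution has the form
\[
  r(t) = a\cot\bigl(a(t+c)\bigr), \qquad
  \dot{r} + r^2 + a^2
  = a^2\bigl(-\csc^2 + \cot^2 + 1\bigr)\bigl(a(t+c)\bigr) = 0 .
\]
% Such an r is finite on all of [0, L] iff a(t+c) stays inside (0, \pi),
% which can be arranged by a choice of c exactly when aL < \pi, that is
\[
  \beta < \frac{\pi^2}{\kappa L^2} .
\]
```

In this model the admissible range of $\beta$ shrinks to $\beta\leq 1$ as $L$ approaches $\pi/\sqrt{\kappa}$ , the distance to the first conjugate point, which is consistent with the degradation of the constants near non-simplicity described above.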

Theorem 5.7.

Let $(M,g)$ be a simple surface with boundary, with terminator constant $\beta_{\text{Ter}}>1$ , and $(A,\Phi)$ a skew-hermitian pair on the bundle $M\times{\mathbb{C}}^{n}$ . Then for any $u\in C^{\infty}(SM,{\mathbb{C}}^{n})$ and $\beta\in(1,\beta_{\text{Ter}}]$ , the following identity holds:

[TABLE]

In the identity above,

[TABLE]

and $r$ is a smooth function on $SM$ which only depends on the surface. The quantity $\star F_{A}$ is the curvature of the connection $A$ , which upon a judicious choice of connection, can have a controlled sign. To achieve this, consider the scalar Hermitian connection $a:=i\varphi\text{id}$ , where $\varphi$ is a smooth 1-form such that $d\varphi=\omega_{g}$ (the area form of the metric $g$ ). We choose a specific $\varphi$ of the form $\varphi=\star dh$ for $h$ a real-valued function satisfying $\star d\star dh=1$ with Neumann condition $dh(\nu)=0$ at the boundary. The latter condition implies that $\nabla_{T,sa}u=Tu$ for any real $s$ . Then we have

[TABLE]

with $\eta_{\pm}$ defined in (37), and $i\star F_{a}=-1$ .

By [35], we can construct a holomorphic scalar function $w\in C^{\infty}(SM)$ satisfying $Xw=-iX_{\perp}h$ . Without loss of generality, $w$ can be chosen even. The condition on $w_{0}$ reads $\eta_{-}(w_{0}-h)=0$ , for which it is sufficient to use $w_{0}=h$ . With this choice of $a$ and $s\in\mathbb{R}$ , in what follows, we will denote $G_{s}:=X+sa+\Phi$ and $G=G_{0}$ . With $w$ as above, we have $G_{s}u=e^{sw}G(e^{-sw}u)$ . Moreover, $\overline{w}$ (the complex-conjugate of $w$ ) is antiholomorphic and solves $X\overline{w}=+iX_{\perp}h$ , so also $G_{s}u=e^{-s\overline{w}}G(e^{s\overline{w}}u)$ .
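The two conjugation identities for $G_{s}$ follow from the Leibniz rule. The computation below is a sketch, assuming the convention that the connection $a=i\varphi\,\text{id}$ with $\varphi=\star dh$ acts on $SM$ as multiplication by the function $iX_{\perp}h$ (this identification of $\star dh$ with $X_{\perp}h$ is an assumption on sign conventions, chosen to be consistent with $Xw=-iX_{\perp}h$ ).

```latex
% w is scalar, so e^{\pm sw} commutes with the matrix field \Phi.
\begin{align*}
  e^{sw}\,G\bigl(e^{-sw}u\bigr)
  &= e^{sw}\bigl(e^{-sw}Xu - s(Xw)e^{-sw}u + \Phi e^{-sw}u\bigr) \\
  &= Xu - s(Xw)\,u + \Phi u
   = Xu + s\,(iX_\perp h)\,u + \Phi u = G_s u, \\
  e^{-s\overline{w}}\,G\bigl(e^{s\overline{w}}u\bigr)
  &= Xu + s(X\overline{w})\,u + \Phi u
   = Xu + s\,(iX_\perp h)\,u + \Phi u = G_s u .
\end{align*}
```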

Lastly, we will denote by $\Pi_{\pm}$ the projections onto positive and negative harmonics. Namely, $\Pi_{\pm}u=\sum_{\pm k>0}u_{k}$ . We have the following commutator formulas, for any $u\in C^{\infty}(SM)$ :

[TABLE]

The following lemma will help us control $u$ by versions of $u$ which are conjugated by special integrating factors.

Lemma 5.8.

With the holomorphic function $e^{sw}$ and antiholomorphic function $e^{-s^{\prime}\bar{w}}$ and any $s,s^{\prime}\in{\mathbb{R}}$ , we have

[TABLE]

in particular we get the equality

[TABLE]

Proof.

We only prove $\Pi_{-}u=\Pi_{-}(e^{-sw}\Pi_{-}(e^{sw}u))$ , and the rest is similar. It is enough to notice that for any holomorphic function $f$ , the equality $\Pi_{-}(fu)=\Pi_{-}(f\Pi_{-}u)$ holds, as this amounts to saying that the negative harmonics of $fu$ do not depend on the non-negative harmonics of $u$ . This is immediate since

[TABLE]

Then we compute immediately

[TABLE]

hence the result. ∎
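The identity $\Pi_{-}u=\Pi_{-}(e^{-sw}\Pi_{-}(e^{sw}u))$ can be sanity-checked on a single fiber, where functions on the circle are expanded in the modes $e^{ik\theta}$ . The sketch below is an illustration only: the grid size `N`, the trigonometric polynomials `w` and `u`, and the value `s = 1.3` are hypothetical choices, and the projection is implemented with a naive discrete Fourier transform (so it is exact only up to negligible aliasing of the rapidly decaying modes of $e^{\pm sw}$ ).

```python
import cmath
import math

N = 64                                   # grid points on one fiber S^1
THETAS = [2.0 * math.pi * j / N for j in range(N)]
MODES = range(-N // 2, N // 2)           # frequencies -32, ..., 31

def fourier_coefficients(values):
    """Discrete Fourier coefficients c_k of samples on the uniform grid."""
    return {k: sum(v * cmath.exp(-1j * k * th) for v, th in zip(values, THETAS)) / N
            for k in MODES}

def proj_minus(values):
    """Pi_-: keep only the strictly negative harmonics k < 0."""
    coeffs = fourier_coefficients(values)
    return [sum(c * cmath.exp(1j * k * th) for k, c in coeffs.items() if k < 0)
            for th in THETAS]

def multiply(f, g):
    return [a * b for a, b in zip(f, g)]

Z = [cmath.exp(1j * th) for th in THETAS]
# Holomorphic w: only harmonics k >= 0 (hypothetical coefficients).
w = [0.3 + 0.5 * z + 0.2 * z ** 2 for z in Z]
# Test function u with harmonics of both signs.
u = [z ** -3 + 2.0 / z + 0.7 + (1 - 1j) * z ** 2 for z in Z]
s = 1.3
exp_plus = [cmath.exp(s * wv) for wv in w]     # holomorphic integrating factor
exp_minus = [cmath.exp(-s * wv) for wv in w]   # its (holomorphic) inverse

lhs = proj_minus(u)
rhs = proj_minus(multiply(exp_minus, proj_minus(multiply(exp_plus, u))))
err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(err)
```

The discrepancy `err` should be at the level of floating-point round-off, while `lhs` itself is of size one, so the agreement is not vacuous.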

Outline of proof of Theorem 5.3.

At first we are going to assume that the solution $u$ to the transport problem $Xu+\Phi u=-f$ , $u|_{\partial_{-}SM}=0$ is $C^{\infty}$ . If $f$ is supported all the way to the boundary, this may not be the case, as $u$ may fail to be smooth at the glancing region $\partial_{0}SM$ because $\tau$ is not smooth at $\partial_{0}SM$ . However, there is a standard way to fix this issue and we shall do this at the very end. For now we will proceed as if $u$ were smooth in $SM$ .

The initial transport equation, projected onto the harmonic term of degree $0$ , reads

$$\eta_{+}u_{-1}+\eta_{-}u_{1}+\Phi u_{0}=-f,$$

so that, in particular,

[TABLE]

The crux is then to find how to bound the quantities on the right by the boundary values of $u$ . Using a Pestov identity with a special connection $sa$ defined as above (and its holomorphic integrating factor $e^{sw}$ ), we show how to control the first term using control over $\Pi_{-}(e^{sw}u)$ for $s>0$ . Similar work can be done to control the second term using control over $\Pi_{+}(e^{-s^{\prime}\bar{w}}u)$ for $s^{\prime}<0$ .

We first derive in Sec. 5.3.3 the identity:

[TABLE]

Since $(\eta_{+}-sa_{1})(e^{-sw})$ only has strictly positive harmonic terms, the first term in the right-hand side of (42) only depends on $\Pi_{-}(e^{sw}u)$ . Upon defining $v_{s}:=\Pi_{-}(e^{sw}u)$ , the identity (42) reads

[TABLE]

Denoting $w_{\infty}=\sup_{SM}|w|$ , we straightforwardly obtain the estimate

[TABLE]

and control on $\|\eta_{+}u_{-1}+\frac{1}{2}\Phi u_{0}\|^{2}$ will be obtained after controlling each term in the last right hand side. We first control $\|(e^{s(w-w_{0})}u)_{0}\|^{2}$ by $\|v_{s}\|^{2}+\|G_{s}Vv_{s}\|^{2}$ , via the estimate

[TABLE]

We then control $\|v_{s}\|^{2}$ and $\|G_{s}Vv_{s}\|^{2}$ by boundary terms via the Pestov identity, after setting an appropriate threshold on $s$ . To do this, we consider the transport problem for $v_{s}$ , written as:

[TABLE]

We then use the Pestov identity (38) for $v_{s}$ , with $\star F_{sa}=is\text{id}$ and $\star d_{sa}\Phi=X_{\perp}\Phi$ :

[TABLE]

Before choosing $s$ appropriately, we need additional work (tedious as in [35]) on the term $\Re(\Phi v_{s},G_{s}v_{s})$ . Taking into account boundary terms, and upon defining $B_{\pm 1}:=\eta_{\pm}\Phi$ , we prove in Sec. 5.3.3 that

[TABLE]

with $e_{x}(v)$ defined in (55). The last term in the sum will move to the right-hand side of (46), while the other two need to be controlled with a large $s$ . To achieve this, we prove in Sec. 5.3.3 the following:

Lemma 5.9.

There exists a universal constant $C>0$ such that for all $s\geq C|\Phi|_{C^{1}}$ ,

[TABLE]

In particular, for $s=C|\Phi|_{C^{1}}+1$ , identity (46) becomes

[TABLE]

We now explain how to bound the right-hand side in terms of $\|I_{\Phi}f\|^{2}_{H^{1}(\partial_{+}SM)}$ . Recall that $v_{s}=\Pi_{-}(e^{sw}u)$ . The first claim is that $[\Pi_{-},V]=[\Pi_{-},T]=0$ . The first identity is obvious because both operators are diagonal on the fiberwise Fourier decomposition $C^{\infty}(\partial SM)=\bigoplus_{k\in\mathbb{Z}}\ker(V-ik\,\text{id})$ . That $T$ is also diagonal on this decomposition follows from the fact that $[T,V]=0$ . With this in mind, we have, on $\partial SM$ :

[TABLE]

and since $u|_{\partial_{-}SM}=0$ , $Vu$ and $Tu$ will be controlled by $\|I_{\Phi}f\|_{H^{1}(\partial_{+}SM)}$ . The right hand side of (49) is thus bounded by $C^{\prime}(s^{2}+s|\Phi|_{C^{0}}+1)e^{2sw_{\infty}}\|I_{\Phi}f\|^{2}_{H^{1}(\partial_{+}SM)}$ , where the constant $C^{\prime}$ does not depend on $\Phi$ .

Using this bound and throwing out the first and third terms of the left-hand side of (49), we obtain

[TABLE]

The second term in the left-hand side controls $\|v_{s}\|_{L^{2}}$ directly, and we can write

[TABLE]

with $C^{\prime}$ some constant independent of $\Phi$ . Recalling that $s=C|\Phi|_{C^{1}}+1$ and combining estimates (41), (44), (45) and (50), we arrive at estimate (33), completing the proof of Theorem 5.3.

5.3.3 Remaining ingredients

Pestov identity with boundary term for ray transforms with skew-hermitian pairs

Let $(A,\Phi)$ be a skew-hermitian pair, and define

[TABLE]

We have the following structure equations

[TABLE]

where $\star d_{A}\Phi=X_{\perp}\Phi+\Phi A_{V}-A_{V}\Phi$ (which reduces to $\star d_{A}\Phi=X_{\perp}\Phi$ when the connection $A$ is scalar), and where $\kappa(x)$ is the Gaussian curvature. In what follows, we will need to integrate by parts with boundary terms, and using (36), we obtain for $G$ :

[TABLE]

Proof of Theorem 5.7.

We first write a differential identity using the structure equations (51):

[TABLE]

where $G\Phi f:=G(\Phi f)$ . We record this here as

[TABLE]

Now, considering $u$ smooth and supported up to the boundary, we write

[TABLE]

We now arrange the four boundary terms using integration by parts in $V$ and the formulas

[TABLE]

First notice that

[TABLE]

We then obtain

[TABLE]

We now simplify, using that $V(A)(x,v)=A(x,v^{\perp})$ and $\mu_{\perp}X+\mu X_{\perp}=T$ ,

[TABLE]

The boundary terms then simplify into

[TABLE]

With this notation, the full Pestov identity takes the form

[TABLE]

To recover [35, Eq. (8)], we take the real part of the equality above, and notice that $(\star F_{A}Vu,u)=-(\star F_{A}u,Vu)$ because $V(\star F_{A})=0$ ; then

[TABLE]

Since the last term is purely imaginary, the real parts of the other terms agree, and upon taking the real part of (53), we obtain

[TABLE]

(Note that the second boundary term is purely real, so the $\Re$ is just ornamental.)

We finally explain how the index form term $\|GVu\|^{2}-(Vu,\kappa Vu)$ can be rewritten as the sum of a non-negative term and a boundary term. With $\beta_{\text{Ter}}$ as in the statement, and the function $r=r_{\beta}\colon SM\to{\mathbb{R}}$ solving $Xr+r^{2}+\beta\kappa=0$ , we now compute, for any $\psi\in C^{\infty}(SM,{\mathbb{C}}^{n})$

[TABLE]

We simplify

[TABLE]

We arrive at

[TABLE]

and we may rearrange this as

[TABLE]

Plugging this last relation into (54) with $\psi=Vu$ yields (38). ∎

Remaining estimates and lemmata

Proof of equality (42).

We write, using Lemma 5.8

[TABLE]

To rewrite the last term, from the equation $G_{s}(e^{sw}u)=-e^{sw}f$ , note the relation

[TABLE]

Then we have, for $k>0$ ,

[TABLE]

where we used the transport equation in the last line. For $k=0$ ,

[TABLE]

Plugging this back into the equation for $\eta_{+}u_{-1}$ , we get

[TABLE]

We now write

[TABLE]

and similarly

[TABLE]

Using the last two computations, we arrive at (42). ∎

Proof of estimate (45).

The transport equation for $e^{sw}u$ projected onto the harmonic term of degree $-1$ reads:

[TABLE]

For our choice of connection, $a_{-1}=-\eta_{-}w_{0}$ so the left side can be rewritten as

[TABLE]

hence we obtain

[TABLE]

We then rewrite the latter right-hand side in terms of $G_{s}Vv_{s}$ . Notice that

[TABLE]

so

[TABLE]

and thus

[TABLE]

Upon deriving an estimate of the form

[TABLE]

we can write

[TABLE]

and (45) follows. ∎

Proof of (47).

We first need to write an integration by parts formula for $\eta_{\pm}$ defined in (37). Using the integrations by parts (36) we first derive an integration by parts formula for $X_{\perp}=XV-VX$ : for any $u,w$ smooth on $SM$ ,

[TABLE]

We now compute, using that $\eta_{+}^{*}=-\eta_{-}$

[TABLE]

where we define

[TABLE]

Similarly, for the skew-hermitian connection considered,

[TABLE]

Now, using the fact that

[TABLE]

we compute

[TABLE]

where $p_{1}:=\Re(\Phi(\eta_{-}+sa_{-1})(v_{s})_{-1},(v_{s})_{-2})$ . Upon defining

[TABLE]

we now prove by induction the following claim:

[TABLE]

The case $n=1$ is proved above, and the induction step $(n\implies n+1)$ follows from the calculation

[TABLE]

Putting this equality back into (57) proves the induction. Now since $v_{s}\in H^{1}(SM)$ , we have that $\lim_{n\to\infty}p_{n}=0$ , and thus (47) follows. ∎

Proof of Lemma 5.9.

The term that ultimately controls everything is

[TABLE]

The infinite sum in (48) can then be controlled by

[TABLE]

with $C_{1}$ a universal constant. As for the last term of the left-hand side of (48), we write

[TABLE]

where $C_{2}$ is a universal constant. Lemma 5.9 follows upon taking $C=C_{1}+C_{2}$ . ∎

5.3.4 Conclusion: dealing with the glancing

Consider a function $\rho\in C^{\infty}(M)$ such that it coincides with $M\ni x\mapsto d(x,\partial M)$ in a neighbourhood of $\partial M$ and such that $\rho\geq 0$ and $\partial M=\rho^{-1}(0)$ . Clearly $\nabla\rho(x)=-\nu(x)$ for $x\in\partial M$ . Using $\rho$ , we extend $\nu$ to the interior of $M$ as $\nu(x)=-\nabla\rho(x)$ for $x\in M$ . We let $\mu(x,v):=\langle v,\nu(x)\rangle$ and

[TABLE]

Note that $T$ is now defined on all of $SM$ and agrees with the vector field $T$ defined previously on $\partial SM$ . In fact $T$ and $V$ are tangent to every $\partial SM_{\varepsilon}=\{(x,v)\in SM:\;\;x\in\rho^{-1}(\varepsilon)\}$ , where $M_{\varepsilon}=\rho^{-1}[\varepsilon,\infty)$ . The next lemma for $\tau$ is the key input to deal with the glancing, cf. [39, Lemma 4.1.3], [40, Lemma 3.2.3] and [8, Lemma 5.1].

Lemma 5.10.

The functions $V\tau$ and $T\tau$ are bounded on $SM\setminus\partial_{0}SM$ .

To substantiate the previous claim that the behaviour of $u=u^{f}$ is the same as that of $\tau$ we proceed as follows. We consider a smooth integrating factor $R:SM\to GL(n,{\mathbb{C}})$ such that $XR+\Phi R=0$ . These always exist for any non-trapping manifold with strictly convex boundary. A simple calculation shows that we may write $u$ in terms of $R$ as

$$u(x,v)=R(x,v)\int_{0}^{\tau(x,v)}(R^{-1}f)(\varphi_{t}(x,v))\,dt,$$

where $\varphi_{t}$ is the geodesic flow of $(M,g)$ . Thus directly from Lemma 5.10 we obtain:

Lemma 5.11.

The functions $Vu$ and $Tu$ are bounded on $SM\setminus\partial_{0}SM$ .
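The passage from Lemma 5.10 to Lemma 5.11 can be made explicit as follows. This is a sketch, assuming the representation of $u$ as $R$ times the integral of $R^{-1}f$ along the geodesic flow up to the exit time $\tau$ (the convention for which $Xu+\Phi u=-f$ and $u|_{\partial_{-}SM}=0$ ).

```latex
% Representation (checked using X\tau = -1 and XR = -\Phi R):
\[
  u = R\int_0^{\tau} (R^{-1}f)\circ\varphi_t\,dt,
  \qquad
  Xu = (XR)\int_0^{\tau}(R^{-1}f)\circ\varphi_t\,dt - R\,R^{-1}f = -\Phi u - f .
\]
% Differentiating with P = V or P = T on SM \setminus \partial_0 SM:
\[
  Pu = (PR)\int_0^{\tau}(R^{-1}f)\circ\varphi_t\,dt
     + R\,(P\tau)\,(R^{-1}f)\circ\varphi_{\tau}
     + R\int_0^{\tau} P\bigl[(R^{-1}f)\circ\varphi_t\bigr]\,dt .
\]
% R, R^{-1}, f and the flow are smooth and \tau is bounded, so the first and
% third terms are bounded; the middle term is bounded because P\tau is
% bounded by Lemma 5.10.
```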

Next we note that all the previous work that we have done assuming $u$ smooth may be summarized as follows:

Theorem 5.12.

Let $(M,g)$ be a simple Riemannian surface with boundary and $\Phi$ a smooth, skew-hermitian matrix field in $M$ . Then for any $f\in C^{\infty}(M)$ , we have the following stability estimate

[TABLE]

where $v$ is any smooth solution of $Xv+\Phi v=-f$ .

Proof of Theorem 5.3 in full generality.

Let $M_{\varepsilon}$ for small $\varepsilon$ be the surface considered above. We let $u:SM\to{\mathbb{C}}^{n}$ be the unique solution to the problem

$$Xu+\Phi u=-f\quad\text{in }SM,\qquad u|_{\partial_{-}SM}=0.$$

The function $v:=u|_{SM_{\varepsilon}}$ is smooth in $SM_{\varepsilon}$ and solves $Xv+\Phi v=-f$ since $u$ does. Hence we may apply Theorem 5.12 in $M_{\varepsilon}$ to obtain

[TABLE]

where we might as well use the constants for $M$ which bound those for $M_{\varepsilon}$ . We now let $\varepsilon\to 0$ ; we clearly have

[TABLE]

and using Lemma 5.11 we see that

[TABLE]

Since

[TABLE]

the theorem is proved.

∎

5.4 Consistency of the posterior mean: proof of Theorem 3.2

We assume $\sigma^{2}=1$ ; the general case $0<\sigma^{2}<\infty$ requires only notational changes.

The overall strategy we pursue here, which has also been used in some form in [45, 31, 30, 32], is to show first that the Bayesian algorithm recovers the ‘regression function’ $C_{\Phi}$ consistently in a natural statistical distance function, and to combine this with quantitative stability estimates for the inverse map $C_{\Phi}\mapsto\Phi$ in appropriate metrics. This exploits crucially that the estimated Bayesian regression outputs lie in the (non-linearly constrained) range of the forward map $C_{\Phi}$ , so that the stability estimate applies to them. To make this approach work with ‘unbounded’ Gaussian priors is challenging, and our proofs proceed as follows: We first establish the posterior contraction Theorem 5.13 under general conditions, borrowing from Bayesian nonparametric theory (e.g., [17, Theorem 8.19] or [19, Theorem 7.3.3]), slightly strengthening the usual statement of such theorems to give explicit exponential bounds for the convergence rate to zero of certain posterior probabilities. Since our regression functions $C_{\Phi}$ take values in $SO(n)$ , they are uniformly bounded and the usual Hellinger distance occurring in such contraction theorems is then Lipschitz-equivalent to the standard $L^{2}$ -distance (see Lemma 5.14). Then Lemma 5.16 uses results of [27] to show that the key small ball condition in Theorem 5.13 can be verified for the Gaussian priors from Condition 3.1 even after they have been shrunk towards zero, if the true matrix field $\Phi_{0}$ belongs to the RKHS $\mathcal{H}$ . Next, Lemma 5.17 exploits fine properties [4, 16] of infinite-dimensional Gaussian measures to show that such ‘shrunk’ priors charge ‘sufficiently regular’ matrix fields (effectively $C^{\beta}$ -balls) with probability close enough to one that the posterior distribution inherits these regularity properties. 
This is crucial to apply the ‘forward’ estimate Theorem 2.2 and the ‘stability’ estimate (9) in the proof of Theorem 5.19 – effectively the specific structure of our inverse problem enters only in this theorem and only through these two estimates. Finally, the exponential convergence to zero of the order $e^{-(C+3)N\delta_{N}^{2}}$ obtained in Theorem 5.19 permits a ‘quantitative uniform integrability argument’ in Section 5.4.5 to deduce convergence of the whole posterior (Bochner-) mean towards the true matrix field $\Phi_{0}$ .

Let us mention that in the recent contributions [1, 20] (written after the first version of this manuscript was completed), the general proof template developed here has already been used effectively in two different non-linear inverse problems (arising with elliptic PDEs), see also Remark 3.6.

5.4.1 A general contraction theorem

Consider a collection $\mathcal{P}$ of probability density functions on some measurable space $(\mathcal{X},\mathcal{A})$ with respect to a dominating measure $\mu$ , specifically in our measurement model (3) we take

[TABLE]

where $\mathcal{X}$ is equipped with its natural product Borel $\sigma$ -algebra $\mathcal{A}$ , where $d\mu=dy\times d\lambda$ with $dy$ equal to Lebesgue measure on $\mathbb{R}^{n\times n}$ and $\lambda$ given in (5). By the Gaussianity of the $\varepsilon_{1,j,k}$ ’s these probability densities are of the form

[TABLE]

Since the map $(\Phi,y,(x,v))\mapsto p_{\Phi}(y,(x,v))$ is jointly Borel-measurable from $C(M)\times\mathcal{X}$ to $\mathbb{R}$ (using (8) and that point-evaluation is $\|\cdot\|_{\infty}$ -continuous), the posterior distribution (11) exists by standard arguments ([17], p.7) and has the desired form. In the proof of the following theorem we show in particular that the marginal density $\int\prod_{i=1}^{N}p_{\Phi}(Y_{i},(X_{i},V_{i}))d\Pi(\Phi)$ is positive on events of $P_{\Phi_{0}}^{N}$ -probability approaching one, so that (11) is well-defined also in the frequentist setting where $D_{N}\sim P_{\Phi_{0}}^{N}$ . We also define the Hellinger distance $h$ on such densities by

[TABLE]

Denote by $N(F,h,\delta)$ the minimal number of Hellinger-balls of radius $\delta$ required to cover a set $F$ of $\mu$ -densities on $\mathcal{X}$ . We then have the following result.

Theorem 5.13.

Consider a prior for $\Phi$ arising from a sequence $\Pi=\Pi_{N}$ of Borel probability measures on $\mathcal{F}\subseteq C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ and let $\Pi(\cdot|(Y_{i},(X_{i},V_{i}))_{i=1}^{N})$ be the posterior distribution arising from i.i.d. observations $(Y_{i},(X_{i},V_{i}))_{i=1}^{N}|\Phi\sim P_{\Phi}^{N}$ . Let $\Phi_{0}\in\mathcal{F}$ , let $\delta_{N}\to 0$ be a sequence such that $\sqrt{N}\delta_{N}\to\infty$ as $N\to\infty$ , and define sets

[TABLE]

Suppose for some constant $C>0$ the prior $\Pi$ satisfies for all $N$ large enough

[TABLE]

and that for some sequence $\mathcal{F}_{N}\subset\mathcal{F}$ of approximating sets for which

[TABLE]

we have the complexity bound

[TABLE]

for some fixed constant $c>0$ . Then for some large enough constant $m=m(C,c)>0$

[TABLE]

Proof.

Recall from (4) that we write $D_{N}=(Y_{i},(X_{i},V_{i}))_{i=1}^{N}$ . The proof proceeds as in the proof of [19, Theorems 7.3.1 and 7.3.3]: We first use [19, Lemma 7.3.2] and the hypothesis (61) to deduce that the events

[TABLE]

satisfy $P_{\Phi_{0}}^{N}(A_{N})\to 1$ as $N\to\infty$ . Moreover using (63) and [19, Theorem 7.1.4] with choices $\varepsilon_{0}=m^{\prime}\delta_{N}$ , any $m^{\prime}<m$ , and $\log N(\varepsilon)=cN\delta_{N}^{2}$ constant in $\varepsilon>\varepsilon_{0}$ , we deduce that for every $k>1$ there exists $m^{\prime},m$ large enough such that we can find ‘tests’ (random indicator functions) $\Psi_{N}=\Psi_{N}(D_{N})$ for which

[TABLE]

Now let us write

[TABLE]

for the event whose posterior probability we want to bound. Then by (11) and as $N\to\infty$ ,

[TABLE]

By Markov’s inequality, decomposing

[TABLE]

and using Fubini’s theorem as well as

[TABLE]

we further bound the last probability as

[TABLE]

where we have used (62) and (66) with $k$ and then $m$ large enough. ∎

The ‘information-theoretic distance’ $h$ arises naturally in such posterior contraction theorems, see [17]. The following lemma, which adapts a result due to Birgé [2] to the setting of $SO(n)$ -valued functions, shows that the Hellinger distance is Lipschitz equivalent to the standard $L^{2}$ -metric

[TABLE]

Lemma 5.14.

For $\Phi\in C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ , let $C_{\Phi}\colon\partial_{+}SM\to SO(n)$ be its non-Abelian $X$ -ray transform. Then there exist positive constants $c_{0}=c_{0}(n),c_{1}=c_{1}(n)$ such that

[TABLE]

Proof.

Write

[TABLE]

for the ‘Hellinger affinity’. By (58) and using the standard formula for the moment generating function of $N(0,1)$ -variables, the quantity $\rho(p_{\Phi},p_{\Psi})$ equals

[TABLE]

By Jensen’s inequality the last integral is greater than or equal to $\exp\{-\|C_{\Phi}-C_{\Psi}\|_{L^{2}}^{2}/8\}$ and using standard inequalities for $1-e^{-z},z>0,$ the right hand side of (68) follows. Next we notice that since $C_{\Phi}(x,v),C_{\Psi}(x,v)\in SO(n)$ , their matrix entries are all bounded by one and we hence have $\left|C_{\Phi}(x,v)-C_{\Psi}(x,v)\right|_{F}^{2}/8\leq B^{2}$ for some constant $B=B(n)$ . We can thus proceed exactly as in the proof of [2, Proposition 1] (or see Lemma 22 in [20]) to also deduce the left hand side inequality in (68). ∎
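The Gaussian computation behind the Hellinger affinity can be checked numerically in one dimension. The sketch below is an illustration under stated assumptions: it uses unit-variance Gaussians with hypothetical means `a`, `b`, the normalization $h^{2}=\int(\sqrt{p}-\sqrt{q})^{2}d\mu=2(1-\rho)$ (other normalizations differ by a constant factor), and a simple concavity argument for the lower bound in place of the argument of [2]. For an $n\times n$ matrix of independent entries the affinities multiply, which produces the Frobenius norm in the exponent.

```python
import math

def hellinger_affinity(a, b, half_width=12.0, n=20001):
    """Riemann-sum approximation of the integral of sqrt(p_a * p_b),
    where p_m is the N(m, 1) density."""
    lo = min(a, b) - half_width
    hi = max(a, b) + half_width
    dy = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        y = lo + i * dy
        p = math.exp(-0.5 * (y - a) ** 2) / math.sqrt(2.0 * math.pi)
        q = math.exp(-0.5 * (y - b) ** 2) / math.sqrt(2.0 * math.pi)
        total += math.sqrt(p * q)
    return total * dy

a, b = 0.3, -1.1                       # hypothetical means
z = (a - b) ** 2 / 8.0
rho_numeric = hellinger_affinity(a, b)
rho_closed = math.exp(-z)              # closed form exp(-|a - b|^2 / 8)
h_squared = 2.0 * (1.0 - rho_numeric)  # assumed normalization of h^2

# Two-sided bounds for 1 - exp(-z) when 0 < z <= B^2 (B is a hypothetical bound):
B = 2.0
upper = 2.0 * z                                        # from 1 - e^{-z} <= z
lower = 2.0 * z * (1.0 - math.exp(-B * B)) / (B * B)   # concavity of 1 - e^{-z}
print(rho_numeric, rho_closed, lower, h_squared, upper)
```

Integrating the pointwise two-sided bound against $\lambda$ gives an equivalence between $h^{2}$ and the squared $L^{2}$ -distance of the type stated in the lemma, with constants depending only on the bound $B$ .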

5.4.2 Verification of the prior mass condition

We now verify condition (61) in the last theorem for an explicit constant $C>0$ and the Gaussian prior from Theorem 3.2. To do this we first show that one can reduce to checking small ball conditions for $\|\cdot\|_{L^{2}(M)}$ -norms on the level of the original matrix parameter $\Phi$ .

Lemma 5.15.

For $\Phi_{0}\in C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ and $\kappa>0$ define

[TABLE]

and let $B_{N},\Pi,\delta_{N},$ be as in Theorem 5.13. Then for some $\kappa=\kappa(M,n)$ large enough we have $\mathcal{B}_{N}(\kappa)\subset B_{N}$ and thus in particular, for every $N\in\mathbb{N}$ ,

[TABLE]

Proof.

From (3) with $\Phi=\Phi_{0}$ and (58) we have

[TABLE]

Therefore, since $E^{1}_{\varepsilon}\varepsilon_{1,j,k}=0$ and $\lambda$ is the unit volume measure on $\partial_{+}SM$ ,

[TABLE]

where we have used the forward estimate (7). Thus if $\kappa\geq 2/C_{1}^{2}$ the first inequality defining $B_{N}$ is verified for $\Phi\in\mathcal{B}_{N}(\kappa)$ . To verify the second, note that all $C_{\Phi}(x,v)\in SO(n)$ are bounded in $\|\cdot\|_{L^{\infty}(\partial_{+}SM)}$ -norm by some fixed constant $B=B(n)$ . Thus

[TABLE]

for some constant $c(n)>0$ , where we have also used that $\varepsilon_{j,k}\sim^{i.i.d.}N(0,1)$ implies, for $(x,v)\in\partial_{+}SM$ fixed,

[TABLE]

and again (7), so that the overall result follows from an appropriate choice of $\kappa>0$ . ∎

We now turn to lower bound the small ball probabilities $\Pi(\mathcal{B}_{N}(\kappa))$ for the prior $\Pi$ featuring in Theorem 3.2 where for the given $\alpha$ we will choose

[TABLE]

Note that $\sqrt{N}\delta_{N}$ precisely equals the rescaling of the prior in (13). Let us recall the base RKHS $\mathcal{H}$ from Condition 3.1.

Lemma 5.16.

Let $\Pi=\times_{j=1}^{\bar{n}}\Pi_{B}$ be the prior for $\Phi$ from Theorem 3.2 with $\alpha>\beta+1,\beta>0$ , assume $\Phi_{0}\in\mathcal{H}$ and choose $\delta_{N}$ as in (70). Let $\mathcal{B}_{N}(\kappa)$ be as in Lemma 5.15. Then for every $\kappa>0$ there exists a constant $C^{\prime}=C^{\prime}(\kappa,\alpha,\|\Phi_{0}\|_{\mathcal{H}},n,M)$ such that for every $N\in\mathbb{N}$ ,

[TABLE]

In particular, for $B_{N}$ as in (60) in Theorem 5.13, there exists a finite constant

[TABLE]

such that for every $N\in\mathbb{N}$ ,

[TABLE]

Proof.

Since $\|\Phi-\Phi_{0}\|_{L^{2}(M)}\leq\bar{n}\max_{j}\|B_{j}-B_{0,j}\|_{L^{2}(M)}$ , to prove the first inequality it suffices to lower bound, by independence of the $B_{j}$ ’s,

[TABLE]

The sets $\{b:\|b\|_{L^{2}(M)}\leq c\},c>0,$ are convex and symmetric, hence by [19, Corollary 2.6.18] we have for every $j$ fixed,

[TABLE]

where we have used that

[TABLE]

in view of (13), (70), (and where we refer to [19, Exercise 2.6.5] or [17, Lemma I.16] for standard preservation properties of RKHS under linear transformations).

We next bound the centred probability which by (13), (70) equals

[TABLE]

By Condition 3.1 the RKHS of the Gaussian law of $f^{\prime}$ in $C(M)$ is continuously imbedded into $H^{\alpha}(M)$ . The unit ball $U$ of this space satisfies the bound

[TABLE]

for its $L^{2}(M)$ -covering numbers: Indeed, since the simple surface $M$ is diffeomorphic to a disk, we can extend all functions $f$ in $H^{\alpha}(M)$ to elements $f_{e}$ of the Sobolev space $H^{\alpha}(I_{2})$ on the $2$ -torus $I_{2}=(0,1]^{2}\supset M$ , with Sobolev-norm increased by at most a fixed multiplicative constant (Ch.4 in [42]). An appropriate bound for the $L^{2}(I_{2})$ -covering numbers of $\{f_{e}:f\in U\}$ is then provided in [19, (4.184)], which in turn (since $\|f-f^{\prime}\|_{L^{2}(M)}\leq\|f_{e}-f_{e}^{\prime}\|_{L^{2}(I_{2})}$ for all $f,f^{\prime}\in L^{2}(M)$ ) also bounds the $L^{2}(M)$ -covering numbers of $U$ as required.

To proceed, we can now use (70) and [27, Theorem 1.2] (with the value of $\alpha$ there equal to our $2/\alpha$ ) to lower bound the last small ball probability by

[TABLE]

for constants $c=c(\alpha),c_{0}=c_{0}(\kappa,n,\alpha)$ and since $\alpha>1$ . Combining what precedes proves the first inequality of the lemma with

[TABLE]

The second inequality (71) now follows from the first and Lemma 5.15. ∎

We note that the proof in fact shows that the constant $C$ depends only on upper bounds for $\|\Phi_{0}\|_{\mathcal{H}}$ .
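The exponent bookkeeping in the proof of Lemma 5.16 can be summarized as follows. This is a sketch under assumptions that should be checked against displays (13), (70) and (72): that $\delta_{N}=N^{-\alpha/(2\alpha+2)}$ , that the covering bound for $U$ has entropy exponent $2/\alpha$ , and that [27, Theorem 1.2] is used in the form "entropy exponent $a<2$ implies small ball exponent $2a/(2-a)$ ".

```latex
% Assumed rate and rescaling:
\[
  \delta_N = N^{-\alpha/(2\alpha+2)}, \qquad
  \sqrt{N}\,\delta_N = N^{1/(2\alpha+2)}, \qquad
  N\delta_N^2 = N^{1/(\alpha+1)} .
\]
% After undoing the rescaling of the prior, the centred small ball has radius
\[
  \epsilon_N \simeq \sqrt{N}\,\delta_N\cdot\delta_N = N^{(1-\alpha)/(2\alpha+2)} .
\]
% Entropy exponent a = 2/\alpha < 2 (this is where \alpha > 1 enters) gives
\[
  -\log\Pr\bigl(\|f'\|_{L^2(M)} \le \epsilon\bigr)
  \lesssim \epsilon^{-2a/(2-a)} = \epsilon^{-2/(\alpha-1)},
  \qquad
  \epsilon_N^{-2/(\alpha-1)} = N^{1/(\alpha+1)} = N\delta_N^2 .
\]
```

Under these assumptions both the centred small ball term and the RKHS shift term are of order $N\delta_{N}^{2}$ , which is the scaling required by condition (61).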

5.4.3 Excess mass and complexity condition

Having determined the constant $C$ in (61) for the Gaussian prior in Theorem 3.2, we now turn to verifying the remaining conditions (62) and (63) in Theorem 5.13 for a suitable choice of $\mathcal{F}_{N}$ that will provide sufficient regularity of the posterior distribution to combine it with our stability estimates for the map $\Phi\mapsto C_{\Phi}$ .

Lemma 5.17.

Let $\Pi$ be the prior from Theorem 3.2 with $\alpha>\beta+1,\beta>0$ , let $\delta_{N}$ be as in (70) and assume $N\delta^{2}_{N}\geq 1$ . For $m>0$ define subsets of $C(M,{\mathfrak{s}}{\mathfrak{o}}(n))$ as

[TABLE]

a) Then for every $K>0$ we can choose $m$ large enough such that

[TABLE]

b) Moreover for some $c=c(m,\alpha,n,vol(M))$ we have

[TABLE]

Proof.

a) Recalling (13), (70), we can identify a prior draw $\Phi$ with the vector field

[TABLE]

We denote by $\Pi^{\prime}_{\bar{n}}$ the product measure describing the law of the centred Gaussian random variable $(f^{\prime}_{1},\dots,f^{\prime}_{\bar{n}})$ in the Banach space $\times_{j=1}^{\bar{n}}C(M)$ .

Write next $\mathcal{F}_{N}=\mathcal{F}_{N,1}\cap\mathcal{F}_{N,2}$ where, with $f^{\prime}_{i,\cdot}$ corresponding to $\Phi_{i}$ , $i=1,2,$

[TABLE]

so that it suffices to bound the prior probabilities of the complements of $\mathcal{F}_{N,1},\mathcal{F}_{N,2}$ .

We first turn to $\mathcal{F}_{N,2}$ . By Condition 3.1 the vector field $(f^{\prime}_{1},\dots,f^{\prime}_{\bar{n}})$ defines a Gaussian Borel random variable in a separable linear subspace $\mathcal{S}$ of $\times_{j=1}^{\bar{n}}C^{\beta}(M)$ . By the Hahn-Banach theorem its $\times_{j=1}^{\bar{n}}C^{\beta}(M)$ -norm can then be represented as a countable supremum

[TABLE]

of bounded linear real functionals $T=(t_{m}:m\in\mathbb{N})$ defined on $(\mathcal{S},\|\cdot\|_{\times_{j=1}^{\bar{n}}C^{\beta}(M)})$ . We then apply a version of Fernique’s theorem [16], concretely [19, Theorem 2.1.20], to the centred Gaussian process $(X(t):=t(f^{\prime}_{1},\dots,f^{\prime}_{\bar{n}}):t\in T)$ to deduce that for some fixed constant $D>0$ ,

[TABLE]

and then also, for $m=m(D)$ large enough and since $N\delta_{N}^{2}\geq 1$ ,

[TABLE]

for $k$ a fixed constant, which can be made less than $e^{-KN\delta_{N}^{2}}/2$ for any $K$ provided $m=m(K,k,D)$ is chosen large enough.

It remains to show that $\Pi(\mathcal{F}_{N,1})\geq 1-\frac{1}{2}\exp\{-KN\delta_{N}^{2}\}$ for $m$ large enough. Using the continuous imbedding $\mathcal{H}\subset H^{\alpha}(M)$ with imbedding constant $c^{\prime}$ (cf. Condition 3.1), it suffices to lower bound

[TABLE]

where $O_{\mathcal{H}}$ is the unit ball in $\times_{j=1}^{\bar{n}}\mathcal{H}$ and where we define

[TABLE]

By Borell’s [4] isoperimetric inequality (see [19, Theorem 2.6.12]) the last probability is bounded below by

[TABLE]

where $\Phi=\Pr(Z\leq\cdot)$ is the cumulative distribution function of a $N(0,1)$ random variable $Z$ . By the same arguments as those leading to (73) above, we have

[TABLE]

and using the basic inequality $\Phi^{-1}(u)\geq-\sqrt{2\log_{-}u},0<u<1,$ (see [17, Lemma K.6]) and monotonicity of $\Phi$ we can further lower bound (75) by

[TABLE]

Now given $K$ define

[TABLE]

which by the previous inequality for $\Phi^{-1}$ can be made less than or equal to $(\frac{m}{c^{\prime}}-c_{2}\sqrt{2})\sqrt{N}\delta_{N}$ whenever $m=m(K,c_{2},c^{\prime})$ is large enough. Conclude that the penultimate display is lower bounded by

[TABLE]

completing the proof of Part a).
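The quantile inequality $\Phi^{-1}(u)\geq-\sqrt{2\log_{-}u}$ used in Part a) can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes that $\log_{-}u$ denotes the negative part of $\log u$, that is, $\log(1/u)$ for $0<u<1$. The function names `quantile_lower_bound` and `check_quantile_bound` are hypothetical.

```python
import math
from statistics import NormalDist


def quantile_lower_bound(u: float) -> float:
    """Return -sqrt(2 * log(1/u)), the claimed lower bound for the N(0,1) quantile at u.

    Assumption: log_- u is read as log(1/u) for 0 < u < 1.
    """
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    return -math.sqrt(2.0 * math.log(1.0 / u))


def check_quantile_bound(us) -> bool:
    """Check that the standard normal quantile dominates the lower bound on all points in us."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return all(std_normal.inv_cdf(u) >= quantile_lower_bound(u) for u in us)


# Small probabilities are the relevant regime in the proof; a few larger values are included too.
grid = [10.0 ** (-k) for k in range(1, 13)] + [0.25, 0.5, 0.75, 0.99]
print(check_quantile_bound(grid))
```

A finite grid only illustrates the inequality. The proof itself relies on [17, Lemma K.6].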

b) To prove Part b), note first that to construct a $\delta_{N}$ -covering of $\mathcal{F}_{N}$ in $\|\cdot\|_{L^{2}(M)}$ -distance it suffices, by definition of $\mathcal{F}_{N}$ , to construct such a covering for a $H^{\alpha}(M)$ -ball of radius $m$ , so that (72) and the definition of $\delta_{N}$ give (with $A^{\prime}>0$ )

[TABLE]

Lemma 5.14 and (7) imply that such a covering induces a $(C_{1}\sqrt{c_{1}})\delta_{N}$ -covering of $\mathcal{F}_{N}$ in the Hellinger distance $h$ of log-cardinality at most $bN\delta_{N}^{2}$ . Since $\|\cdot\|_{L^{2}(M)}$ is a norm and hence homogeneous, we can increase the constant from $b$ to $c=c(b,c_{1},C_{1},\alpha)$ in (76) and obtain a $\delta_{N}$ -covering for $h$ . The desired inequality in Part b) follows. ∎
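The exponent bookkeeping behind Part b) can be illustrated numerically. The sketch below assumes that (72) is the standard metric entropy bound for a Sobolev ball on a surface, of order $\delta^{-2/\alpha}$ for an $H^{\alpha}(M)$ -ball covered in $L^{2}(M)$ -distance with $\dim M=2$. With $\delta_{N}=N^{-\alpha/(2\alpha+2)}$ this order equals $N\delta_{N}^{2}=N^{1/(\alpha+1)}$. The function names are hypothetical and all constants are omitted.

```python
import math


def delta_N(N: int, alpha: float) -> float:
    """Rate delta_N = N^(-alpha / (2 alpha + 2))."""
    return N ** (-alpha / (2.0 * alpha + 2.0))


def entropy_budget(N: int, alpha: float) -> float:
    """N * delta_N^2, the order of the log-covering number allowed in Part b)."""
    return N * delta_N(N, alpha) ** 2


def sobolev_entropy(N: int, alpha: float, dim: int = 2) -> float:
    """(1 / delta_N)^(dim / alpha).

    Assumption: this is the order of the L^2 metric entropy of an H^alpha ball
    at scale delta_N on a dim-dimensional compact manifold (constants dropped).
    """
    return (1.0 / delta_N(N, alpha)) ** (dim / alpha)


for N in (10, 1000, 10**6):
    for alpha in (1.5, 3.0, 10.0):
        # Both quantities equal N^(1 / (alpha + 1)) up to floating-point error.
        print(N, alpha, entropy_budget(N, alpha), sobolev_entropy(N, alpha))
```

The two printed columns agree, which is why $\delta_{N}$ balances the entropy of the regularisation set against $N\delta_{N}^{2}$.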

Remark 5.18.

We note that the introduction of the set $\mathcal{F}_{N,1}$ and the use of Borell’s inequality in the previous Lemma could be avoided if one wishes to prove Theorem 3.2 only for some $\eta>0$ (in this case a minor adaptation of Theorem 5.13 and of (76) can be shown to give a slightly worse rate $\delta_{N}^{\prime}=N^{-\beta/(2\alpha+2)}$ in (77) below). We nevertheless give this argument in order to obtain our sharper bound for $\eta$ in (81).

5.4.4 Final contraction theorem

We now put everything together to establish a posterior contraction theorem for $\Phi$ and subsequently deduce Theorem 3.2 from it.

Theorem 5.19.

Under the hypotheses of Theorem 3.2, with $\alpha>\beta+1,\beta>0,\delta_{N}=N^{-\alpha/(2\alpha+2)}$ and $C$ from (71), we have for all $m^{\prime}$ large enough that

[TABLE]

as $N\to\infty$ . Moreover, if $\beta>2$ then we have for every integer $\bar{\beta}$ such that $1<\bar{\beta}<\beta$ and all $m^{\prime\prime}$ large enough,

[TABLE]

Remark 5.20.

The constraint $\beta>2$ in the second limit in Theorem 5.19 is only required to allow space for an integer $\bar{\beta}\in(1,\beta)$ in the following proof, when combining the interpolation inequality (79) with Theorem 2.2 for $k=\bar{\beta}\in\mathbb{N}$ . If a version of Theorem 2.2 were established for non-integer $k$ then $\beta>1$ and real $\bar{\beta}\in(1,\beta)$ would be permitted in Theorem 5.19 (and then also in Theorem 3.2).

Proof.

From Lemmata 5.16 and 5.17 with $K=2C+6$ and Theorem 5.13 we deduce for $m=m(C)$ large enough, and as $N\to\infty$

[TABLE]

Applying Lemma 5.14 gives the first limit (77) with $m^{\prime}=(1+\sqrt{c_{0}})m$ .

To prove the second limit we will apply the stability estimate Theorem 2.1 in the form (9) with $\Psi=\Phi_{0}$ . By hypothesis we have $\|\Phi_{0}\|_{C^{1}(M)}\lesssim\|\Phi_{0}\|_{C^{\alpha}(M)}<\infty$ ; as a consequence for all $\Phi$ contained in the event in (77) with $\beta>2$ , the constants $c(\Phi,\Phi_{0})$ from (6) are uniformly bounded by a fixed constant that depends on $m^{\prime},\|\Phi_{0}\|_{C^{1}(M)}$ and hence for those $\Phi$ ’s

[TABLE]

To proceed we will need a standard interpolation result for Sobolev spaces on the manifold $\partial_{+}SM$ to the effect that

[TABLE]

for all $W\in H^{k}(\partial_{+}SM)$ and any $k>1$ . [For real-valued functions this can be proved using standard arguments from Ch.4 in [42] and these results extend to matrix-fields in a straightforward way.] Moreover we will use the basic inequality

[TABLE]

for all $\Phi\in C^{\beta}(M)$ . Now Theorem 2.2 implies that for all $\Phi$ ’s in the event in (77) the corresponding $\|C_{\Phi}\|_{H^{\bar{\beta}}(\partial_{+}SM)}$ ’s are uniformly bounded by a fixed constant that depends on $m^{\prime},\beta,\bar{\beta}$ only. Likewise

[TABLE]

in view of Theorem 2.2 and since $\Phi_{0}\in C^{\alpha}$ for $\alpha>\bar{\beta}$ by hypothesis. Hence for such $\Phi$ ’s the combination of (78) and (79) with $W=C_{\Phi}-C_{\Phi_{0}},k=\bar{\beta}$ gives

[TABLE]

The second conclusion of Theorem 5.19 now follows from the preceding inequalities and (77). ∎
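As a sketch of how the pieces of this proof combine, assume that (78) bounds $\|\Phi-\Phi_{0}\|_{L^{2}(M)}$ by a constant multiple of $\|C_{\Phi}-C_{\Phi_{0}}\|_{H^{1}(\partial_{+}SM)}$ and that (79) is the usual interpolation inequality between $L^{2}$ and $H^{k}$ . Both forms are assumptions here, since the displays are not reproduced above. On the event in (77), where $\|C_{\Phi}-C_{\Phi_{0}}\|_{L^{2}(\partial_{+}SM)}\lesssim\delta_{N}$ and the $H^{\bar{\beta}}$ -norms are uniformly bounded, the chain of inequalities would read:

```latex
\begin{align*}
\|\Phi-\Phi_{0}\|_{L^{2}(M)}
  &\lesssim \|C_{\Phi}-C_{\Phi_{0}}\|_{H^{1}(\partial_{+}SM)}
  && \text{(stability, (78))} \\
  &\lesssim \|C_{\Phi}-C_{\Phi_{0}}\|_{L^{2}(\partial_{+}SM)}^{(\bar{\beta}-1)/\bar{\beta}}
            \,\|C_{\Phi}-C_{\Phi_{0}}\|_{H^{\bar{\beta}}(\partial_{+}SM)}^{1/\bar{\beta}}
  && \text{(interpolation, (79) with } k=\bar{\beta}\text{)} \\
  &\lesssim \delta_{N}^{(\bar{\beta}-1)/\bar{\beta}}
   = N^{-\frac{\alpha}{2\alpha+2}\cdot\frac{\bar{\beta}-1}{\bar{\beta}}}.
\end{align*}
```

Under these assumptions the exponent $\frac{\alpha}{2\alpha+2}\cdot\frac{\bar{\beta}-1}{\bar{\beta}}$ tends to $1/2$ as $\alpha$ and $\bar{\beta}$ grow, in line with the near- $1/\sqrt{N}$ rate for smooth matrix fields.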

5.4.5 Completion of the proof of Theorem 3.2

The last step is to show that the posterior contraction rate in the second limit of Theorem 5.19 carries over to the posterior mean $E^{\Pi}[\Phi|D_{N}]$ . For any integer $\bar{\beta}\in(1,\beta)$ and every

[TABLE]

we have as $N\to\infty$

[TABLE]

Then by the inequalities of Jensen and Cauchy-Schwarz

[TABLE]

and it suffices to show that the second summand is stochastically $O(\eta_{N})$ as $N\to\infty$ .
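A sketch of the decomposition behind this step, assuming the display takes the usual form: write $L>0$ for a hypothetical threshold constant (not the paper's notation) and split the posterior expectation according to whether $\|\Phi-\Phi_{0}\|_{L^{2}(M)}$ exceeds $L\eta_{N}$ .

```latex
\begin{align*}
\big\|E^{\Pi}[\Phi|D_{N}]-\Phi_{0}\big\|_{L^{2}(M)}
  &\le E^{\Pi}\big[\|\Phi-\Phi_{0}\|_{L^{2}(M)}\,\big|\,D_{N}\big]
  && \text{(Jensen)} \\
  &\le L\eta_{N}
     + E^{\Pi}\big[\|\Phi-\Phi_{0}\|_{L^{2}(M)}\,
       1\{\|\Phi-\Phi_{0}\|_{L^{2}(M)}>L\eta_{N}\}\,\big|\,D_{N}\big] \\
  &\le L\eta_{N}
     + \big(E^{\Pi}\big[\|\Phi-\Phi_{0}\|_{L^{2}(M)}^{2}\,\big|\,D_{N}\big]\big)^{1/2}\,
       \Pi\big(\|\Phi-\Phi_{0}\|_{L^{2}(M)}>L\eta_{N}\,\big|\,D_{N}\big)^{1/2}
  && \text{(Cauchy-Schwarz)}.
\end{align*}
```

The first term is of the desired order by construction, so only the product in the last line, the second summand, remains to be controlled.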

Arguing as in the proof of Theorem 5.13 and using Lemma 5.16 implies that the sets $A_{N}$ from (65) with $C$ from (71) satisfy $P_{\Phi_{0}}^{N}(A_{N})\to 1$ as $N\to\infty$ . Now Theorem 5.19, (11) and Markov’s inequality imply

[TABLE]

where we have also used Fubini’s theorem, (67), and that the Gaussian measure $\Pi$ is supported in $L^{2}(M)$ and hence integrates $\|\Phi\|^{2}_{L^{2}}$ to a finite constant (see, e.g., [19, Exercise 2.1.5]). ∎

Acknowledgements

We would like to thank the referee for helpful remarks and suggestions. We are further very grateful to Bill Lionheart for having introduced us to polarimetric neutron tomography and its connection to the non-abelian X-ray transform. FM was supported by NSF grant DMS-1814104 and a UC Hellman Fellowship. RN was supported by the European Research Council under ERC grant No. 647812 (UQMSI). GPP thanks the University of California at Santa Cruz and the University of Washington for hospitality while this work was in progress. GPP was supported by the Leverhulme Trust and EPSRC grant EP/R001898/1.

Bibliography

[1] Abraham, K.; Nickl, R. On statistical Calderón problems. Math. Stat. Learn. (2020), to appear.
[2] Birgé, L. Model selection in Gaussian regression with random design. Bernoulli 10 (2004), no. 6, 1039–1051.
[3] Briol, F.X.; Oates, C.; Girolami, M.; Osborne, M.; Sejdinovic, D. Probabilistic integration: A role in statistical computation? Statist. Sci. 34 (2019), no. 1, 1–22.
[4] Borell, C. The Brunn-Minkowski inequality in Gauss space. Invent. Math. 30 (1975), no. 2, 207–216.
[5] Castillo, I.; Nickl, R. Nonparametric Bernstein-von Mises theorems in Gaussian white noise. Ann. Statist. 41 (2013), no. 4, 1999–2028.
[6] Castillo, I.; Nickl, R. On the Bernstein-von Mises phenomenon for non-parametric Bayes procedures. Ann. Statist. 42 (2014), no. 5, 1941–1969.
[7] Cotter, S.L.; Roberts, G.O.; Stuart, A.M.; White, D. MCMC methods for functions: modifying old algorithms to make them faster. Statist. Sci. 28 (2013), no. 3, 424–446.
[8] Dairbekov, N.S.; Paternain, G.P.; Stefanov, P.; Uhlmann, G. The boundary rigidity problem in the presence of a magnetic field. Adv. Math. 216 (2007), no. 2, 535–609.
