Modelling columnarity of pyramidal cells in the human cerebral cortex

Andreas D. Christoffersen; Jesper M{\o}ller; Heidi S. Christensen

arXiv:1908.05065·stat.ME·November 25, 2020


TL;DR

This paper introduces a hierarchical anisotropic point process model for the spatial distribution of pyramidal cells in the human cerebral cortex, combining cylindrical clustering and interaction effects, and fits it to biological data.

Contribution

It proposes a novel hierarchical point process model capturing anisotropy and cell interactions, fitting it to pyramidal cell data and relating it to neuroscience hypotheses.

Findings

01. The final model effectively captures both repulsion and attraction among cells.

02. Hierarchical modeling improves fit over simpler models.

03. The model relates to the minicolumn hypothesis in neuroscience.

Abstract

For modelling the location of pyramidal cells in the human cerebral cortex we suggest a hierarchical point process in $\mathbb{R}^{3}$ that exhibits anisotropy in the form of cylinders extending along the $z$-axis. The model consists first of a generalised shot noise Cox process for the $xy$-coordinates, providing cylindrical clusters, and next of a Markov random field model for the $z$-coordinates conditioned on the $xy$-coordinates, providing either repulsion, aggregation, or both within specified areas of interaction. Several cases of these hierarchical point processes are fitted to two pyramidal cell datasets, and of these a final model allowing for both repulsion and attraction between the points seems adequate. We discuss how the final model relates to the so-called minicolumn hypothesis in neuroscience.

Tables

Table 1: Minimum contrast estimates of the degenerate PLCPP.

Dataset	$\hat{\kappa}$	$\hat{\sigma}$	$\widehat{\alpha a}$
L3	0.027	2.86	0.36
L5	0.0085	4.58	0.95

Table 2: For each dataset L3 and L5, minimum contrast estimates for the parameters of our final model for $X_{xy}$ (the DLCPP model in Section 5.1).

Dataset	$\hat{\kappa}$	$\hat{\sigma}$	$\widehat{\alpha a}$
L3	0.0040	5.45	2.42
L5	0.0021	6.53	3.87

Table 3: Specific choices of the parameters $\gamma_{1},\gamma_{2},\theta_{1},\theta_{2}$ and the interaction regions $B_{1}(\cdot;\theta_{1}),B_{2}(\cdot;\theta_{2})$ for five models given by the density (2). For each model, a hard core parameter $h\geq 0$ is included. Apart from the specified restrictions, it is required for models 2–5 that $B_{1}(\cdot;\theta_{1})\not\subseteq b(\cdot;h)$ (for model 2 this means that $r>h$ as already indicated) and in addition for model 5 that $B_{2}(\cdot;\theta_{2})\not\subseteq b(\cdot;h)$, where $\theta_{2}=(r_{2},t_{2})$ with $r_{1}\geq r_{2}>0$ and $t_{2}>t_{1}$.

Model	$\gamma_{1}$	$\gamma_{2}$	$B_{1}(\cdot;\theta_{1})$	$B_{2}(\cdot;\theta_{2})$	$\theta_{1}$
1	1	1	-	-	-
2	$>0$	1	$b(\cdot;r)$	-	$r>h$
3	$>0$	1	$c(\cdot;r,t)$	-	$r,t>0$
4	$>0$	1	$c(\cdot;r,t)\setminus d(\cdot;r,t)$	-	$r,t>0$
5	$>0$	$>0$	$c(\cdot;r_{1},t_{1})$	$c(\cdot;r_{2},t_{2})\setminus c(\cdot;r_{1},t_{1})$	$r_{1},t_{1}>0$

Table 4: Pseudo likelihood estimates of our final model (model 5 from Table 3 in Section 5.2) for the datasets L3 and L5.

Dataset	$\hat{\gamma}_{1}$	$\hat{\gamma}_{2}$	$\hat{h}$	$\hat{r}_{1}$	$\hat{t}_{1}$	$\hat{r}_{2}$	$\hat{t}_{2}$
L3	0.41	1.78	6.25	20	11.5	11	35.5
L5	0.51	1.68	6.77	24.25	15.5	14.75	37.25

Table A.1: Parameter values, their estimates, and standard deviations from the simulation study.

	$\gamma_{1}$	$\gamma_{2}$	$h$	$r_{1}$	$t_{1}$	$r_{2}$	$t_{2}$
True value	0.41	1.78	6.25	20	11.5	11	35.5
Mean	0.40	1.82	6.73	19.96	11.56	11.06	35.18
Standard deviation	0.048	0.081	0.447	0.608	0.292	0.618	1.357

Equations

X_{xy}=\bigcup_{(\xi,\eta)\in\Phi}P_{xy}(X_{(\xi,\eta)}\cap W).

\text{E}\left[n(Y_{B_{1}})\cdots n(Y_{B_{k}})\right]=\int_{B_{1}}\cdots\int_{B_{k}}\lambda^{(k)}(x_{1},\ldots,x_{k})\,\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}

g(x_{1},x_{2})=\frac{\lambda^{(2)}(x_{1},x_{2})}{\lambda(x_{1})\lambda(x_{2})},\qquad x_{1},x_{2}\in\mathbb{R}^{d}.

\mathcal{K}(B)=\int_{B}g(x)\,\mathrm{d}x,

\hat{\theta}=\underset{\theta}{\operatorname{argmin}}\int_{r_{\text{min}}}^{r_{\text{max}}}\left|T(\theta,r)^{q}-\hat{T}(r)^{q}\right|^{p}\,\mathrm{d}r,

\lambda^{(k)}(u_{1},\ldots,u_{k})=\text{det}[C](u_{1},\ldots,u_{k})\qquad\mbox{for $k=1,2,\ldots$, $u_{1},\ldots,u_{k}\in\mathbb{R}^{2}$},

f ((z_{i})_{i = 1}^{n} ∣

\times\mathbb{I}\left(\|(x_{i},y_{i},z_{i})-(x_{j},y_{j},z_{j})\|>h\ \text{for}\ 1\leq i<j\leq n\right),

s_{B_{k},\theta_{k}}\left((z_{i})_{i=1}^{n}\mid(x_{i},y_{i})_{i=1}^{n}\right)=\sum_{1\leq i<j\leq n}\mathbb{I}\left((x_{i},y_{i},z_{i})\in B_{k}(x_{j},y_{j},z_{j};\theta_{k})\right),

(x_{i},y_{i},z_{i})\in B_{k}(x_{j},y_{j},z_{j};\theta_{k})

B_{1}(x,y,z;\theta_{1})\cap B_{2}(x,y,z;\theta_{2})=\emptyset

f ((z_{i})_{i = 1}^{n} ∣ (x_{i}, y_{i})_{i = 1}^{n}) \propto \prod_{1\leq i<j\leq n}

f (z_{i} ∣

s_{k,i}=\sum_{j:\,j\neq i}\mathbb{I}\left((x_{j},y_{j},z_{j})\in B_{k}((x_{i},y_{i},z_{i});\theta_{k})\right),\qquad k=1,2,

c_{i}

\times\mathbb{I}\bigg(\sum_{j:\,j\neq i}\mathbb{I}\left((x_{j},y_{j},z_{j})\in B_{1}((x_{i},y_{i},z);\theta_{1})\right)=k\bigg)

\times\mathbb{I}\bigg(\sum_{j:\,j\neq i}\mathbb{I}\left((x_{j},y_{j},z_{j})\in B_{2}((x_{i},y_{i},z);\theta_{2})\right)=l\bigg)\,\mathrm{d}z.

L P (γ_{1},

=\sum_{i=1}^{n}\log f\left(z_{i}\mid(z_{1},\ldots,z_{i-1},z_{i+1},\ldots,z_{n}),(x_{j},y_{j})_{j=1}^{n}\right).


Full text

Modelling columnarity of pyramidal cells in the human cerebral cortex

Andreas Dyreborg Christoffersen

Aalborg University

Jesper Møller and Heidi Søgaard Christensen

Aalborg University

Abstract

For modelling the location of pyramidal cells in the human cerebral cortex we suggest a hierarchical point process in $\mathbb{R}^{3}$ that exhibits anisotropy in the form of cylinders extending along the $z$-axis. The model consists first of a generalised shot noise Cox process for the $xy$-coordinates, providing cylindrical clusters, and next of a Markov random field model for the $z$-coordinates conditioned on the $xy$-coordinates, providing either repulsion, aggregation, or both within specified areas of interaction. Several cases of these hierarchical point processes are fitted to two pyramidal cell datasets, and of these a final model allowing for both repulsion and attraction between the points seems adequate. We discuss how the final model relates to the so-called minicolumn hypothesis in neuroscience.

Keywords: anisotropy; cylindrical $K$ -function; determinantal point process; hierarchical point process model; line cluster point process; Markov random field; minicolumn hypothesis; pseudo likelihood

1 Introduction

The structuring of neurons in the human brain is a subject of great interest since abnormal structures may be linked to certain neurological diseases (see Buxhoeveden and Casanova, 2002; Casanova, 2007; Casanova et al., 2006; Esiri and Chance, 2006; Chance et al., 2011). A specific structure that has been extensively studied in the biological literature is the so-called ‘minicolumn’ structure of the cells in the cerebral cortex (see Buxhoeveden and Casanova, 2002; Rafati et al., 2016, and references therein). Rafati et al. (2016) characterised these minicolumns as ‘linear aggregates of neurons organised vertically in units that traverse the cortical layer II–VI, and have in humans a diameter of 35–60 $\mu\mathrm{m}$ and consist typically of 80–100 neurons’.

1.1 Data

In this paper we analyse the structuring of pyramidal cells, which make up approximately 75–80% of all neurons (Buxhoeveden and Casanova, 2002) and are pyramid-shaped cells, where the so-called apical dendrite extends from the top/apex of a pyramidal cell. Specifically, the paper is concerned with pyramidal cells from the so-called Brodmann area four of the human cerebral cortex. The neocortex constitutes most of the cerebral cortex and can be divided into six layers. We consider two datasets consisting of the locations and orientations of pyramidal cells in a section of the third and fifth layer, respectively. Here, each location is a three-dimensional coordinate representing the centre of a pyramidal cell’s nucleolus and each orientation is a unit vector representing the apical dendrite’s position relative to the corresponding nucleolus.

The left part of Figure 1 shows two point pattern datasets of 634 and 548 nucleolus locations which will be referred to as L3 (top) and L5 (bottom), respectively (for a plot of the orientations for L3, see Møller et al., 2019). Note that the observation window $W$ for the cell locations is a rectangular region with side lengths 492.70 $\mu\mathrm{m}$, 132.03 $\mu\mathrm{m}$, and 407.70 $\mu\mathrm{m}$ for L3 and 488.40 $\mu\mathrm{m}$, 138.33 $\mu\mathrm{m}$, and 495.40 $\mu\mathrm{m}$ for L5. Notice also that the nucleolus locations are recorded such that the $z$-axis is perpendicular to the so-called pial surface of the brain. In accordance with the minicolumn hypothesis, this implies that the minicolumns extend parallel to the $z$-axis. In the right part of Figure 1 we have therefore shown the 2D $xy$-locations of the two 3D point pattern datasets.

1.2 Background and objective

Møller et al. (2019) found independence between locations and orientations for L3, meaning that the two components may be modelled separately; the same conclusion has afterwards been drawn for L5. As they also found a suitable inhomogeneous Poisson process model for the orientations, and since it is difficult to visually detect any structure in the point patterns shown in Figure 1, the focus of this paper is on modelling the nucleolus locations. In particular, we aim at modelling the nucleolus locations for L3 and L5, respectively, by a spatial point process with a columnar structure and discuss to what extent this relates to the minicolumn hypothesis. Note that for the two datasets we use the same notation $X$ for the spatial point process, and we view $X$ as a random finite subset of $W$.

To the best of our knowledge the so-called Poisson line cluster point process (see Møller et al., 2016) is the only existing point process model for modelling columnar structures. This model was considered by Rafati et al. (2016) in connection with another dataset of pyramidal nucleolus locations, but it was not fitted to that dataset. For each point pattern considered in the present paper, we notice later that a more advanced model than the Poisson line cluster point process is needed; below we describe such a model for $X$.

1.3 Hierarchical point process models

We consider a hierarchical model for $X$ as follows. Note that the observation window is a product space, $W=W_{xy}\times W_{z}$ , where $W_{xy}$ is a rectangular region in the $xy$ -plane and $W_{z}$ is an interval on the $z$ -axis. Let $X_{xy}$ be the point process of $xy$ -coordinates in $X$ (i.e. given by the projection of $X$ onto the $xy$ -plane), associate to each point $(x_{i},y_{i})\in X_{xy}$ the corresponding $z$ -coordinate $z_{i}$ so that $(x_{i},y_{i},z_{i})\in X$ , and conditioned on the $xy$ -points $X_{xy}=\{(x_{i},y_{i})\}_{i=1}^{n}$ , let $X_{z}=(z_{i})_{i=1}^{n}$ be the random vector of $z$ -coordinates in $X$ (with an arbitrary ordering of these $n$ points and where $(z_{i})_{i=1}^{n}$ is short hand notation for $(z_{1},\ldots,z_{n})$ ). Clearly, $X$ is in a one-to-one correspondence with $X_{xy}$ and $X_{z}$ , and we model first $X_{xy}$ and second $X_{z}$ conditioned on $X_{xy}$ . Further details are given below and in Sections 3–5.

1.3.1 The model for $X_{xy}$

For $X_{xy}$ we consider the restriction of a cluster point process in $\mathbb{R}^{2}$ to $W_{xy}$ defined as follows. Let $\Phi$ be a stationary point process on $\mathbb{R}^{2}$ (i.e. its distribution is invariant under planar translations) with intensity $\kappa>0$ , and associate to each point $(\xi,\eta)\in\Phi$ a point process $X_{(\xi,\eta)}\subset\mathbb{R}^{3}$ that is concentrated around the line in $\mathbb{R}^{3}$ which is perpendicular to the $xy$ -plane, with intersection point $(\xi,\eta,0)$ . We refer to $X_{(\xi,\eta)}$ as the cylindrical cluster associated to $(\xi,\eta)$ . Let $P_{xy}(X_{(\xi,\eta)}\cap W)$ denote the projection onto the $xy$ -plane of the observed part of the cylindrical cluster. For short we refer to the non-empty $P_{xy}(X_{(\xi,\eta)}\cap W)$ as the projected cluster with centre point $(\xi,\eta)$ . Then we let

$$X_{xy}=\bigcup_{(\xi,\eta)\in\Phi}P_{xy}(X_{(\xi,\eta)}\cap W).$$

Further, conditioned on $\Phi$, we assume that the projected clusters are independent and each non-empty $P_{xy}(X_{(\xi,\eta)}\cap W)$ is distributed as the intersection of $W_{xy}$ with a finite planar Poisson process translated by the centre point $(\xi,\eta)$; this Poisson process has intensity function $a\alpha f$, where $a$ is the length of the interval $W_{z}$, $\alpha>0$ is a parameter, and $f$ is the probability density function of a bivariate zero-mean isotropic normal distribution with standard deviation $\sigma>0$. Thus, ignoring boundary effects, $\alpha a$ is the expected size (or number of points) of a projected cluster and $\sigma$ controls the spread of points in a projected cluster. Specifically, we first let $\Phi$ be a planar stationary Poisson process and later a stationary determinantal point process (Lavancier et al., 2015), since for a fitted Poisson process we later observe a very low expected number of points in a projected cluster and a need for a repulsive model in order to obtain less overlap between the projected clusters.

1.3.2 The model for $X_{z}$ conditioned on $X_{xy}$

Consider the special case where $\Phi$ is a planar stationary Poisson process and $X_{z}$ conditioned on $X_{xy}$ is a homogeneous binomial point process, that is, the $n$ points in $X_{z}$ are independent and uniformly distributed on $W_{z}$. Note that $X_{z}$ depends only on $X_{xy}$ through the number of points in $X_{xy}$. It becomes clear in Section 4 that this special case corresponds to a degenerate case of a Poisson line cluster point process (PLCPP) as considered in Møller et al. (2016).

In Section 5 we consider several other cases than a homogeneous binomial point process for $X_{z}$ conditioned on $X_{xy}$. In the end, in comparison with the PLCPP model suggested in Møller et al. (2016) and Rafati et al. (2016) for the description of another dataset of pyramidal nucleolus locations, we suggest and fit a rather complex hierarchical point process model. Our final model describes columnar structures of the nucleolus locations in each dataset L3 and L5, with repulsion between nucleolus locations given by a hard core condition on a small scale and a stunted cylindrical interaction region on a larger scale, as well as clustering between nucleolus locations given by an elongated cylindrical interaction region. In particular, our final fitted model describes a columnar structure but with much smaller columns than expected under the minicolumn hypothesis.

1.4 Outline

The remainder of this paper is organised as follows. In Section 2 we introduce some basic concepts and definitions needed when introducing and fitting the models in the subsequent sections. In Section 3 we investigate how the nucleolus locations deviate from complete spatial randomness. In Section 4 we also notice a deviation from a fitted degenerate PLCPP model. In Section 5 we introduce and fit various generalisations of the degenerate PLCPP model, including the final model briefly described in Sections 1.3.1–1.3.2. In particular, for estimation of parameters used in models for $X_{z}$ conditioned on $X_{xy}$ we propose in Section 5 a maximum pseudo likelihood procedure, the performance of which is studied in a simulation study reported in Appendix I. Finally, Section 6 summarises our findings and discusses directions for future research.

2 Preliminaries

The point processes $X$, $X_{xy}$, and $X_{z}$ introduced above can be viewed as the restriction to the bounded sets $W$, $W_{xy}$, and $W_{z}$ of a locally finite point process on $\mathbb{R}^{d}$ with $d=3,2,1$, respectively. Below we recall a few basic statistical tools needed in this paper, using the generic notation $Y$ for a locally finite point process defined on $\mathbb{R}^{d}$ (apart from the cases above, we have in mind that $Y$ could also be the centre process $\Phi$ from Section 1.3). Briefly speaking, this means that $Y$ is a random subset of $\mathbb{R}^{d}$ satisfying that $Y_{B}=Y\cap B$ is finite for any bounded set $B\subset\mathbb{R}^{d}$; for a more rigorous definition of point processes, see e.g. Daley and Vere-Jones (2003) or Møller and Waagepetersen (2004).

2.1 Moments

For each integer $k\geq 1$, to describe the $k$’th order moment properties of $Y$, we consider the so-called $k$’th order intensity function $\lambda^{(k)}:(\mathbb{R}^{d})^{k}\rightarrow[0,\infty)$ given that it exists. This means that for any pairwise disjoint and bounded Borel sets $B_{1},\ldots,B_{k}\subset\mathbb{R}^{d}$,

$$\text{E}\left[n(Y_{B_{1}})\cdots n(Y_{B_{k}})\right]=\int_{B_{1}}\cdots\int_{B_{k}}\lambda^{(k)}(x_{1},\ldots,x_{k})\,\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}$$

is finite, where $n(Y_{B})$ denotes the cardinality of $Y_{B}$ .

The first order intensity function $\lambda^{(1)}=\lambda$ is of particular interest and is simply referred to as the intensity function. Heuristically, $\lambda(u)\,\mathrm{d}u$ can be interpreted as the probability of observing a point from $Y$ in the infinitesimal ball of volume $\mathrm{d}u$ centred at $u$ . If the intensity function $\lambda(\cdot)\equiv\lambda$ is constant, then $\lambda|B|=\text{E}\left[n(Y_{B})\right]$ for any bounded Borel set $B\subset\mathbb{R}^{d}$ , where $|\cdot|$ is the Lebesgue measure. In this case $Y$ is said to be homogeneous and otherwise inhomogeneous. Clearly, stationarity of $Y$ (meaning that its distribution is invariant under translations in $\mathbb{R}^{d}$ ) implies homogeneity.

2.2 Functional summaries

The functional summaries described in this section will be used both for model fitting as described in Section 2.3 and for model checking as described in Section 2.4.

To summarise the second order moment properties, it is customary to consider the pair correlation function (PCF), $g$, which is defined as the ratio of the second order intensity function to the product of the first order intensity functions, that is,

$$g(x_{1},x_{2})=\frac{\lambda^{(2)}(x_{1},x_{2})}{\lambda(x_{1})\lambda(x_{2})},\qquad x_{1},x_{2}\in\mathbb{R}^{d}.$$

Heuristically, $\lambda(x_{1})\lambda(x_{2})g(x_{1},x_{2})\,\mathrm{d}x_{1}\,\mathrm{d}x_{2}$ is the probability of simultaneously observing a point from $Y$ in each of the two infinitesimal balls of volume $\mathrm{d}x_{1}$ and $\mathrm{d}x_{2}$ centred at $x_{1}$ and $x_{2}$, respectively. For a Poisson process $Y$, $g=1$. The PCF is said to be stationary when (with abuse of notation) $g(x_{1},x_{2})=g(x_{1}-x_{2})$; this is the case when $Y$ is stationary.

If the PCF is stationary, it is closely related to the so-called second order reduced moment measure, $\mathcal{K}$, given by

$$\mathcal{K}(B)=\int_{B}g(x)\,\mathrm{d}x,$$

where $B\subset\mathbb{R}^{d}$ is a Borel set (see Møller and Waagepetersen, 2004). If $Y$ is stationary, then $\lambda\mathcal{K}(B)$ can be interpreted as the expected number of points in $Y\setminus\{o\}$ within $B$ given that $Y$ has a point at the origin $o$ of $\mathbb{R}^{d}$; when considering scalings of $B$, we refer to $B$ as a structuring element. The simplest example occurs when $B$ is a ball centred at the origin and with radius $r>0$; then $K(r)=\mathcal{K}(B)$ becomes the $K$-function introduced by Ripley (1976); and often we instead consider a transformation of the $K$-function, which is called the $L$-function and defined by $L(r)=(K(r)/\omega_{d})^{1/d}$, where $\omega_{d}$ is the volume of the $d$-dimensional unit ball. In particular, if $Y$ is a stationary Poisson process, then $L(r)=r$.
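The definitions of $K$ and $L$ above can be illustrated numerically. The following is a minimal sketch and not the authors' implementation: it counts ordered pairs of points within distance $r$ and ignores edge effects (an assumption; practical estimators apply an edge correction). The function names `ripley_k_naive`, `unit_ball_volume`, and `l_function` are hypothetical.

```python
import math


def ripley_k_naive(points, r, volume):
    """Naive estimate of K(r): ordered pairs within distance r, scaled by
    (estimated intensity) * (number of points). No edge correction (assumption)."""
    n = len(points)
    intensity = n / volume
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= r:
                pairs += 1
    return pairs / (intensity * n)


def unit_ball_volume(d):
    """Volume omega_d of the d-dimensional unit ball."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def l_function(k_value, d):
    """L(r) = (K(r) / omega_d)^(1/d); equals r for a stationary Poisson process."""
    return (k_value / unit_ball_volume(d)) ** (1 / d)
```

For a stationary Poisson process $K(r)=\omega_{d}r^{d}$, so `l_function(unit_ball_volume(d) * r**d, d)` returns `r` up to rounding, which is why deviations of an estimated $L(r)$ from the identity line are easy to read off.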

For detecting cylindrical structures, Møller et al. (2016) introduced the cylindrical $K$-function which corresponds to $\mathcal{K}(B)$ when $B$ is a cylinder of height $2t$, base-radius $r$, and centre of mass at the origin. Note that Ripley’s $K$-function depends only on one argument, $r$, while the cylindrical $K$-function depends on $r$, $t$, and the direction of the cylinder. However, when $d=3$ and since the minicolumns are expected to extend along the $z$-axis, we only consider cylinders extending in this direction, effectively reducing the number of arguments to two.

We will also consider the commonly used $F$-, $G$-, and $J$-function when performing model checking; see Van Lieshout and Baddeley (1996) for definitions. Briefly, if $Y$ is stationary, $F(r)$ is the probability that $Y$ has a point within distance $r>0$ from an arbitrary fixed location in $\mathbb{R}^{d}$; $G(r)$ is the probability that $Y$ has another point within distance $r>0$ from an arbitrary fixed point in $Y$; and $J(r)=(1-G(r))/(1-F(r))$ when $F(r)<1$.
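As an illustration of the cylindrical $K$-function with the cylinder extending along the $z$-axis, the sketch below counts ordered pairs whose horizontal distance is at most $r$ and whose vertical distance is at most $t$. It is a simplified estimator under the assumption that edge effects are ignored; the names `cylindrical_k_naive` and `cylinder_volume` are hypothetical and not taken from the paper.

```python
import math


def cylindrical_k_naive(points, r, t, volume):
    """Naive cylindrical K estimate for cylinders along the z-axis with
    base radius r and height 2t. No edge correction (assumption)."""
    n = len(points)
    intensity = n / volume
    pairs = 0
    for i, (xi, yi, zi) in enumerate(points):
        for j, (xj, yj, zj) in enumerate(points):
            if i == j:
                continue
            in_base = math.hypot(xi - xj, yi - yj) <= r
            in_height = abs(zi - zj) <= t
            if in_base and in_height:
                pairs += 1
    return pairs / (intensity * n)


def cylinder_volume(r, t):
    """Volume of the structuring cylinder; the Poisson reference value of K."""
    return 2.0 * t * math.pi * r ** 2
```

Since $g=1$ for a Poisson process, the reference value under CSR is the cylinder volume $2t\pi r^{2}$, so an estimate can be compared with `cylinder_volume(r, t)` over a grid of $(r,t)$ values.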

2.3 Model fitting

In Møller et al. (2016) parameter estimation for the degenerate PLCPP model was simply done by a moment based procedure which included minimisation of a certain contrast between a theoretical second order moment functional summary and its empirical estimate. Below we describe a similar minimum contrast procedure for estimating the parameters of models for $X_{xy}$. For the models of $X_{z}$ conditioned on $X_{xy}$ we find it convenient to use a maximum pseudo likelihood procedure as detailed in Section 5.3.2.

Minimum contrast estimation is a computationally simple fitting procedure introduced by Diggle and Gratton (1984) that is applicable when a closed form expression of a functional summary, $T$, exists. The idea is to minimise the distance from the theoretical function $T$ to its empirical estimate $\hat{T}$ for the data. Specifically, if $T$ depends on the parameter vector $\theta$ and is a function of ‘distance’ $r>0$ (as for example in case of Ripley’s $K$-function), the minimum contrast estimate of $\theta$ is given by

$$\hat{\theta}=\underset{\theta}{\operatorname{argmin}}\int_{r_{\text{min}}}^{r_{\text{max}}}\left|T(\theta,r)^{q}-\hat{T}(r)^{q}\right|^{p}\,\mathrm{d}r,$$

where $0\leq r_{\text{min}}<r_{\text{max}}$, $q>0$, and $p>0$ are tuning parameters. General recommendations on $q$ are given in Guan (2009) and Diggle (2014), when $T(r)=g(r)$ or $T(r)=K(r)$; when fitting a model to $X_{xy}$, we let $p=2$, $q=1/4$, $r_{\text{min}}=0$, and $r_{\text{max}}$ be one fourth of the shortest side length of $W_{xy}$.
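The contrast criterion above can be sketched in a few lines. The example below is illustrative only and makes several assumptions: it takes $T$ to be the $K$-function of a planar (modified) Thomas process, $K(r)=\pi r^{2}+\{1-\exp(-r^{2}/(4\sigma^{2}))\}/\kappa$, approximates the integral by a Riemann sum on an equally spaced grid, and replaces a numerical optimiser by a search over a candidate list. The names `thomas_k`, `contrast`, and `minimum_contrast` are hypothetical.

```python
import math


def thomas_k(r, kappa, sigma):
    """Theoretical K-function of a planar (modified) Thomas process."""
    return math.pi * r ** 2 + (1.0 - math.exp(-r ** 2 / (4.0 * sigma ** 2))) / kappa


def contrast(theta, k_hat, r_grid, q=0.25, p=2):
    """Riemann-sum approximation of the minimum contrast criterion.
    r_grid is assumed equally spaced; k_hat holds empirical K values on it."""
    kappa, sigma = theta
    dr = r_grid[1] - r_grid[0]
    total = 0.0
    for r, k_emp in zip(r_grid, k_hat):
        total += abs(thomas_k(r, kappa, sigma) ** q - k_emp ** q) ** p
    return total * dr


def minimum_contrast(k_hat, r_grid, candidates):
    """Return the candidate (kappa, sigma) with the smallest contrast.
    A grid search stands in for a proper optimiser (assumption)."""
    return min(candidates, key=lambda theta: contrast(theta, k_hat, r_grid))
```

With the tuning choices stated above one would pass `q=0.25` and `p=2` and let `r_grid` run from $0$ to one fourth of the shortest side length of $W_{xy}$. Note that this $K$-function involves only $\kappa$ and $\sigma$; the expected cluster size would then follow from the estimated intensity.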

When the PCF has a closed form expression, alternative estimation procedures can be used, including the second order composite likelihood (see Guan, 2006; Waagepetersen, 2007), adapted second order composite likelihood (see Lavancier et al., 2018), and Palm likelihood (see Ogata and Katsura, 1991; Prokešová et al., 2016; Baddeley et al., 2016).

2.4 GERL envelope procedure

For model checking we consider informative global extreme rank length (GERL) envelope procedures (Mrkvička et al., 2018; Myllymäki et al., 2017) based on various functional summaries as described below.

In the GERL envelope procedure, the distribution of an empirical functional summary is estimated by simulations under a fitted model of interest. The procedure is a refinement of the global rank envelope procedure (Myllymäki et al., 2017), where it is recommended to use 2499 simulations for a single one-dimensional functional summary and at least 9999 simulations for a single two-dimensional functional summary (Mrkvička et al., 2016). However, we consider a concatenation of the $L$-, $G$-, $F$-, and $J$-function, in which case Mrkvička et al. (2017) recommend using more simulations. In particular, for a concatenation of $k$ one-dimensional summary functions they recommend using $k\times 2499$ simulations. For the GERL envelope procedure, Mrkvička et al. (2018) suggest that a lower number of simulations may be enough. Therefore, we use 9999 simulations. Since we consider a concatenation of one-dimensional functional summaries, we ensure that each of the functional summaries is weighted equally in the GERL envelope test by evaluating them at the same number of arguments (Mrkvička et al., 2017). Specifically, we consider 64 $r$-values for each of the functions $L$, $G$, $F$, and $J$. The cylindrical $K$-function is not part of the concatenation, and it will be evaluated over a square grid consisting of 64 $r$-values and 64 $t$-values.

2.5 Summary

We remark that the summary function used for model fitting will not be used for checking goodness of fit using the GERL envelope procedure. In summary, we use

• empirical estimates of the cylindrical $K$-function for investigating anisotropy and in particular columnarity in the 3D point patterns;

• when considering isotropic point process models for the 2D projected locations (projections onto the two-dimensional $xy$-plane), Ripley’s $K$-function for parameter estimation (both the theoretical $K$-function and its empirical estimate are used as explained in Section 2.3) and empirical estimates of the $G$-, $F$-, and $J$-function when checking for goodness of fit;

• when checking for goodness of fit for anisotropic point process models fitted to the 3D point patterns, empirical estimates of the $L$-, $G$-, $F$-, and $J$-function (which are not informative about anisotropy) and of the cylindrical $K$-function (which is informative about anisotropy).

3 Complete spatial randomness

The most natural place to begin our point pattern analysis is by testing whether a homogeneous Poisson process $X$ with intensity $\lambda>0$ (we then view $Y$ as a stationary Poisson process with the same intensity), also called complete spatial randomness (CSR), adequately describes each nucleolus point pattern dataset. Recall that this means that $n(X)$ is Poisson distributed with parameter $\lambda|W|$ and conditional on $n(X)$ the points in $X$ are independent and uniformly distributed within $W$. We shall see that CSR is too simple a model for the description of L3 and L5, and that the noticed deviations from CSR become useful for suggesting new models.

The CSR model is fully specified by its intensity, and the natural estimate $n(X)/|W|$ is equal to $2.37\times 10^{-5}$ for L3 and $1.63\times 10^{-5}$ for L5. For that fitted model, Figure 2 summarises the results of the GERL envelope procedure, based on the empirical cylindrical $K$-function with the cylindrical structuring element extending in the $x$-, $y$-, and $z$-directions, along with the areas at which the empirical cylindrical $K$-function falls outside the GERL $95\%$ envelope. It is clearly seen that CSR is rejected for both point patterns in all three directions. This is supported by the associated $p$-values of $10^{-4}$ for all three directions for L3, and $4\times 10^{-4}$ for the $x$- and $y$-directions and $10^{-4}$ for the $z$-direction for L5. We notice that the empirical cylindrical $K$-function extending in the $z$-direction falls above the upper GERL envelope for cylinders that have a height larger than approximately 35 $\mu\mathrm{m}$ for both datasets, together with a base radius of approximately 5–15 $\mu\mathrm{m}$ for L3 and 5–20 $\mu\mathrm{m}$ for L5. This trend is not observed for the empirical cylindrical $K$-functions extending in the $x$- and $y$-directions. Furthermore, the observed cylindrical $K$-function extending in the $z$-direction falls below the lower $95\%$ GERL envelope for cylinders with a height of approximately 10–25 $\mu\mathrm{m}$ and a base radius larger than 5 $\mu\mathrm{m}$. We further observe that the empirical cylindrical $K$-functions extending in the $x$- and $y$-directions fall below the lower $95\%$ GERL envelope for cylinders with a height of approximately 0–60 $\mu\mathrm{m}$ and a base radius of approximately 5–25 $\mu\mathrm{m}$. Hence, for elongated cylinders extending in the $z$-direction, we tend to see more points in the data than we expect under CSR, while for stunted cylinders we tend to see fewer points. Similarly, for cylinders that are neither very stunted nor very elongated in the $x$- and $y$-directions, we see fewer points than we expect to see under CSR. This seems to be in correspondence with columnar structures where the columns extend in the $z$-direction.
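To put the reported $p$-values in context, the sketch below computes a generic Monte Carlo $p$-value from a scalar test statistic. This is a deliberate simplification and not the GERL procedure itself, which ranks whole functions by their extreme rank length; the sketch only assumes the usual Monte Carlo convention that the observed value is counted among the $s+1$ values. The function name `monte_carlo_p_value` is hypothetical.

```python
def monte_carlo_p_value(observed, simulated):
    """One-sided Monte Carlo p-value where large values count as extreme.
    The observed value is included in the reference set (assumed convention)."""
    as_extreme = sum(1 for value in simulated if value >= observed)
    return (1 + as_extreme) / (len(simulated) + 1)
```

Under this convention, 9999 simulations give a smallest attainable $p$-value of $1/10000=10^{-4}$, which is reached when no simulated statistic is as extreme as the observed one, and three equally or more extreme simulations give $4\times 10^{-4}$.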

4 The degenerate Poisson line cluster point process

Møller et al. (2016) presented the so-called Poisson line cluster point process (PLCPP) which is useful for modelling columnar structures. Specifically, we consider a degenerate PLCPP $Y\subset\mathbb{R}^{3}$ constructed as follows.

1. Generate a stationary Poisson process $\Phi=\{(\xi_{i},\eta_{i})\}_{i=1}^{\infty}\subset\mathbb{R}^{2}$ with finite intensity $\kappa>0$. Each point $(\xi_{i},\eta_{i})\in\Phi$ corresponds to an infinite line $l_{i}$ in $\mathbb{R}^{3}$ which is perpendicular to the $xy$-plane, that is, $l_{i}=\left\{(\xi_{i},\eta_{i},z)\,|\,z\in\mathbb{R}\right\}$.

2. Conditional on $\Phi$, generate independent stationary Poisson processes $L_{1}\subset l_{1},L_{2}\subset l_{2},\ldots$ with identical and finite intensity $\alpha>0$.

3. Generate point processes $X_{1},X_{2},\ldots\subset\mathbb{R}^{3}$ by independently displacing the points of $L_{1},L_{2},\ldots$ by the zero-mean isotropic normal distribution with standard deviation $\sigma>0$.

4. Finally, set $Y=\bigcup_{i=1}^{\infty}X_{i}$ and $X=Y_{W}$.
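The construction in items 1–4 can be sketched as a simulation on the observation window. This is a minimal sketch under stated assumptions, not the authors' code: centre points are generated on $W_{xy}$ enlarged by a margin of `margin * sigma` to reduce edge effects, only the $xy$-coordinates are displaced while $z$ is drawn uniformly on $W_{z}$ (which gives the same distribution on the slab, as discussed for the degenerate case), and the helper names `poisson` and `simulate_degenerate_plcpp` are hypothetical.

```python
import random


def poisson(mean, rng):
    """Draw a Poisson(mean) count by counting unit-rate arrivals before `mean`."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_degenerate_plcpp(kappa, alpha, sigma, window, rng, margin=4.0):
    """Simulate a degenerate PLCPP restricted to a box window.
    window = ((x0, x1), (y0, y1), (z0, z1)); margin is an assumed edge buffer."""
    (x0, x1), (y0, y1), (z0, z1) = window
    pad = margin * sigma
    a = z1 - z0  # length of W_z
    ext_area = (x1 - x0 + 2 * pad) * (y1 - y0 + 2 * pad)
    points = []
    for _centre in range(poisson(kappa * ext_area, rng)):
        # Item 1: a line through (xi, eta), perpendicular to the xy-plane.
        xi = rng.uniform(x0 - pad, x1 + pad)
        eta = rng.uniform(y0 - pad, y1 + pad)
        # Item 2: Poisson number of points on the part of the line within W_z.
        for _point in range(poisson(alpha * a, rng)):
            # Item 3: isotropic normal displacement of the xy-coordinates.
            x = rng.gauss(xi, sigma)
            y = rng.gauss(eta, sigma)
            z = rng.uniform(z0, z1)
            # Item 4: keep only the points falling in the window W.
            if x0 <= x <= x1 and y0 <= y <= y1:
                points.append((x, y, z))
    return points
```

For example, calling the function with a seeded `random.Random` instance, the L3 side lengths as window, and the Table 1 estimates ($\kappa=0.027$, $\sigma=2.86$, $\alpha=0.36/407.70$) yields a pattern whose expected size is roughly $\kappa\,\alpha a\,|W_{xy}|$.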

Some comments to the construction in items 1–4 are in order.

In the general definition of the PLCPP in Møller et al., (2016), the lines $l_{1},l_{2},\ldots$ are modelled as a stationary Poisson line process. That is, the lines are not required to be perpendicular to the $xy$ -plane nor does the Poisson line process need to be degenerate (meaning that the lines are not required to be mutually parallel). Further, the dispersion density (used in item 3) can be arbitrary. However, the construction is still such that $Y$ becomes stationary. Furthermore, the same distribution of $Y$ is obtained whether we consider a three-dimensional zero-mean isotropic normal distribution for displacements in item 3 or a bivariate zero-mean isotropic normal distribution with displacements of the $xy$ -coordinates for the points of $L_{1},L_{2},\ldots$ , provided the variances of the two normal distributions are identical, cf. Møller et al., (2016).

Returning to the degenerate PLCPP of items 1–4, we imagine that each $X_{i}$ is a cylindrical cluster of points around the line $l_{i}$, where these cylindrical clusters are parallel to the $z$-axis. Furthermore, the interpretation of the parameters $\kappa$, $\alpha$, and $\sigma$ in terms of a Poisson cluster point process is similar to that in Section 1.3.1 except that we now also consider lines not intersecting $W$: if $Y$ as defined by items 1–4 is restricted to a subset $S\subset\mathbb{R}^{3}$ bounded by two planes parallel to the $xy$-plane, for specificity $S=\left\{(x,y,z)\in\mathbb{R}^{3}\,|\,z\in W_{z}\right\}$, this restricted point process can be seen as a (modified) Thomas process (see Thomas, 1949; Møller and Waagepetersen, 2004) on $\mathbb{R}^{2}$ along with independent $z$-coordinates following a uniform distribution on $W_{z}$.

To see this, first note that conditional on $\Phi=\{(\xi_{i},\eta_{i})\}_{i=1}^{\infty}$ and for all $i=1,2,\ldots$, $X_{i}$ is a Poisson process in $\mathbb{R}^{3}$ with intensity function $\lambda_{i}(x,y,z)=\alpha f(x-\xi_{i},y-\eta_{i})$, where $f$ is the probability density function of the bivariate zero-mean isotropic normal distribution with standard deviation $\sigma$ (cf. item 3 and the comments above). In turn, this implies that $Y$ conditioned on $\Phi$ is a Poisson process in $\mathbb{R}^{3}$ with intensity function $\sum_{i=1}^{\infty}\lambda_{i}(x,y,z)$. Further, since $\lambda_{i}(x,y,z)=\lambda_{i}(x,y)$ does not depend on $z$ for all $i=1,2,\ldots$, the projection of $Y_{S}$ onto the $xy$-plane, $P_{xy}(Y_{S})$, conditioned on $\Phi$ is a Poisson process with intensity function $a\sum_{i=1}^{\infty}\lambda_{i}(x,y)$, where $a$ is the length of the interval $W_{z}$. Since $\Phi$ is a stationary Poisson process, $P_{xy}(Y_{S})$ is a Thomas process with centre process intensity $\kappa$ and expected cluster size $\alpha a$ (that is, the expected number of points in $X_{i}\cap S$). Finally, from items 2–4 it follows that the $z$-coordinates in $X_{z}$ are independent and uniformly distributed on $W_{z}$, and they are independent of $X_{xy}$.

Consequently, simulating $X=Y_{W}$ is straightforwardly done by simulating a Thomas point process (on a larger set than $W_{xy}$ in order to avoid boundary effects) along with independent uniform $z$-coordinates on $W_{z}$. For simulating the Thomas point process we apply standard software from the R-package spatstat (Baddeley et al., 2016). Similarly, fitting a degenerate PLCPP to a realisation of $X$ is simply a matter of fitting a Thomas process to the point pattern consisting of the $xy$-coordinates of the points in that realisation. Since the $K$-function of the Thomas process has a closed form expression, the model can easily be fitted using minimum contrast estimation with $T(r)=K(r)$ in (1). Table 1 summarises the parameter estimates, where most notably the expected cluster size $\widehat{\alpha a}$ is less than 1 for both L3 and L5. Understanding each cylindrical cluster within $W$ as (a part of) a minicolumn, ‘these parameter estimates result in very unnatural models for the datasets, since each minicolumn within $W$ is expected to consist of less than one point’ (personal communication with Jens R. Nyengaard).
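
To make the fitting step concrete, the following Python sketch evaluates the closed form $K$-function of the planar Thomas process, $K(r)=\pi r^{2}+\{1-\exp(-r^{2}/(4\sigma^{2}))\}/\kappa$, and a minimum contrast criterion. The exact form of the criterion is an assumption here: we use an integrated squared difference of $K^{q}$ with $q=1/4$, which need not coincide with (1), and a crude grid search stands in for a numerical optimiser. For lack of data, the theoretical curve at the L3 estimates of Table 1 plays the role of the empirical $K$-function.

```python
import math

def thomas_K(r, kappa, sigma):
    """Closed-form K-function of a planar (modified) Thomas process."""
    return math.pi * r * r + (1.0 - math.exp(-r * r / (4.0 * sigma * sigma))) / kappa

def contrast(K_emp, kappa, sigma, r_grid, q=0.25):
    """Riemann-sum approximation of the integral of (K_emp(r)^q - K(r)^q)^2 over r_grid.

    The power q = 1/4 is an assumption of this sketch, not taken from the text.
    """
    dr = r_grid[1] - r_grid[0]
    return dr * sum((K_emp(r) ** q - thomas_K(r, kappa, sigma) ** q) ** 2 for r in r_grid)

# Stand-in for an empirical K-function: the theoretical curve at the L3 estimates of Table 1.
def K_emp(r):
    return thomas_K(r, 0.027, 2.86)

r_grid = [0.5 * k for k in range(1, 41)]
# Hypothetical candidate grid over (kappa, sigma).
candidates = [(kappa, sigma) for kappa in (0.01, 0.027, 0.05) for sigma in (2.0, 2.86, 4.0)]
best = min(candidates, key=lambda p: contrast(K_emp, p[0], p[1], r_grid))
```

Given $\hat{\kappa}$, the expected cluster size would then typically be estimated by dividing the estimated intensity of the projected pattern by $\hat{\kappa}$; that step is omitted from the sketch.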

Despite the fact that the fitted degenerate PLCPP models are somewhat unnatural and can hardly be interpreted as models with (the hypothesised) minicolumnar structures, the GERL envelope procedure based on a concatenation of the $F$-, $G$-, and $J$-functions shows that the Thomas process suitably fits the projected locations (projections into the two-dimensional $xy$-plane) with a $p$-value of 0.76 for L3 and 0.87 for L5. However, results from the concatenated GERL envelope procedure described in Section 2.4 indicated that the model did not suitably describe the three-dimensional nucleolus locations: the $p$-values were $10^{-4}$ and $10^{-4}$ for L3 and $10^{-4}$ and $6\cdot 10^{-4}$ for L5 when using the GERL envelope procedure based on the concatenation of the one-dimensional summary functions ($L-r$, $G$, $F$, $J$) and the cylindrical $K$-function in the $z$-direction, respectively.

Specifically, Figure 3 shows the empirical cylindrical $K$-function and indicates where it deviates from the 95% GERL envelope. Clearly, the model does account for some of the hypothesised columnarity of the data as opposed to CSR, but the empirical cylindrical $K$-function for L3 still falls above the 95% GERL envelope. Furthermore, the empirical cylindrical $K$-function for both datasets falls below the 95% GERL envelope similar to what was seen under CSR, indicating a lack of regularity of the model, which in fact is supported by the one-dimensional functional summaries (not shown). This could suggest that the cylindrical clusters should be more distinct, motivating us to generalise the degenerate PLCPP model as in the following section.

5 A generalisation of the degenerate PLCPP

As some but not all features of the data were explained by the degenerate PLCPP fitted in Section 4, we propose in this section two generalisations as follows.

1. The centre process $\Phi$ is a planar stationary point process.

2. $X_{z}$ conditioned on $X_{xy}$ is a Markov random field.

The first modification is straightforward: although the Thomas point process suitably fitted the projected point patterns, we found the parameter estimates unnatural for describing cylindrical clustering, as discussed in Section 4. Therefore we chose a repulsive centre process in order to obtain more distinguishable cylindrical clusters; this is detailed in Section 5.1. Further, the assumption of stationarity of $\Phi$ is made in order to apply a similar minimum contrast estimation procedure as in Section 4, so implicitly we make the assumption that the PCF or the $K$-function is expressible in a closed form. For the second modification we suggest a conditional model inspired by the multiscale point process and particularly the Strauss hard core point process (see e.g. Møller and Waagepetersen, 2004), which will allow for further repulsion or even aggregation between the points; this is detailed in Sections 5.2–5.3.

5.1 A determinantal point process model for the centre points

Consider a point process $Y\subset\mathbb{R}^{3}$ specified by items 1–4 in Section 4 with the exception that the centre process $\Phi$ now is an arbitrary stationary planar point process. Then, recalling the notation from Section 4, $P_{xy}(Y_{S})$ is a planar Cox process (see Møller and Waagepetersen, 2004) and even a planar generalised shot-noise Cox process (see Møller and Torrisi, 2005) driven by the random intensity function $\Lambda(x,y)=a\sum_{i=1}^{\infty}\lambda_{i}(x,y)$ for $(x,y)\in\mathbb{R}^{2}$. Moreover, $P_{xy}(Y_{S})$ corresponds to the Thomas process, but with a different centre point process (unless of course $\Phi$ is a stationary Poisson process).

In this section we focus on the case where $\Phi$ is a stationary determinantal point process (DPP; see Lavancier et al., 2015), in which case we will refer to $Y$ as the determinantal line cluster point process (DLCPP). Let $C:\mathbb{R}^{2}\times\mathbb{R}^{2}\rightarrow\mathbb{C}$ be a function, then $\Phi$ is a DPP with kernel $C$ if its $k$'th order intensity functions $\rho^{(k)}$ satisfy

$$\rho^{(k)}(u_{1},\ldots,u_{k})=\text{det}[C](u_{1},\ldots,u_{k}),\qquad u_{1},\ldots,u_{k}\in\mathbb{R}^{2},\quad k=1,2,\ldots,$$

where $\text{det}[C](u_{1},\ldots,u_{k})$ is the determinant of the $k\times k$ matrix with $(i,j)$'th entry $C(u_{i},u_{j})$. For further details on DPPs, we refer to Lavancier et al. (2015) and the references therein. When $\Phi$ is a DPP, we call $P_{xy}(Y_{S})$ a determinantal Thomas point process (DTPP). The DTPP is discussed to some extent in Møller and Christoffersen (2018), where a closed form expression of its PCF is found. Thus, the DLCPP can be fitted by fitting a DTPP to the projected data using a minimum contrast procedure (see Section 2.3).

Specifically, we let $\Phi$ be the jinc-like DPP given by the kernel $C(u_{1},u_{2})=\sqrt{\kappa/\pi}J_{1}\left(2\sqrt{\pi\kappa}\|u_{1}-u_{2}\|\right)/\|u_{1}-u_{2}\|$, where $\kappa>0$ is the intensity of $\Phi$, $J_{1}$ is the first order Bessel function of the first kind, and $\|\cdot\|$ denotes the usual planar distance. In the sense of Lavancier et al. (2015), this is the most repulsive DPP with a stationary kernel (see also Biscio and Lavancier, 2016). Simulation of the DTPP is done in three steps: first we simulate $\Phi$ on a larger region than $W_{xy}$ in order to avoid boundary effects, for which we use the functionality of spatstat; second we generate for each cluster a Poisson distributed number of points with mean $\alpha a$; and third we displace these points by a bivariate zero-mean isotropic normal distribution.
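
For illustration, the jinc-like kernel and the resulting pair correlation function of the centre process $\Phi$ can be evaluated with the Python standard library alone. This is a sketch under stated assumptions: $J_{1}$ is computed from Bessel's integral $J_{1}(x)=\pi^{-1}\int_{0}^{\pi}\cos(\tau-x\sin\tau)\,\mathrm{d}\tau$ by a midpoint rule, the function names are hypothetical, and the PCF shown is that of the DPP $\Phi$, namely $1-C(r)^{2}/\kappa^{2}$ for a real stationary kernel, and not the PCF of the DTPP. The value $\kappa=0.0040$ is the one quoted for L3 in Appendix A.

```python
import math

def bessel_j1(x, n=2000):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x*sin(tau)) dtau (midpoint rule)."""
    step = math.pi / n
    total = sum(math.cos(t - x * math.sin(t)) for t in ((k + 0.5) * step for k in range(n)))
    return total * step / math.pi

def jinc_kernel(r, kappa):
    """Jinc-like kernel as a function of the distance r = ||u1 - u2||; equals kappa at r = 0."""
    if r == 0.0:
        return kappa
    return math.sqrt(kappa / math.pi) * bessel_j1(2.0 * math.sqrt(math.pi * kappa) * r) / r

def dpp_pcf(r, kappa):
    """Pair correlation function of the centre process: 1 - (C(r) / kappa)^2."""
    return 1.0 - (jinc_kernel(r, kappa) / kappa) ** 2

kappa = 0.0040
pcf_values = [dpp_pcf(float(r), kappa) for r in range(0, 51, 5)]
```

The PCF is zero at distance zero and never exceeds one, which reflects the repulsion between the centre points.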

The parameter estimates of the jinc-like DTPP model were obtained by minimum contrast with $T(r)=g(r)$ . The parameter estimates are given in Table 2, where the estimated expected cluster size $\widehat{\alpha a}$ is ‘much smaller than expected for a minicolumn when restricting it to the observation window – provided the minicolumn hypothesis is true’ (personal communication with Jens R. Nyengaard). So we neither claim that we have a fitted model for minicolumns nor that the minicolumn hypothesis is true. Instead we have fitted a model with cylindrical clusters: from Table 2 we see, if $|W_{xy}|$ denotes the area of $W_{xy}$ , the estimated number of projected clusters is $|W_{xy}|\hat{\kappa}$ , which is approximately $260$ for L3 and $142$ for L5; the estimated expected size of a projected cluster is only 2.42 for L3 and 3.87 for L5.
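
As a rough consistency check of these numbers, the product of the estimated number of projected clusters and the estimated expected cluster size should be close to the observed number of points. The sketch below does this arithmetic in Python; the observed counts are not stated directly but are inferred here from the 63400 and 54800 point updates reported for 100 sweeps in Section 5.3.2, and the variable names are hypothetical.

```python
# Estimated number of projected clusters |W_xy| * kappa_hat, as quoted above.
n_clusters = {"L3": 260, "L5": 142}
# Estimated expected size of a projected cluster, as quoted above.
cluster_size = {"L3": 2.42, "L5": 3.87}
# Observed numbers of points, inferred from 100 updates per point (Section 5.3.2).
n_observed = {"L3": 63400 // 100, "L5": 54800 // 100}

expected = {name: n_clusters[name] * cluster_size[name] for name in n_clusters}
rel_diff = {name: abs(expected[name] - n_observed[name]) / n_observed[name]
            for name in expected}
for name in expected:
    print(f"{name}: expected {expected[name]:.1f}, observed {n_observed[name]}, "
          f"relative difference {rel_diff[name]:.3f}")
```

The products are about 629 for L3 and 550 for L5, against 634 and 548 points, so the agreement is within one percent; the small discrepancies are compatible with the rounding of the quoted estimates.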

Despite the expectation under the minicolumn hypothesis of having much higher values of $\widehat{\alpha a}$ than in Table 2, simulations of the fitted jinc-like DPP in the $xy$ -plane seem in reasonable correspondence to the projected data; see Figure 4. Furthermore, results from the GERL envelope procedure based on a concatenation of the $F$ -, $G$ -, and $J$ -function do not provide any evidence against the jinc-like DPP model for the projected points with $p$ -values of 0.67 for L3 and 0.83 for L5.

Since the jinc-like DTPP model fits the projected data well, we proceeded and added independent uniform $z$-coordinates on $W_{z}$ to the simulations, thereby obtaining simulations of the jinc-like DLCPP. Figure 5 summarises the result of the 95% GERL test based on the concatenation of one-dimensional functional summaries as well as the cylindrical $K$-function as described in Section 2.2. These plots show that the models do not account for the regularity of the data, but account for more clustering compared to the degenerate PLCPP discussed in Section 4. This leads us to our next generalisation in Section 5.2 and the specific models in Section 5.3.

5.2 General MRF models for the $z$ -coordinates given the $xy$ -coordinates

Motivated by the observations at the end of the previous section, in this section we propose to model the vector of $z$ -coordinates conditioned on the $xy$ -coordinates by a pairwise interaction MRF given in (2) below.

In general, conditioned on $X_{xy}=\{(x_{i},y_{i})\}_{i=1}^{n}$ , we propose a model for $X_{z}$ with a conditional probability density function of the form

$$f(z_{1},\ldots,z_{n}\mid(x_{1},y_{1}),\ldots,(x_{n},y_{n}))\propto\gamma_{1}^{s_{1}}\gamma_{2}^{s_{2}}\prod_{i<j}\mathbb{I}\left(\|(x_{i},y_{i},z_{i})-(x_{j},y_{j},z_{j})\|>h\right)\tag{2}$$

with notation defined as follows. The right hand side in (2) is an unnormalised (conditional) density with respect to $n$-fold Lebesgue measure on $W_{z}$; $\mathbb{I}(\cdot)$ denotes the indicator function; and $\gamma_{1}>0$, $\gamma_{2}>0$, and $h\geq 0$ are unknown parameters. When $h>0$, it is a hard core parameter ensuring a minimum distance $h$ between all pairs of points in $X$; for the pyramidal cell data it seems natural to include a hard core condition since the cells cannot overlap. Clearly, in order that the conditional density is well-defined, $h$ needs to be smaller than the length of $W_{z}$ (which is much larger than the diameter of a cell). Furthermore, for $k=1,2$,

$$s_{k}=\sum_{i<j}\mathbb{I}\left((x_{i},y_{i},z_{i})\in B_{k}(x_{j},y_{j},z_{j};\theta_{k})\right),$$

where $B_{k}(x,y,z;\theta_{k})\subset\mathbb{R}^{3}$ is an interaction region, with centre of mass $(x,y,z)$ and a ‘size and shape parameter’, $\theta_{k}$, that determines the interaction between points. It is additionally assumed that the hard core ball, given by the three-dimensional closed ball of radius $h$ and centre $(x,y,z)$, contains neither $B_{1}(x,y,z;\theta_{1})$ nor $B_{2}(x,y,z;\theta_{2})$. Finally, it is assumed that the symmetry condition

$$(x_{i},y_{i},z_{i})\in B_{k}(x_{j},y_{j},z_{j};\theta_{k})\quad\text{if and only if}\quad(x_{j},y_{j},z_{j})\in B_{k}(x_{i},y_{i},z_{i};\theta_{k}),\qquad k=1,2,$$

and the disjointness condition

$$B_{1}(x,y,z;\theta_{1})\cap B_{2}(x,y,z;\theta_{2})=\emptyset$$

are satisfied. The symmetry condition is imposed to ensure that we can interchange the roles of $i$ and $j$ .

If $\gamma_{1}=\gamma_{2}=1$ and $h=0$, then $X_{z}$ becomes the homogeneous binomial point process, which depends on $X_{xy}$ only through the number of points in $X_{xy}$. In general, using Markov random field (MRF) terminology, $X_{z}$ conditioned on $X_{xy}=\{(x_{i},y_{i})\}_{i=1}^{n}$ is a pairwise interaction MRF with sites given by the lattice $\{(x_{i},y_{i})\}_{i=1}^{n}$, where two sites $(x_{i},y_{i})$ and $(x_{j},y_{j})$ are neighbours if and only if there exist $z_{i}^{\prime},z_{j}^{\prime}\in W_{z}$ and a $k\in\{1,2\}$ such that $(x_{i},y_{i},z_{i}^{\prime})\in B_{k}(x_{j},y_{j},z_{j}^{\prime};\theta_{k})$ or $\|(x_{i},y_{i},z_{i}^{\prime})-(x_{j},y_{j},z_{j}^{\prime})\|\leq h$. In other words, the conditional distribution of $z_{i}$ (the $i$'th coordinate in $X_{z}$) given both $\{(x_{i},y_{i})\}_{i=1}^{n}$ and all $z_{j}$ with $j\not=i$ depends only on those $z_{j}$ where $(x_{i},y_{i})$ and $(x_{j},y_{j})$ are neighbours. We may also say that $(x_{i},y_{i},z_{i})$ and $(x_{j},y_{j},z_{j})$ interact in either of two cases. The first case is that $(x_{i},y_{i},z_{i})$ is in the union of $B_{1}(x_{j},y_{j},z_{j};\theta_{1})$ and $B_{2}(x_{j},y_{j},z_{j};\theta_{2})$; we refer to this union of sets as the region of interaction of $(x_{j},y_{j},z_{j})$ (note that we can interchange the roles of $i$ and $j$). The second case is that $\|(x_{i},y_{i},z_{i})-(x_{j},y_{j},z_{j})\|\leq h$ (that is, the hard core condition is not satisfied, which happens with probability 0). The interaction within $B_{k}$ causes repulsion/inhibition of the points in $X$ if $\gamma_{k}<1$ and attraction/clustering if $\gamma_{k}>1$, for $k=1,2$. Thus, apart from the hard core condition, the model allows for both repulsion and attraction but within different interaction regions $B_{1}$ and $B_{2}$.

Note that our hierarchical model construction yields a more flexible model for $X$ but we ignore edge effects in the sense that we have only specified a model for first $P_{xy}(Y_{S})$ and second $X_{z}$ conditioned on $X_{xy}=P_{xy}(Y_{S})\cap W_{xy}$ , thereby ignoring a possible influence of points in $Y\setminus W$ when (2) is used in the latter step (unless it specifies a binomial point process). This simplification is just made for mathematical convenience; indeed it would be interesting to construct a model taking edge effects into account so that $Y$ becomes stationary, but we leave this challenging issue for future research.

5.3 Fitting specific MRF models for the $z$ -coordinates given the $xy$ -coordinates

Below we first specify the ingredients of the conditional probability density function given in (2) for various models and discuss the overall conclusions, next describe how to find parameter estimates, and finally discuss how well the estimated models fit the data. Note that although we have not specified a stationary model for $Y$, it still makes sense to interpret plots of the empirical cylindrical $K$-function and of the $\hat{F}$-, $\hat{G}$-, $\hat{J}$-, and $\hat{L}$-function, since we have stationarity in the $xy$-plane and approximate stationarity in the $z$-direction (as the density (2) is invariant under ‘translations of $(z_{1},\ldots,z_{n})$ within $W_{z}$’).

5.3.1 Selected models

In our search for a suitable model for the nucleolus locations, we considered many special cases of (2). Table 3 summarises five selected models, where $b((x,y,z);r)$ is the ball with centre $(x,y,z)$ and radius $r$ , and where $c((x,y,z);r,t)$ and $d((x,y,z);r,t)$ denote the cylinder and double cone, respectively, with centre of mass at $(x,y,z)$ , height $2t$ , base radius $r$ , and extending in the $z$ -direction. Note that in Table 3 we do not need to specify $B_{k}(\cdot;\theta_{k})$ when $\gamma_{k}=1$ . For the final model 5, (2) becomes the conditional density

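$$f(z_{1},\ldots,z_{n}\mid(x_{1},y_{1}),\ldots,(x_{n},y_{n}))\propto\prod_{i<j}\gamma_{1}^{\mathbb{I}\left((x_{i},y_{i},z_{i})\in B_{1}(x_{j},y_{j},z_{j};\theta_{1})\right)}\gamma_{2}^{\mathbb{I}\left((x_{i},y_{i},z_{i})\in B_{2}(x_{j},y_{j},z_{j};\theta_{2})\right)}\mathbb{I}\left(\|(x_{i},y_{i},z_{i})-(x_{j},y_{j},z_{j})\|>h\right)$$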

where the cylindrical interaction regions $B_{1}(x_{j},y_{j},z_{j};\theta_{1})=c((x_{j},y_{j},z_{j});r_{1},t_{1})$ and $B_{2}(x_{j},y_{j},z_{j};\theta_{2})=c((x_{j},y_{j},z_{j});r_{2},t_{2})\setminus c((x_{j},y_{j},z_{j});r_{1},t_{1})$ are illustrated in Figure 6, $h\geq 0$ is still the hard core parameter, $\gamma_{1}>0$ and $\gamma_{2}>0$ are interaction parameters, and $0<r_{2}\leq r_{1}$ and $0<t_{1}<t_{2}$ are parameters which determine the ‘range of interaction’ and satisfy $h<\sqrt{t_{k}^{2}+r_{k}^{2}}$ for $k=1,2$. These restrictions on the parameters are empirically motivated by use of functional summaries as detailed below.
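
To fix ideas, the unnormalised log density of model 5 can be evaluated as in the following Python sketch. It assumes closed cylinders (boundary points count as inside), and the function names and the three example points are hypothetical. The parameter values are the L3 values quoted in Appendix A.

```python
import math
from itertools import combinations

def in_cylinder(p, q, r, t):
    """True if p lies in c(q; r, t): base radius r, height 2t, centred at q, axis along z."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= r and abs(p[2] - q[2]) <= t

def log_density_model5(points, gamma1, gamma2, h, r1, t1, r2, t2):
    """Unnormalised log conditional density of the z-coordinates under model 5."""
    total = 0.0
    for p, q in combinations(points, 2):
        if math.dist(p, q) <= h:
            return -math.inf              # hard core condition violated
        if in_cylinder(p, q, r1, t1):
            total += math.log(gamma1)     # pair interacts through B_1
        elif in_cylinder(p, q, r2, t2):
            total += math.log(gamma2)     # pair interacts through B_2 = c(r2, t2) minus c(r1, t1)
    return total

# L3 parameter values as quoted in Appendix A; the points are made up for illustration.
par = dict(gamma1=0.41, gamma2=1.78, h=6.25, r1=20.0, t1=11.5, r2=11.0, t2=35.5)
points = [(0.0, 0.0, 0.0), (5.0, 0.0, 10.0), (3.0, 0.0, 30.0)]
value = log_density_model5(points, **par)
```

In this example the first two points interact through $B_{1}$, while the third point interacts with each of the other two through $B_{2}$, so the value equals $\log\gamma_{1}+2\log\gamma_{2}$.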

First, we considered model 1 which is a hard core model if $h>0$ and one of the simplest ways of modelling regularity; note that model 1 with $h=0$ is the binomial point process with a uniform density as considered in Section 4. Though accounting for small distance repulsion, when fitted to the data, model 1 turned out not to account for the repulsion at larger scales. Second, we considered model 2 which is a conditional Strauss model with a hard core condition (see Møller and Waagepetersen, 2004, and the references therein). For this model the scale of repulsion for the $z$-coordinates seemed too great for points with similar $xy$-coordinates, and therefore we found it natural to replace the spherical interaction region with a cylinder, yielding model 3. However, model 3 did not correct the problem, and continuing with a single region of interaction we next suggested model 4 with a region given by a cylinder minus a double cone. Model 4 does to a smaller degree penalise the occurrence of points with similar $xy$-coordinates. However, this model was not suitable either. Models 1–4 were discarded by GERL tests with extremely small $p$-values. Finally, we considered model 5 which is a more flexible model that allows for both repulsion and aggregation within cylinder shaped interaction regions, cf. the discussion in Section 1.3.2. For simplicity, all the models were also considered without a hard core condition, that is, with $h=0$, but were in every case found inadequate.

5.3.2 Results based on maximum pseudo likelihood

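The likelihood function corresponding to (2) involves a normalising constant which needs to be approximated by Markov chain Monte Carlo methods. We propose an easier alternative based on the pseudo likelihood function (Besag, 1975) defined as follows when the data is given by $\{(x_{i},y_{i},z_{i})\}_{i=1}^{n}\subset W$. For $i=1,\ldots,n$, the $i$'th full conditional density associated to (2) is

$$f_{i}(z_{i}\mid(x_{1},y_{1}),\ldots,(x_{n},y_{n}),z_{j},j\not=i)=\frac{1}{c_{i}}\gamma_{1}^{s_{1,i}}\gamma_{2}^{s_{2,i}}\prod_{j\not=i}\mathbb{I}\left(\|(x_{i},y_{i},z_{i})-(x_{j},y_{j},z_{j})\|>h\right)\tag{3}$$

where we define

$$s_{k,i}=\sum_{j\not=i}\mathbb{I}\left((x_{i},y_{i},z_{i})\in B_{k}(x_{j},y_{j},z_{j};\theta_{k})\right),\qquad k=1,2,$$

and where the normalising constant is given by

$$c_{i}=\int_{W_{z}}\gamma_{1}^{s_{1,i}}\gamma_{2}^{s_{2,i}}\prod_{j\not=i}\mathbb{I}\left(\|(x_{i},y_{i},z_{i})-(x_{j},y_{j},z_{j})\|>h\right)\,\mathrm{d}z_{i}.$$

To estimate the model parameters we maximise the log pseudo likelihood given by

$$\sum_{i=1}^{n}\log f_{i}(z_{i}\mid(x_{1},y_{1}),\ldots,(x_{n},y_{n}),z_{j},j\not=i).$$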

Clearly, by (3) the maximum pseudo likelihood estimate (MPLE) $\hat{h}$ of $h$ is the minimum distance between any distinct pair of points $(x_{i},y_{i},z_{i})$ and $(x_{j},y_{j},z_{j})$ in the data. This in fact also corresponds to the maximum likelihood estimate. For $h=\hat{h}$ and for fixed $\theta_{1}$ and $\theta_{2}$ , we easily obtain the following. For each of models 2–4, the MPLE of $\gamma_{1}$ exists if and only if $s_{1,i}\not=0$ for some $i$ , and then the log pseudo likelihood function is strictly concave with respect to $\log\gamma_{1}$ . For model 5, the MPLE of $(\gamma_{1},\gamma_{2})$ exists if and only if $s_{1,i}\not=0$ for some $i$ and $s_{2,j}\not=0$ for some $j$ , and then the log pseudo likelihood function is strictly concave with respect to $(\log\gamma_{1},\log\gamma_{2})$ . Therefore, the (profile) log pseudo likelihood can be maximised by a combination of a grid search over $\theta_{1}$ and $\theta_{2}$ and numerical optimisation with respect to $\gamma_{1}$ and $\gamma_{2}$ .
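
The quantities entering the log pseudo likelihood can be sketched in Python as follows, here for model 5. This is an illustration under assumptions of our own: each normalising constant is approximated by a midpoint rule on a grid over $W_{z}$ (the integrand is piecewise constant, so an exact evaluation would also be possible), the cylinders are taken to be closed, and all function names and data points are hypothetical. The grid search over $\theta_{1}$ and $\theta_{2}$ and the numerical optimisation over $\gamma_{1}$ and $\gamma_{2}$ are not shown.

```python
import math
from itertools import combinations

def s_counts(i, z, pts, h, r1, t1, r2, t2):
    """Return (s_1i, s_2i) with z_i replaced by z, or None if the hard core condition fails."""
    xi, yi = pts[i][0], pts[i][1]
    s1 = s2 = 0
    for j, (xj, yj, zj) in enumerate(pts):
        if j == i:
            continue
        d_xy, d_z = math.hypot(xi - xj, yi - yj), abs(z - zj)
        if math.hypot(d_xy, d_z) <= h:
            return None
        if d_xy <= r1 and d_z <= t1:
            s1 += 1                        # neighbour in B_1
        elif d_xy <= r2 and d_z <= t2:
            s2 += 1                        # neighbour in B_2
    return s1, s2

def log_pseudo_likelihood(pts, wz, gamma1, gamma2, h, r1, t1, r2, t2, n_grid=400):
    """Sum over i of the log full conditional density of z_i."""
    z0, z1 = wz
    dz = (z1 - z0) / n_grid
    total = 0.0
    for i, p in enumerate(pts):
        s = s_counts(i, p[2], pts, h, r1, t1, r2, t2)
        if s is None:
            return -math.inf
        c_i = 0.0                          # midpoint-rule approximation of the normalising constant
        for k in range(n_grid):
            sk = s_counts(i, z0 + (k + 0.5) * dz, pts, h, r1, t1, r2, t2)
            if sk is not None:
                c_i += gamma1 ** sk[0] * gamma2 ** sk[1] * dz
        total += s[0] * math.log(gamma1) + s[1] * math.log(gamma2) - math.log(c_i)
    return total

def hard_core_estimate(pts):
    """Minimum distance between distinct points, i.e. the estimate of h described above."""
    return min(math.dist(p, q) for p, q in combinations(pts, 2))

pts = [(0.0, 0.0, 5.0), (3.0, 0.0, 12.0), (40.0, 40.0, 20.0)]
h_hat = hard_core_estimate(pts)
lpl_binomial = log_pseudo_likelihood(pts, (0.0, 50.0), 1.0, 1.0, 0.0, 20.0, 11.5, 11.0, 35.5)
```

As a sanity check, with $\gamma_{1}=\gamma_{2}=1$ and $h=0$ every full conditional is uniform on $W_{z}$, so the log pseudo likelihood reduces to $-n\log a$. Because the sketch treats a distance equal to $h$ as a violation, the objective should be evaluated at a value of $h$ marginally below the minimum distance rather than exactly at it.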

Each of the five models in Table 3 was fitted to L3 and L5 by finding the maximum pseudo likelihood estimate, where for the numerical optimisation we used optim (a general-purpose optimisation function from the R-package stats). Table 4 shows the maximum pseudo likelihood estimates of model 5 for the two datasets. Most notably, around each point in $X$, there is a repulsive interaction region (since $\hat{\gamma}_{1}<1$) within a stunted cylinder $B_{1}$ and an attractive interaction region (since $\hat{\gamma}_{2}>1$) within the union $B_{2}$ of two elongated cylinders, see again Figure 6. In particular the fitted model is in accordance with the empirical findings as noted later when the cylindrical $K$-function extending in the $z$-direction of Figure 2 is discussed. Specifically, there is repulsion between two points $(x_{i},y_{i},z_{i})$ and $(x_{j},y_{j},z_{j})$ in $X$ if $(x_{i},y_{i})$ and $(x_{j},y_{j})$ lie within distance 20 $\mu\mathrm{m}$ for L3 and 24.25 $\mu\mathrm{m}$ for L5 and if $z_{i}$ and $z_{j}$ lie within distance 11.5 $\mu\mathrm{m}$ for L3 and 15.5 $\mu\mathrm{m}$ for L5. Analogously, there is attraction between $(x_{i},y_{i},z_{i})$ and $(x_{j},y_{j},z_{j})$ if $(x_{i},y_{i})$ and $(x_{j},y_{j})$ lie within distance 11 $\mu\mathrm{m}$ for L3 and 14.75 $\mu\mathrm{m}$ for L5 and if $|z_{i}-z_{j}|$ is between 11.5 and 35.5 $\mu\mathrm{m}$ for L3 and between 15.5 and 37.25 $\mu\mathrm{m}$ for L5. Moreover, the estimated hard core $\hat{h}$ is between 6 and 7 $\mu\mathrm{m}$, which is in accordance with ‘distance between the nucleolus and the membrane of a pyramidal cell’ (personal communication with Jens R. Nyengaard). Note that $2\hat{h}$ (the diameter of an estimated hard core ball) is about half of $2\hat{t}_{1}$ (the estimated height of $B_{1}$). Finally, comparing Tables 2 and 4, we note that the two ‘clustering parameters’ $2\hat{\sigma}$ and $\hat{r}_{2}$ are of the same order.

Model checking was performed using GERL envelope procedures based on the concatenation of one-dimensional functional summaries as well as the cylindrical $K$ -function as discussed in Section 2.4. For the fitted models, model 5 was the most appropriate with $p$ -values of 0.15 and 0.32 for L3 and 0.23 and 0.02 for L5 when using the GERL envelope procedure based on the concatenation of the one-dimensional summary functions ( $L-r$ , $G$ , $F$ , $J$ ) and the cylindrical $K$ -function in the $z$ -direction, respectively; the 95% GERL envelopes are visualised in Figure 7. Thus no evidence is seen against the fitted models summarised in Table 4 for L3 while only slight evidence is present for L5.

Finally, note that simulations from each of models 1–5 can straightforwardly be obtained using a Metropolis-Hastings algorithm for a fixed number of points and given a realisation of the $xy$ -coordinates. Specifically, we used Algorithm 7.1 in Møller and Waagepetersen, (2004) but with a systematic updating scheme cycling over the point indexes 1 to $n$ , using a uniform proposal for a new point in $W_{z}$ and a Hastings ratio calculated from the full conditional (3). We successively updated each point 100 times under the systematic updating scheme, corresponding to 63400 and 54800 point updates for L3 and L5, respectively.
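
A minimal Python sketch of such a systematic-scan Metropolis-Hastings sampler is given below for model 5. It is not a transcription of Algorithm 7.1 but an illustration under our own assumptions: closed cylinders, hypothetical function names, and a made-up starting configuration on a small grid that satisfies the hard core condition. Since the proposal is uniform on $W_{z}$, the Hastings ratio reduces to the ratio of the unnormalised full conditionals.

```python
import math
import random
from itertools import combinations

def log_full_conditional(i, z, pts, gamma1, gamma2, h, r1, t1, r2, t2):
    """Unnormalised log full conditional of z_i = z under model 5 (-inf if the hard core fails)."""
    total = 0.0
    for j, q in enumerate(pts):
        if j == i:
            continue
        d_xy, d_z = math.hypot(pts[i][0] - q[0], pts[i][1] - q[1]), abs(z - q[2])
        if math.hypot(d_xy, d_z) <= h:
            return -math.inf
        if d_xy <= r1 and d_z <= t1:
            total += math.log(gamma1)
        elif d_xy <= r2 and d_z <= t2:
            total += math.log(gamma2)
    return total

def mh_sweeps(pts, wz, n_sweeps, rng, **par):
    """Update z_1, ..., z_n in turn, n_sweeps times; the xy-coordinates stay fixed."""
    pts = [list(p) for p in pts]
    for _ in range(n_sweeps):
        for i in range(len(pts)):                      # systematic scan over the point indexes
            z_new = rng.uniform(wz[0], wz[1])          # uniform proposal on W_z
            new = log_full_conditional(i, z_new, pts, **par)
            if new == -math.inf:
                continue                               # proposal violates the hard core: reject
            cur = log_full_conditional(i, pts[i][2], pts, **par)
            if rng.random() < math.exp(min(0.0, new - cur)):
                pts[i][2] = z_new                      # accept with the Hastings probability
    return [tuple(p) for p in pts]

# L3 parameter values as quoted in Appendix A; the starting pattern is hypothetical.
par = dict(gamma1=0.41, gamma2=1.78, h=6.25, r1=20.0, t1=11.5, r2=11.0, t2=35.5)
start = [(5.0 * i, 5.0 * j, 10.0 if (i + j) % 2 == 0 else 20.0)
         for i in range(5) for j in range(5)]
sample = mh_sweeps(start, (0.0, 50.0), n_sweeps=100, rng=random.Random(7), **par)
```

With $n$ points, 100 sweeps amount to $100n$ single-point updates, which matches the counts quoted above.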

6 Concluding remarks

The structure of minicolumns (clusters) of points has been investigated in relation to illnesses such as Down's syndrome (Buxhoeveden et al., 2002), schizophrenia (Casanova, 2007), autism (Casanova et al., 2006), and Alzheimer's disease (Esiri and Chance, 2006; Chance et al., 2011). However, none of the methods used treat the positions of cells as a 3D point pattern. This paper has contributed to 3D point process methodology by developing new models and inference procedures based on a combination of moment based estimation and maximum pseudo likelihood estimation, and by illustrating the usefulness of various 3D functional summaries, in particular the cylindrical $K$-function, in combination with global rank envelope test procedures.

We fitted several hierarchical models for the two 3D point pattern datasets of nucleolus locations and related our findings to the minicolumn hypothesis, which claims that minicolumns extend parallel to the $z$-axis. Starting with a homogeneous Poisson process and second with the degenerate Poisson line cluster point process model in Møller et al. (2016), we proceeded by developing more advanced point process models for a columnar structure along the $z$-axis. Our final model can be summarised as follows. For the $xy$-coordinates, we argued a need for a cluster point process given by a generalised shot noise Cox process, where the cluster centres are described by a repulsive point process given by a stationary determinantal point process, and where the offspring distribution is given by a bivariate isotropic normal distribution. For the $z$-coordinates conditioned on the $xy$-coordinates, we specified a pairwise interaction Markov random field model with repulsion between nucleolus locations given by a hard core condition on a small scale and a stunted cylindrical interaction region on a larger scale, as well as clustering between nucleolus locations given by an elongated cylindrical interaction region. The final model specifies much smaller columns than expected under the minicolumn hypothesis. Although the same type of model describes the two datasets, very different parameter estimates were obtained.

All the studies in Buxhoeveden et al. (2002), Casanova et al. (2006), Casanova (2007), and Chance et al. (2011) show deviations in the minicolumn structure in the brains of sick subjects compared to normal ones. For future applications with several 3D point pattern datasets belonging to different groups (related to normal and sick subjects), it remains to develop statistical methods for comparison of the groups. We hope that our methodology may serve as an inspiration for this and other future developments.

Acknowledgements

This work was supported by The Danish Council for Independent Research | Natural Sciences, grant DFF – 7014-00074 ‘Statistics for point processes in space and beyond’, and by the ‘Centre for Stochastic Geometry and Advanced Bioimaging’, funded by grant 8721 from the Villum Foundation. We are thankful to Ali H. Rafati for collecting the data analysed in this paper and to Jens R. Nyengaard and Ninna Vihrs for helpful comments.

Appendix A Appendix

A simulation study was performed in order to investigate the performance of the maximum pseudo likelihood estimation procedure used for fitting our final model (model 5 from Table 3) with a DTPP for the centre points in the $xy$ -plane. The simulations were obtained by

1. simulating 100 DTPPs with $\kappa=0.0040$, $\sigma=5.45$, and $\alpha a=2.42$, corresponding to the parameter estimates in Table 2 for the dataset L3;

2. for each of the 100 simulated DTPPs, simulating the associated vector of $z$-coordinates from the pairwise interaction MRF specified by the conditional density (2) and with regions of interaction as specified for model 5 in Table 3. For this we used a Metropolis-Hastings algorithm as described in Section 5.3.2 with $\gamma_{1}=0.41$, $\gamma_{2}=1.78$, $h=6.25$, $r_{1}=20$, $t_{1}=11.5$, $r_{2}=11$, $t_{2}=35.5$, corresponding to the parameter estimates in Table 4 for the dataset L3.

The results of the simulation study are presented in Table A.1 and Figure A.1. Despite the high number of parameters, the estimates are distributed with most mass close to true parameter values and with a small standard deviation. Not surprisingly, the hard core parameter $h$ is an exception that is biased with a right skewed distribution, since the simulated point patterns always have to satisfy the hard core condition, whilst for a given simulated point pattern the estimate of $h$ is the shortest distance between any two points.
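
The direction of this bias can be illustrated with a toy experiment in Python. The sketch below is a simplified stand-in rather than the model of the simulation study: it draws a small hard core pattern in the unit cube by rejection sampling, with a hypothetical hard core distance and number of points, and records the minimum interpoint distance as the estimate.

```python
import math
import random
from itertools import combinations

def min_pair_distance(points):
    """Shortest distance between two distinct points, i.e. the estimate of h."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))

def sample_hard_core(n, h, rng):
    """Rejection sampling: n uniform points in the unit cube, redrawn until all pairs are > h apart."""
    while True:
        points = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
        if min_pair_distance(points) > h:
            return points

rng = random.Random(2020)
true_h = 0.1                                  # hypothetical hard core distance
estimates = [min_pair_distance(sample_hard_core(8, true_h, rng)) for _ in range(200)]
mean_estimate = sum(estimates) / len(estimates)
```

By construction every estimate exceeds the true value, so the estimator is biased upwards.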

Bibliography

Baddeley, A., Rubak, E., and Turner, R. (2016). Spatial Point Patterns: Methodology and Applications with R. Chapman & Hall/CRC, New York.
Besag, J. (1975). Statistical analysis of non-lattice data. The Statistician, 24:179–195.
Biscio, C. A. N. and Lavancier, F. (2016). Quantifying repulsiveness of determinantal point processes. Bernoulli, 22:2001–2028.
Buxhoeveden, D., Fobbs, A., Roy, E., and Casanova, M. (2002). Quantitative comparison of radial cell columns in children with Down's syndrome and controls. Journal of Intellectual Disability Research, 46:76–81.
Buxhoeveden, D. P. and Casanova, M. F. (2002). The minicolumn hypothesis in neuroscience. Brain, 125:935–951.
Casanova, M. F. (2007). Schizophrenia seen as a deficit in the modulation of cortical minicolumns by monoaminergic systems. International Review of Psychiatry, 19:361–372.
Casanova, M. F., van Kooten, I. A. J., Switala, A. E., van Engeland, H., Heinsen, H., Steinbusch, H. W. M., Hof, P. R., Trippe, J., Stone, J., and Schmitz, C. (2006). Minicolumnar abnormalities in autism. Acta Neuropathologica (Berl), 112:287–303.
Chance, S. A., Clover, L., Cousijn, H., Currah, L., Pettingill, R., and Esiri, M. M. (2011). Microanatomical correlates of cognitive ability and decline: Normal ageing, MCI, and Alzheimer's disease. Cerebral Cortex, 21:1870–1878.
