Fully automated MRI‐based analysis of the locus coeruleus in aging and Alzheimer's disease dementia using ELSI‐Net

Max Dünnwald; Friedrich Krohn; Alessandro Sciarra; Mousumi Sarkar; Anja Schneider; Klaus Fliessbach; Okka Kimmich; Frank Jessen; Ayda Rostamzadeh; Wenzel Glanz; Enise I. Incesoy; Stefan Teipel; Ingo Kilimann; Doreen Goerss; Annika Spottke; Johanna Brustkern; Michael T. Heneka; Frederic Brosseron; Falk Lüsebrink; Dorothea Hämmerer; Emrah Düzel; Klaus Tönnies; Steffen Oeltze‐Jafra; Matthew J. Betts

PMC · DOI: 10.1002/dad2.70118 · May 12, 2025

Open Access

TL;DR

This paper introduces ELSI-Net, an automated MRI-based method to analyze the locus coeruleus in aging and Alzheimer's disease, showing strong agreement with expert ratings and correlations with disease biomarkers.

Contribution

A novel deep learning method for automated LC segmentation and feature extraction that achieves high agreement with manual ratings and detects LC changes in aging and AD.

Findings

01

ELSI-Net shows high agreement with expert raters and published LC atlases.

02

LC integrity differences in aging and Alzheimer's disease were successfully detected.

03

ELSI-Net's LC mask volume correlates with cerebrospinal fluid biomarkers of AD pathology.

Abstract

The locus coeruleus (LC) is linked to the development and pathophysiology of neurodegenerative diseases such as Alzheimer's disease (AD). Magnetic resonance imaging–based LC features have shown potential to assess LC integrity in vivo. We present a deep learning–based LC segmentation and feature extraction method called Ensemble‐based Locus Coeruleus Segmentation Network (ELSI‐Net) and apply it to healthy aging and AD dementia datasets. Agreement with expert raters and previously published LC atlases was assessed. We aimed to reproduce previously reported differences in LC integrity in aging and AD dementia and to correlate extracted features with cerebrospinal fluid (CSF) biomarkers of AD pathology. ELSI‐Net demonstrated high agreement with expert raters and published atlases. Previously reported group differences in LC integrity were detected and correlations with CSF biomarkers were found.…

Figures (7)

Schematic illustration of the Ensemble‐based Locus Coeruleus Segmentation Network (ELSI‐Net) pipeline for automated LC analysis. The encircled red numbers indicate the three main changes compared to previous work 14 : (1) ensemble networks with majority vote, (2) additional intensity normalization prior to the segmentation step, (3) new reference region generation based on LC masks without requiring a pons segmentation. CR, contrast ratio; LC, locus coeruleus; MRI, magnetic resonance imaging; ROI, region of interest.

A, Healthy Aging Dataset (HAD). B, DZNE Longitudinal Cognitive Impairment and Dementia study (DELCODE) set. Box and swarm plots of Dice similarity coefficient (DSC) agreement values of ELSI‐Net and the semi‐automatic template registration‐based method (MT) 9 on Healthy Aging Dataset (A) as well as ELSI‐Net's DSC agreement on the DELCODE set (B) with respect to the manual expert segmentations by our experts Rater 1 (R1) and Rater 2 (R2). They each rated all of HAD, for which we show the agreements to each rater individually (blue and orange hue) and the inter‐rater agreement (IRA, green hue). Each expert rated a complementary subset of the DELCODE study, so that we report the agreement to the respective available rater's mask. AD, Alzheimer's disease; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; HC, healthy control; MCI, mild cognitive impairment; SCD, subjective cognitive decline.

Exemplary coronal slices of subjects with higher LC contrast (upper row: healthy subject, predicted SSIM (our image quality metric) is 0.8288) and low LC contrast (lower row: AD dementia subject, predicted SSIM is 0.7596). On the right, the segmentations of ELSI‐Net (red) are compared to the expert rating (blue). Overlap between both is indicated in green. On the high LC contrast sample (upper row) a DSC of 74.87% was achieved, while on the low LC contrast sample 61.32% was measured. AD, Alzheimer's disease; DSC, Dice similarity coefficient; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; LC, locus coeruleus; SSIM, structural similarity.

A, ELSI‐Net (red) and manual ratings (green), 92.14% DSC. B, ELSI‐Net (red) and the meta mask from Dahl et al.20 (green), 64.33% DSC. Overlay of the ELSI‐Net template from DELCODE with a template generated from manual ratings (A) and a published LC atlas, meta mask, that combines information from several other previously published LC atlases (B) in MNI space (0.5 mm). On the left, a 3D rendering of the masks from an example coronal slice overlaid on the MNI template is shown. On the right, the respective axial (top), coronal (middle), and sagittal (bottom) 2D slices are visualized. Red color indicates the ELSI‐Net template mask, green the respective other template, and the overlapping volumes are colored in yellow. The corresponding slices are indicated by red (axial), green (coronal), and yellow (sagittal) lines on the right sides, respectively. Mask agreements are provided (DSC). Note that the agreement between the manual template (green in A) and the meta mask (green in B) is 60.09% DSC. DELCODE, DZNE Longitudinal Cognitive Impairment and Dementia study; DSC, Dice similarity coefficient; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; LC, locus coeruleus; MNI, Montreal Neurological Institute.

A, Maximum CRs. B, Maximum subregional CRs. Box and swarm plots of maximum LC CRs (A) and subregional maximum LC CRs (B) in young (blue) and older (orange) subject groups from Healthy Aging Dataset. For (A) the values of the expert raters R1 and R2 are reported as well as those of the fully automatic ELSI‐Net. In plot (B) only the ELSI‐Net results are shown. Significant differences are indicated by *p < 0.05 and **p < 0.01 using a two‐tailed t test. CR, contrast ratio; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; LC, locus coeruleus.

A, Median CRs. B, Maximum CRs. C, LC signal volume in mm³. D, LC signal length in mm. Box and swarm plots of selected features of the manual rating (left) and ELSI‐Net (right) on the healthy (blue), SCD (orange), MCI (green), and AD dementia (red) subject groups of the DELCODE set. Significant differences are indicated by *p < 0.05 and **p < 0.01 encoding the Tukey post hoc test result of the respective one‐way ANOVA (with p < 0.05). AD, Alzheimer's disease; CR, contrast ratio; DELCODE, DZNE Longitudinal Cognitive Impairment and Dementia study; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; HC, healthy control; LC, locus coeruleus; MCI, mild cognitive impairment; SCD, subjective cognitive decline.

A, TTau, p = 0.041. B, PTau181, p = 0.027. C, Aβ42/Aβ40, p = 0.018. D, Aβ42/PTau181, p = 0.002. Pearson r correlations (conditioned on variables age and sex) between LC signal volume obtained with ELSI‐Net and cerebrospinal fluid (CSF) measures of AD pathology. Scatterplots and value distributions are visualized. Correlation coefficient (r) and p value are reported. Aβ, amyloid beta; AD, Alzheimer's disease; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; LC, locus coeruleus; PTau, phosphorylated tau; TTau, total tau.

Tables (1)

TABLE 1. Pearson r correlation of the image quality measured as predicted structural similarity (SSIM) of DELCODE and the specified other values: The DSC agreement of the ELSI‐Net and expert's masks and the absolute differences of several extracted LC signal features.

Value name	r	p
DSC(ELSI‐Net, experts)	0.239***	9.546E‐4
|medianCR_ELSI‐Net − medianCR_experts|	−0.255***	4.098E‐4
|maximumCR_ELSI‐Net − maximumCR_experts|	−0.401***	1.132E‐8
|VOL_ELSI‐Net − VOL_experts|	−0.084	0.249
|LEN_ELSI‐Net − LEN_experts|	−0.008	0.908

Funding (7)

—Federal State of Saxony‐Anhalt, Germany
—Deutsche Forschungsgemeinschaft 10.13039/501100001659
—DFG, Sonderforschungsbereich (SFB)
—Bundesministerium für Bildung und Forschung 10.13039/501100002347
—EU Joint Programme—Neurodegenerative Disease Research (JPND)
—Alzheimer's Research UK (ARUK)
—National Institutes of Health 10.13039/100000002

Keywords

biomarker · deep learning · locus coeruleus · magnetic resonance imaging · segmentation

Taxonomy

Topics: Alzheimer's disease research and treatments · Dementia and Cognitive Impairment Research · Neurological Disease Mechanisms and Treatments

Full text

BACKGROUND

1

The locus coeruleus (LC) is a small brainstem nucleus with ≈ 50,000 pigmented neurons,1 but it projects to almost all major brain regions and is the primary source of noradrenaline in the brain. It has been identified as one of the earliest brain structures to be affected in Alzheimer's disease (AD)2 and has been linked to cognitive decline in healthy aging and progression of AD.3 Owing to the early tau aggregation in the LC, assessing the integrity of the LC using structural magnetic resonance imaging (MRI) may be a suitable tool for obtaining pathophysiological insights in vivo.4, 5 It presents an opportunity to gain insights into cognitive and behavioral symptoms instrumental for developing effective treatments,1 improve our understanding of AD pathogenesis, and facilitate the development of disease‐modifying noradrenergic drugs.6

Specific MRI techniques permit the in vivo visualization of the LC by exploiting among others the magnetic properties of its neuromelanin pigmented neurons, although the exact contrast mechanisms remain unclear. We refer the reader to Betts et al.4 and Trujillo et al.,7 where more nuanced discussions concerning the LC MRI signal can be found. The hyperintense regions appearing in the MRI acquisitions were shown to correspond to LC properties observed in post mortem studies, that is, with respect to anatomical position and dimensions and LC cell density8 and to correspond with age‐related increases in neuromelanin.9 Furthermore, associations between LC MRI contrast and AD biomarkers10 as well as cognitive decline in health and disease have also been observed5 suggesting these MRI techniques may be suitable for assessing LC integrity.

A reliable extraction of in vivo LC MRI biomarkers requires a robust segmentation approach. The small size and cylindrical shape of the LC (≈ 2 mm in diameter11) together with the comparatively coarse resolution of MRI acquisitions limit the reliability of LC integrity measures. Although reasonable compromises can be found,4 LC MRI acquisitions are characterized by low signal‐to‐noise ratios and ambiguous structural boundaries, posing challenges for segmentation approaches. This is evident from the particularly low inter‐rater agreements between manual raters of 0.499,12 0.54 to 0.64,13 and 0.67 Dice similarity coefficient (DSC)14 reported in the literature.

Initially, a broad majority of studies investigating the LC using specific MRI relied on manual segmentations carried out by expert raters.15 These require considerable amounts of manual labor and the time of trained experts. In recent years, (semi‐)automatic approaches that reduce the need for manual intervention have become more common than purely manual segmentation. Many studies have used atlas‐ or template‐based segmentation approaches.8, 12, 16, 17 Another type of algorithm uses the template registration as a first step to obtain a search space on which further operations, usually related to peak intensity extraction of certain rostrocaudal subparts or slices of the LC, are applied.18, 19 Although only a subset of the LC voxels is obtained, promising results using intensity‐based features have been shown with respect to reproducing known cohort effects, such as structural degeneration,19 associations to cognition, or both.20 These methods require manual corrections in some cases, as their performance relies on a successful, highly precise registration. Through the remaining manual steps, rater bias may still influence the segmentations of these methods. Automatic LC segmentation algorithms, such as the approach presented here, may facilitate using LC imaging in large‐scale studies by removing the need for multiple experts to invest time and manual labor on LC segmentation and introduce more objectivity by removing human bias. Our group was the first to develop a deep learning–based LC segmentation approach.21 Its convolutional neural networks process the MRI acquisitions inherently faster than registration‐based and manual methods and have shown higher objectivity by incorporating multiple experts’ knowledge into the training.14 Our pipeline comprises all steps from the LC segmentation to the reference region generation and the feature extraction. We do not manually correct the segmentations prior to our analyses and evaluations.

In the work presented here, we show an improved fully automatic LC segmentation pipeline that further increases the performance compared to our previous approach and assess its practical usability in various ways on subjects of aging and AD dementia.

METHODS

2

We apply an improved version of a recently proposed fully automatic LC analysis method14 to two different datasets comprising MRI acquisitions from healthy aging and AD dementia. The results are compared to manual expert segmentations and their features.

Datasets

2.1

The two datasets share the same acquisition protocol: They comprise T1‐weighted fast low angle shot (FLASH) 3 Tesla MRI scans (5.56 ms echo time, 20 ms repetition time, 23° flip angle, 130 Hz/pixel bandwidth, 7/8 partial Fourier, 13:50 minute scan time) with an isotropic resolution of 0.75 mm. The image data were upsampled using a sinc filter to achieve an isotropic voxel size of 0.375 mm and then bias field–corrected as previously described.9

Our Healthy Aging Dataset (HAD)9 comprised 82 healthy subjects. There were 25 younger (22–30 years old; 13 male) and 57 older subjects (61–80 years old, 19 male). We also analyzed the T1‐weighted FLASH MRI data from the DZNE Longitudinal Cognitive Impairment and Dementia study (DELCODE).22 This comprised 188 subjects: 68 healthy elderly adults, 22 relatives of individuals with AD, 61 subjects with subjective cognitive decline (SCD), 26 with mild cognitive impairment (MCI), and 11 with AD dementia. For our experiments, we combine the healthy elderly adults and relatives of AD subjects into one group of healthy controls.

The age of the subjects ranges from 60 to 87 years (69 on average, 102 females). The LC MRI data have been acquired at four different sites across Germany: Magdeburg, Rostock, Bonn, and Berlin. We refer the reader to Betts et al.9 (for HAD) and Jessen et al.22 for further details on the full DELCODE cohort.

RESEARCH IN CONTEXT

Systematic review: The authors reviewed the literature using traditional sources (e.g., PubMed, Google Scholar). Although there are several publications introducing semi‐automatic methods for locus coeruleus (LC) segmentation, the application of deep learning methods is underexplored. To the best of our knowledge, this is the first paper using a deep learning–based approach for automated LC segmentation in Alzheimer's disease (AD) dementia.
Interpretation: Our work introduces and evaluates an improved automatic, deep learning–based LC segmentation and analysis approach. The results suggest a very high potential for practical applicability, for example, in large‐scale clinical studies for neurodegenerative diseases.
Future directions: Ensemble‐based Locus Coeruleus Segmentation Network (ELSI‐Net) can be used to assess LC integrity on large‐ or small‐scale studies in AD dementia. To ensure robust performance, ELSI‐Net should be further evaluated in larger, more diverse datasets comprising varying LC magnetic resonance imaging protocols and clinical populations.

Segmentation methods

2.2

Manual expert segmentation

2.2.1

Both of our trained expert raters (M. Betts, referred to as Rater 1 [R1] and M. Sarkar, referred to as Rater 2 [R2]) manually segmented all LCs in the HAD. For the DELCODE dataset, R1 manually segmented 108 subjects and R2 segmented the remaining 80 subjects.

The raters delineated the LC using ITK‐SNAP23 as previously described.9 Briefly, the segmentation was performed on the axial slices starting at the most dorsal to ventral portion of the LC while limiting the rostrocaudal space for segmentation to slices between the inferior boundary of the interpeduncular fossa at the level of the inferior colliculus and the superior cerebellar peduncle (for reference see Betts et al.9).

Ensemble‐based Locus Coeruleus Segmentation Network

2.2.2

Our previously published approach14 for fully automatic LC segmentation was further improved and used in this work. Figure 1 shows a schematic overview of the method comprising two fundamental steps that are realized using two almost identical 3D U‐Net based24 convolutional neural networks: initial LC localization followed by segmentation on an extracted patch containing only the LC and its immediate vicinity. To train and evaluate our model, we ran a 3 × 5‐fold nested cross‐validation which splits the subjects of the HAD dataset just as in our previous work14 making the results directly comparable. A final training was conducted splitting the HAD in five equally sized subsets to obtain the five nets (each trained with one of these subsets as validation set and the combined rest as training set) that were used for the application to DELCODE. There was no fine‐tuning or retraining with DELCODE subjects and they have not been used for either the training or validation sets (for early stopping), so that it can be seen entirely as a test set.
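How the outputs of five such networks might be combined at application time (averaging for the localization outputs, majority vote for the segmentation outputs, as the pipeline description states) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and it assumes each network returns a same-shaped array (a probability-like map for localization, a binary mask for segmentation).

```python
import numpy as np


def ensemble_localization(outputs):
    """Average the localization outputs of all networks.

    Assumption: every entry is a same-shaped array (e.g., a probability map).
    """
    return np.mean(np.stack(outputs, axis=0), axis=0)


def ensemble_segmentation(masks):
    """Voxel-wise majority vote over the binary masks of all networks."""
    stacked = np.stack(masks, axis=0).astype(bool)
    votes = stacked.sum(axis=0)
    # A voxel is foreground if strictly more than half of the networks mark it.
    return votes > stacked.shape[0] / 2


# Toy example: five "networks" voting on three voxels.
toy_masks = [
    np.array([1, 1, 0]),
    np.array([1, 0, 0]),
    np.array([1, 1, 0]),
    np.array([0, 0, 0]),
    np.array([0, 1, 1]),
]
toy_result = ensemble_segmentation(toy_masks)
```

With an odd number of networks, as here, the vote cannot tie, so no tie-breaking rule is needed.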

Three changes were introduced to the method compared to our previous work.14 First, for the application of the models, we combined the five resulting networks in an ensemble and applied averaging and a majority vote to the different outputs to determine the final predictions of the localization and segmentation nets, respectively. This way, the final result can profit from the information obtainable from the entire training set despite the necessity for a validation set for each individual network. Second, we normalized the intensities of the extracted image patch once again prior to passing it to the segmentation network, aiming to reduce the variance of the intensity range. Third, we replaced the reference region generation, which previously relied on a sufficiently accurate pons segmentation. Instead, we determine an LC‐oriented orthonormal vector base forming a coordinate system and we calculate the average offset of the semi‐automatically generated reference regions on the training set in relation to the respective LCs. They are located in the pontine tegmentum, one per hemisphere. In the application case, we determine the same coordinate system and place the reference region according to the learned offset. This approach does not require a reliable pons segmentation or time‐consuming registration procedures and is potentially more robust to head rotations incurred during acquisitions. The vectors for this LC‐oriented coordinate system are determined as follows.

[eqn]

[eqn]

[eqn]

with v⃗1 the rostrocaudal LC direction derived from the two principal components obtained from two principal component analyses on the mask voxel coordinates of the LC masks of the left (l⃗1) and right hemisphere (r⃗1), the second base vector v⃗2, the center of mass of the left (c⃗l) and right LC (c⃗r), and the third base vector v⃗3.
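Because the equations are not reproduced in this text, the following is only one plausible construction of such an LC‐oriented orthonormal basis, written as a hedged sketch. The assumptions are: v1 is the normalized sum of the sign‐aligned first principal axes of the left and right LC masks, v2 is the part of the vector between the two centers of mass that is orthogonal to v1, and v3 is their cross product. All function names are hypothetical, and the authors' actual definitions may differ.

```python
import numpy as np


def first_principal_axis(coords):
    """First principal component of an (N, 3) array of voxel coordinates."""
    centered = coords - coords.mean(axis=0)
    # The right singular vector of the largest singular value is the first PC.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]


def lc_coordinate_system(left_coords, right_coords):
    """Return a (3, 3) array whose rows are v1, v2, v3 (assumed construction)."""
    l1 = first_principal_axis(left_coords)
    r1 = first_principal_axis(right_coords)
    if np.dot(l1, r1) < 0:  # PCA signs are arbitrary; align them first
        r1 = -r1
    v1 = l1 + r1
    v1 = v1 / np.linalg.norm(v1)  # overall sign of v1 remains arbitrary

    c_l = left_coords.mean(axis=0)
    c_r = right_coords.mean(axis=0)
    d = c_r - c_l
    v2 = d - np.dot(d, v1) * v1  # remove the component along v1
    v2 = v2 / np.linalg.norm(v2)

    v3 = np.cross(v1, v2)
    return np.stack([v1, v2, v3])


def place_reference(lc_center, learned_offset, basis):
    """Map an offset expressed in the LC basis back to image coordinates."""
    return lc_center + learned_offset @ basis
```

Under these assumptions, `learned_offset` would be the training‐set average of the reference region positions expressed in each subject's own basis, which is what makes the placement independent of head rotation.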

Feature extraction

2.3

The rostral LC may be particularly vulnerable in AD.10 Hence, we extracted not only the entire LC MRI contrast ratios (CRs), but also subregional CRs and the LC mask's volume and length. Note that volume and length are estimated from the LC segmentation based solely on the in vivo MRI. Throughout this work, all reported features are bilateral, that is, they are the average of both LC hemispheres’ features.

The most frequently used LC features in the literature15, 25 are MRI intensity ratios that calculate the ratio of the maximum or median intensity value of the voxels in the LC mask (LCmax or LCmedian) to the median value of a reference region (REFmedian), positioned in the pontine tegmentum. For example, the maximum CR is defined as follows:

[eqn]

We furthermore calculate the CRs of subregions of the LC by splitting it along its axial dimension into two and three sections of equal length.

The LC signal length was measured as the number of axial slices its mask was present in, converted to millimeters.

The volume of the LC signal was determined as the number of voxels in the mask and converted into cubic millimeters.
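The features described in this section can be sketched for a single hemisphere as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the CR equation is not reproduced in this text, so the plain ratio LC statistic / REFmedian is taken from the wording above; all function names, the `axial_axis` parameter, and the toy geometry are hypothetical. Masks are expected to be boolean arrays.

```python
import numpy as np


def contrast_ratio(image, lc_mask, ref_mask, stat="max"):
    """Max or median LC intensity divided by the median reference intensity
    (assumed form of the CR, following the textual description)."""
    lc_values = image[lc_mask]
    lc_stat = lc_values.max() if stat == "max" else np.median(lc_values)
    return float(lc_stat / np.median(image[ref_mask]))


def occupied_slices(lc_mask, axial_axis=2):
    """Indices of the axial slices that contain at least one mask voxel."""
    other_axes = tuple(a for a in range(lc_mask.ndim) if a != axial_axis)
    return np.flatnonzero(lc_mask.any(axis=other_axes))


def signal_length_mm(lc_mask, slice_thickness_mm, axial_axis=2):
    """Number of occupied axial slices, converted to millimeters."""
    return len(occupied_slices(lc_mask, axial_axis)) * slice_thickness_mm


def signal_volume_mm3(lc_mask, voxel_size_mm):
    """Number of mask voxels, converted to cubic millimeters."""
    return np.count_nonzero(lc_mask) * float(np.prod(voxel_size_mm))


def subregional_cr(image, lc_mask, ref_mask, n_sections, stat="max", axial_axis=2):
    """CR per section after splitting the occupied slices into n_sections.

    Which end is rostral depends on the image orientation (not handled here).
    """
    ratios = []
    for section in np.array_split(occupied_slices(lc_mask, axial_axis), n_sections):
        keep = np.zeros(lc_mask.shape[axial_axis], dtype=bool)
        keep[section] = True
        shape = [1] * lc_mask.ndim
        shape[axial_axis] = -1
        sub_mask = lc_mask & keep.reshape(shape)
        ratios.append(contrast_ratio(image, sub_mask, ref_mask, stat))
    return ratios
```

A bilateral feature, as reported throughout this work, would then be the average of the values obtained for the left and right LC masks.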

Experiments

2.4

We carried out the following experiments to assess the performance of Ensemble‐based Locus Coeruleus Segmentation Network (ELSI‐Net) in different ways. We compared its segmentations to manual expert ratings and published LC atlases, replicated subject group differences described in the literature, explored correlations of the automatically obtained LC MRI features to cerebrospinal fluid (CSF) biomarkers of AD pathology, and assessed the influence of acquisition‐related factors. For the statistical analyses, we used JASP26 as well as the SciPy27 library.

Mask similarity

2.4.1

ELSI‐Net's performance is evaluated by determining the agreement of its results with manual subject‐wise expert segmentations. We measure the similarity of the masks with the commonly used DSC, given by $\mathrm{DSC}(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|}$, with X and Y being the sets of voxels belonging to the respective masks to be compared.
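For binary masks stored as arrays, the DSC can be computed directly from voxel counts. The sketch below is illustrative only; the function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
import numpy as np


def dice(x, y):
    """Dice similarity coefficient of two binary masks of equal shape."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    denominator = x.sum() + y.sum()
    if denominator == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(x, y).sum() / denominator
```

For example, two masks of two voxels each that share one voxel yield 2 · 1 / (2 + 2) = 0.5, that is, 50% DSC.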

Furthermore, we compare the DSC agreement of the fully automated method, ELSI‐Net, to a previously published semi‐automatic approach that involves manually segmenting the LC on a study‐wise template image and transforming the resulting mask into the individual subject spaces. We abbreviate this method as MT and refer the reader to Betts et al.9 for further details.

Anatomical agreement

2.4.2

A template‐based comparison of ELSI‐Net's LC masks to previously published atlases and the masks of our expert raters allows us to assess ELSI‐Net's results with respect to their anatomical plausibility in terms of anatomical position and extent. To this end, we coregistered and morphed the upsampled FLASH scans of all healthy and MCI subjects together with their manual (R1) and ELSI‐Net LC segmentations to FSL standard 0.5 mm asymmetric Montreal Neurological Institute (MNI) space28 using Advanced Normalization Tools (ANTs)29 SyN registration with B‐spline interpolation. We then calculated a probabilistic mask for the ELSI‐Net and manual segmentations and binarized both using a 50% threshold. In total, eight subjects were excluded due to a failure of the registration process. We rendered the ELSI‐Net LC template alongside the template obtained from the manual ratings as well as a recently published LC atlas by Dahl et al. (so‐called meta mask20) that combines the information of several established LC atlases8, 9, 16, 18, 30, 31 and was brought into the same asymmetric MNI space as the other templates using SyN registration. We performed these steps analogously to Dahl et al.20 The visualization was carried out using 3D Slicer.32 Additionally, we determine the DSC agreement between the resulting template masks and calculate the agreement with the meta mask by Dahl et al. using the accuracy metric (mean of specificity and sensitivity) as previously described by the authors.20 The resulting agreement is compared to a number of established and publicly available LC atlases.
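Once the subject masks are in a common space, the template generation and the accuracy metric reduce to a few array operations. The sketch below assumes the registration has already been done elsewhere; the function names are hypothetical, the threshold is applied inclusively (≥ 50%), and sensitivity and specificity are computed over the whole array, all of which are assumptions rather than details confirmed by the text.

```python
import numpy as np


def binarized_template(registered_masks, threshold=0.5):
    """Voxel-wise mean of registered binary masks, thresholded at 50%."""
    probabilistic = np.mean(np.stack(registered_masks, axis=0).astype(float), axis=0)
    return probabilistic >= threshold  # inclusive threshold is an assumption


def accuracy(mask, reference):
    """Mean of sensitivity and specificity of `mask` with respect to `reference`."""
    mask = np.asarray(mask, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = np.sum(mask & reference)
    fn = np.sum(~mask & reference)
    tn = np.sum(~mask & ~reference)
    fp = np.sum(mask & ~reference)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)
```

Note that specificity depends strongly on how much background is included in the evaluated volume, so values are only comparable when the same volume is used for all templates.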

Feature‐based comparison of subject groups

2.4.3

We conducted significance tests, such as t tests (with preceding Levene tests for equality of variances) and one‐way analyses of variance (ANOVA), and computed Cohen d as an effect size measure.
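A two‐group comparison of this kind can be sketched with SciPy as follows. This is a hedged illustration rather than the authors' analysis script: the function names are hypothetical, Cohen d is computed with the pooled standard deviation (an assumption), and falling back to Welch's t test when Levene's test rejects equal variances is one common choice that the text does not specify.

```python
import numpy as np
from scipy import stats


def cohen_d(a, b):
    """Cohen d using the pooled standard deviation of both groups."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


def compare_groups(a, b, alpha=0.05):
    """Levene test followed by a two-tailed independent-samples t test."""
    # Note: scipy.stats.levene centers on the median by default.
    levene_stat, levene_p = stats.levene(a, b)
    equal_var = bool(levene_p >= alpha)
    t_stat, t_p = stats.ttest_ind(a, b, equal_var=equal_var)
    return {
        "levene_stat": levene_stat,
        "levene_p": levene_p,
        "equal_var": equal_var,
        "t": t_stat,
        "p": t_p,
        "cohen_d": cohen_d(a, b),
    }
```

For more than two groups, `scipy.stats.f_oneway` provides the corresponding one‐way ANOVA.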

Relationship between LC signal volume and CSF measures of AD pathology

2.4.4

Measurements of amyloid beta (Aβ42/Aβ40) and tau proteins (total tau [t‐tau], phosphorylated tau [p‐tau]181) in the CSF are established biomarkers of AD pathology. CSF measures were obtained from 85 of the 188 DELCODE subjects (AD dementia: 7, MCI: 21, SCD: 22, healthy controls: 35). We correlated the automatically derived LC features to CSF biomarkers of AD pathology using Pearson r correlations (conditioned on variables age and sex) performed in JASP.26
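A Pearson correlation conditioned on covariates is a partial correlation, which can be reproduced outside of JASP by correlating regression residuals. The sketch below is an assumption about how such an analysis could be scripted, not the procedure used in the paper; the function name is hypothetical, and the p value uses a t distribution with n − 2 − k degrees of freedom for k covariates.

```python
import numpy as np
from scipy import stats


def partial_pearson(x, y, covariates):
    """Pearson r between x and y after regressing out the covariates.

    covariates: (n, k) array, e.g., columns for age and sex.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    covariates = np.asarray(covariates, dtype=float).reshape(len(x), -1)
    design = np.column_stack([np.ones(len(x)), covariates])

    # Residuals of x and y after a least-squares fit on the covariates.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(np.corrcoef(res_x, res_y)[0, 1])

    dof = len(x) - 2 - covariates.shape[1]
    t_stat = r * np.sqrt(dof / max(1.0 - r ** 2, np.finfo(float).eps))
    p = float(2.0 * stats.t.sf(abs(t_stat), dof))
    return r, p
```

A call such as `partial_pearson(lc_volume, csf_ptau181, np.column_stack([age, sex]))` would then return the conditioned r and its two‐tailed p value; the variable names in this call are placeholders.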

The influence of image quality and acquisition site

2.4.5

We investigate the potential impact of motion artefacts on ELSI‐Net. To objectively assess the MRI image quality, we make use of the convolutional neural network based approach to motion artefact quantification as recently proposed.33 This network was trained to estimate the structural similarity (SSIM) of a single corrupted image slice to its (non‐existent) ideal, uncorrupted version. The resulting predicted/estimated SSIM thus quantifies the amount of corruption by motion artefacts, 1.0 encoding perfect image quality and 0.0 the worst. We applied the network to all acquisitions from both datasets by slice‐wise processing. We chose the minimum predicted SSIM out of all slices of an acquisition as its image quality score.
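The aggregation of slice‐wise predictions into a single score per acquisition can be sketched as follows. The trained quality network itself is not part of this sketch: `predict_slice_ssim` is a hypothetical stand‐in for it, and the slicing axis is an assumption.

```python
import numpy as np


def image_quality_score(volume, predict_slice_ssim, slice_axis=2):
    """Minimum predicted SSIM over all slices of one acquisition.

    predict_slice_ssim: callable mapping a 2D slice to a value in [0, 1]
    (hypothetical placeholder for the trained network).
    """
    scores = [
        predict_slice_ssim(np.take(volume, index, axis=slice_axis))
        for index in range(volume.shape[slice_axis])
    ]
    return min(scores)
```

Taking the minimum means that a single heavily corrupted slice determines the score of the whole acquisition, which makes the measure conservative.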

Apart from reporting the overall image quality of the used datasets, we investigate the relation of image quality to the DSC agreement between ELSI‐Net and the expert's masks as well as to the disagreement in terms of resulting features (measured as the absolute difference between extracted features such as median and maximum CRs and LC signal volume of the two segmentation approaches). Finally, we assess the influence of the acquisition site on the agreement between ELSI‐Net and the expert rating as well as on the LC CR features per se.

RESULTS

3

ELSI‐Net was applied to all subjects in the previously described fashion and produced valid output, that is, segmentation masks and the specified LC features for all subjects in both datasets.

Mask similarity

3.1

HAD

3.1.1

The HAD was rated by both expert raters with an inter‐rater agreement of 67.58% ± 8.90% (mean ± standard deviation [SD]) DSC when averaging the left and right LC. The left plot in Figure 2 shows that both our automatic method (73.19% ± 7.75%) and the semi‐automatic template registration–based approach (MT;9 74.00% ± 15.56%) perform comparably to the inter‐rater agreement in terms of mask similarity when using R1's segmentations as the reference. However, the values derived using ELSI‐Net are subject to a substantially lower SD (almost half that of the semi‐automatic method). ELSI‐Net does not show a difference in its agreement to either expert rater, while with MT a strong decline of mask agreement with respect to R2's segmentations compared to those of R1 is apparent.

DELCODE dataset

3.1.2

Across almost all subject groups from DELCODE, ELSI‐Net shows relatively high agreement with a manual expert rating (see Figure 2B). The mean DSC consistently exceeds 70% and the SDs are in a range of 6.4% to 10.2%, which is comparable to the inter‐rater agreement measured on HAD. The comparatively small group of AD dementia subjects, however, constitutes the exception with lower agreement and a larger SD in the DSC values, although its median of 67.86% is in the range of the inter‐rater agreement. Figure 3 provides a qualitative visualization of two subjects, one from the healthy group and one from the AD dementia group.

Anatomical agreement

3.2

Figure 4 depicts a template generated from the ELSI‐Net results from DELCODE overlaid with a template from the manual ratings as well as a previously published LC atlas (meta mask) from Dahl et al.20 ELSI‐Net's agreement to the manual rating's template is very high (92.14% DSC, see Figure 4A). It confirms a good overall agreement to the manual ratings of both raters that was already suggested by the subject‐level mask agreement evaluation. Only a very slight discrepancy is visible: the ELSI‐Net template appears slightly shifted toward the rostral direction compared to the manual rating's template. There are, however, more deviations observable when comparing it to the meta mask (see Figure 4B), which comprises information of multiple other published atlases derived from different MRI acquisitions and modalities. The meta mask is located more rostral than the ELSI‐Net template and therefore the template of the manual ratings as well. ELSI‐Net's agreement with respect to the meta mask (64.33% DSC) exceeds the agreement of the manual rating's template (60.09% DSC) with the meta mask by > 4% DSC.

Other established LC atlases show agreements ranging from 63% to 47% to the meta mask as measured by the accuracy metric specified by the authors20 (mean of sensitivity and specificity). We calculated the same metric and found ELSI‐Net's template to achieve 66.62% using this measure, which demonstrates a comparatively high agreement with the meta mask. It also exceeds the manual rating's template score, which amounts to 63.86%.
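The two agreement measures used above can be illustrated with a minimal sketch. It assumes binary masks flattened to equally long lists of 0/1 voxels; the function names and the toy masks are hypothetical and not taken from ELSI‐Net. Note that specificity depends on how much background is included in the evaluated volume, which is not specified here, so absolute values from this sketch are not comparable to the reported ones.

```python
def confusion_counts(pred, ref):
    """Count TP, FP, FN, TN for two equally long flat binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for p, r in zip(pred, ref) if p and r)
    fp = sum(1 for p, r in zip(pred, ref) if p and not r)
    fn = sum(1 for p, r in zip(pred, ref) if not p and r)
    tn = len(pred) - tp - fp - fn
    return tp, fp, fn, tn


def dice(pred, ref):
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, ref)
    denom = 2 * tp + fp + fn
    # Two empty masks are treated as perfectly agreeing (a convention).
    return 2 * tp / denom if denom else 1.0


def mean_sens_spec(pred, ref):
    """Mean of sensitivity and specificity of pred with respect to ref."""
    tp, fp, fn, tn = confusion_counts(pred, ref)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Toy 8-voxel masks (hypothetical data, not from the study).
template = [0, 1, 1, 1, 0, 0, 0, 0]
meta_mask = [0, 0, 1, 1, 1, 0, 0, 0]
print(dice(template, meta_mask), mean_sens_spec(template, meta_mask))
```

For real 3D masks, the same counts would be taken over all voxels of two volumes resampled to a common space (here MNI, 0.5 mm).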

Feature‐based comparison of subject groups

3.3

We obtained the previously described LC features (see section 2.3) from HAD and DELCODE using ELSI‐Net and report the resulting distributions here.

HAD

3.3.1

We identified significant differences in LC features between young and older subject groups. The maximum CR and particularly the rostral maximum subregional CR halves and thirds show age‐related increases in LC intensity on HAD. Figure 5 visualizes the resulting distributions for these features and groups. We used Cohen d to estimate the effect size of the difference between young and older subjects: for the maximum CR, the ratings of R1 and R2 and ELSI‐Net yielded 0.489, 0.596, and 0.662, respectively. Furthermore, all of these differences in the maximum CR were statistically significant (e.g., ELSI‐Net maximum CR: Student t test [2.759, P=0.007], Levene test for equality of variances [1.857, P=0.177]).

A, Maximum CRs. B, Maximum subregional CRs. Box and swarm plots of maximum LC CRs (A) and subregional maximum LC CRs (B) in young (blue) and older (orange) subject groups from Healthy Aging Dataset. For (A) the values of the expert raters R1 and R2 are reported as well as those of the fully automatic ELSI‐Net. In plot (B) only the ELSI‐Net results are shown. Significant differences are indicated by *p < 0.05 and **p < 0.01 using a two‐tailed t test. CR, contrast ratio; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; LC, locus coeruleus.

When inspecting the subregional LC CRs, it becomes evident that the age‐related effect appears to be stronger in the medial and rostral LC parts. We find an increasing effect size measured by Cohen d from 0.374 (caudal) to 0.489 (medial) and 0.705 (rostral) for the maximum subregional CRs (splitting LC in thirds of equal length) determined by ELSI‐Net.
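The effect sizes and the subregional split described above can be sketched as follows. This is an illustration under stated assumptions: Cohen d is computed with the pooled sample standard deviation (the exact variant used in the study is not stated here), and the thirds are formed by splitting a list of per-slice CRs, ordered caudal to rostral, into three index ranges. The function names and all numbers are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d (group_a minus group_b) using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    return diff / math.sqrt(pooled_var)


def max_cr_by_third(slice_crs):
    """Maximum CR per third for per-slice CRs ordered caudal to rostral."""
    n = len(slice_crs)
    if n < 3:
        raise ValueError("need at least three slices to form thirds")
    # Integer bounds give three non-empty, contiguous index ranges.
    bounds = [0, n // 3, (2 * n) // 3, n]
    names = ["caudal", "middle", "rostral"]
    return {
        name: max(slice_crs[lo:hi])
        for name, lo, hi in zip(names, bounds[:-1], bounds[1:])
    }


# Hypothetical maximum CRs for two small groups (not study data).
older = [0.20, 0.22, 0.24]
young = [0.18, 0.20, 0.22]
print(cohens_d(older, young))

# Hypothetical per-slice CRs of one subject, caudal to rostral.
print(max_cr_by_third([0.10, 0.12, 0.15, 0.18, 0.16, 0.14]))
```

A positive d here means the first group (older) has the higher mean, which matches the direction of the age‐related increase reported above.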

DELCODE set

3.3.2

Figure 6 visualizes the subject group distributions of four LC features obtained with ELSI‐Net and compares them to the manual rating's distributions.

A, Median CRs. B, Maximum CRs. C, LC signal volume in mm3. D, LC signal length in mm. Box and swarm plots of selected features of the manual rating (left) and ELSI‐Net (right) on the healthy (blue), SCD (orange), MCI (green), and AD dementia (red) subject groups of the DELCODE set. Significant differences are indicated by *p < 0.05 and **p < 0.01 encoding the Tukey post hoc test result of the respective one‐way ANOVA (with p < 0.05). AD, Alzheimer's disease; CR, contrast ratio; DELCODE, DZNE Longitudinal Cognitive Impairment and Dementia study; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; HC, healthy control; LC, locus coeruleus; MCI, mild cognitive impairment; SCD, subjective cognitive decline.

A trend of decreasing LC signal volume and length as determined by ELSI‐Net appears with increasing clinical severity of the subject groups, which is not present in the experts’ LC signal volume and length measurements. The visible decrease in signal volume in the AD dementia group (see Figure 6C) resembles the trend measured by the median CR with the expert ratings. A one‐way ANOVA confirmed the statistical significance of this decrease (healthy controls vs. AD dementia: F=5.258, P=0.002 and post hoc PTukey=0.006). We found Cohen d for the difference in signal volume between the healthy group and the AD dementia subjects to be 1.089 with ELSI‐Net, which is comparable to the effect size found with the manual rating for the median CR group difference (Cohen d: 0.993, one‐way ANOVA: F=3.575, P=0.015 and post hoc PTukey=0.016). Similar to LC signal volume, the LC signal length feature measured by ELSI‐Net was also significantly decreased in AD dementia (one‐way ANOVA: F=4.084, P=0.008 and post hoc PTukey=0.009, Cohen d: 1.036; see Figure 6D).
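The F statistic of a one‐way ANOVA, as reported for the group comparisons above, is the ratio of between‐group to within‐group mean squares. The following standard‐library sketch shows that computation only; the function name and the group values are hypothetical. P values and the Tukey post hoc test need the F and studentized range distributions, for which a statistics library (for example `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`) would typically be used. Which software was used in the study is not stated here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and n_total > k")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three groups (not study data).
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(example_groups))
```

With four diagnostic groups, as in DELCODE, the list would simply contain four sublists, one per group.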

The choice of the reference region generation approach did not affect the group comparisons of the CRs obtained with ELSI‐Net: significant group differences were identified with neither the semi‐automatically nor the automatically generated reference regions.

Relationship between LC signal volume and CSF measures of AD pathology

3.4

Motivated by the observed decreases in LC signal volume in MCI and AD dementia subjects measured by ELSI‐Net, we correlated this feature with all available CSF measures of AD pathology. Several significant correlations were found and are reported in Figure 7. Although the correlations are weak, they show that decreased LC signal volume is associated with higher tau and amyloid pathology. We found similar correlations between LC signal length and amyloid pathology (LC signal length and Aβ42/Aβ40: r=0.217, P=0.049; LC signal length and Aβ42/PTau181: r=0.296, P=0.007), but no correlation with LC CR features.

A, TTau, p = 0.041. B, PTau181, p = 0.027. C, Aβ42/Aβ40, p = 0.018. D, Aβ42/PTau181, p = 0.002. Pearson r correlations (conditioned on variables age and sex) between LC signal volume obtained with ELSI‐Net and cerebrospinal fluid (CSF) measures of AD pathology. Scatterplots and value distributions are visualized. Correlation coefficient (r) and p value are reported. Aβ, amyloid beta; AD, Alzheimer's disease; ELSI‐Net, Ensemble‐based Locus Coeruleus Segmentation Network; LC, locus coeruleus; PTau, phosphorylated tau; TTau, total tau.
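A Pearson correlation conditioned on covariates, as named in the caption above, can be read as a partial correlation. The sketch below uses the textbook recursive formula for partial correlation; whether the study used this route or regression residuals is not stated, so this is an assumption. Both give the same coefficient for linear adjustment. Function names and all data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, covariates):
    """Correlation of x and y after linearly controlling for covariates.

    Uses the recursion
    r_xy.zW = (r_xy.W - r_xz.W * r_yz.W) / sqrt((1 - r_xz.W^2)(1 - r_yz.W^2)).
    """
    if not covariates:
        return pearson_r(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values (not study data): LC signal volume, a CSF marker,
# age in years, and sex coded as 0/1.
volume = [12.1, 10.4, 9.8, 11.5, 8.9, 10.9, 9.2, 11.8]
marker = [48.0, 61.0, 70.0, 52.0, 83.0, 58.0, 77.0, 50.0]
age = [66, 71, 74, 68, 79, 70, 76, 67]
sex = [0, 1, 0, 1, 1, 0, 0, 1]
print(partial_r(volume, marker, [age, sex]))
```

The recursion is convenient for one or two covariates; with many covariates, a regression‐based approach is usually preferable for numerical stability.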

The influence of image quality and acquisition site

3.5

Using a recently proposed method for the quantification of motion artefacts,33 we conducted the previously described experiments to estimate the influence of acquisition artefacts on the LC metrics.

Image quality of the datasets

3.5.1

When comparing the image quality of the two datasets, it becomes apparent that the DELCODE dataset (SSIM mean: 0.794, SD: 0.058) shows lower quality than HAD (SSIM mean: 0.827, SD: 0.035; Welch t test [–5.874, P=1.394E−8], Levene test for equality of variances [11.505, P=7.987E−4]).
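Because the Levene test indicates unequal variances between the two datasets, the Welch variant of the t test is the appropriate comparison. A minimal sketch of the Welch statistic and its Welch–Satterthwaite degrees of freedom follows; the function name and sample values are hypothetical, and the P value is omitted because the t distribution is not available in the Python standard library.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) of Welch's unequal-variance t test for a minus b."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(
        se2_a + se2_b
    )
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical samples (not study data); a negative t means the first
# sample has the lower mean.
first = [1, 2, 3]
second = [2, 4, 6, 8]
print(welch_t(first, second))
```

A negative statistic, as in the comparison reported above, corresponds to the first dataset (DELCODE) having the lower mean SSIM.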

A one‐way ANOVA showed no significant differences between DELCODE subject groups (1.629, P=0.184). Nonetheless, image quality tends to decrease slightly with increasing clinical severity in our particular dataset. The SSIM means of the healthy control, SCD, MCI, and AD dementia groups are 0.795, 0.802, 0.784, and 0.764, respectively. This motivates further investigation of a potential influence of image quality on subject group differences.

Correlation of image quality with segmentation performance and feature deviation

3.5.2

We computed several image quality–related Pearson r correlations (see Table 1). One of them is between ELSI‐Net's mask similarity to the manual expert rating on DELCODE quantified in terms of DSC and the measured image quality (predicted SSIM) of the samples. Although the correlation is significant it does not appear strong (r=0.239, P<0.001). However, a modest correlation between image quality and absolute differences in maximum CR between the two segmentation approaches was observed (r=−0.401, P<0.001). It indicates a correlation between image quality and agreement between the CRs of ELSI‐Net and manual ratings so that with increasing image quality, there is greater agreement between the two segmentation approaches. This correlation is much weaker for median LC CRs (r=−0.255, P<0.001) and was not observed for LC signal volume feature extraction indicating LC signal volume may be influenced less by image quality.

Assessment of acquisition site effects

3.5.3

Several one‐way ANOVAs were carried out on DELCODE to gain insights into the potential influence of the acquisition sites. We found significant image quality (measured by predicted SSIM) differences between sites (one‐way ANOVA: 13.274, P=4.121E−6; one of the four sites was excluded since it contributed only one of the subjects).

By means of further one‐way ANOVAs, we found no significant site effects on mask agreement between ELSI‐Net and the expert rating (measured by DSC) as well as agreement on median and maximum CRs. This indicates that the agreement between ELSI‐Net and manual expert ratings is not affected by the acquisition site. However, site‐related differences could be identified in all of the CR and subregional CR features themselves (e.g., median CR, one‐way ANOVA 26.809, P=6.057E−11). In contrast, ELSI‐Net estimates of LC signal volume and length are not subject to site effects.

DISCUSSION

4

Here, we report an improved fully automatic segmentation and feature extraction method for in vivo assessment of the LC. This method comprises an ensemble approach to applying the neural networks for LC localization and segmentation. After the initial localization, an additional intensity normalization step using a local patch surrounding the LC is applied. We propose a method for the generation of a reference region removing the need for an additional segmentation of the pons region. This approach is validated on datasets with a consistent FLASH LC MRI protocol comprising healthy young and older adults as well as in adults on the AD dementia spectrum to assess its correspondence with previously published LC atlases and clinical observations.

The proposed changes, most notably the addition of the ensemble‐based inferences, led to performance improvements in terms of DSC agreement compared to our previous work,14 in which the same nested cross‐validation evaluation scheme, data splits, and dataset (HAD) were used.

In healthy aging, ELSI‐Net was able to segment the LC with very high accuracy, performing as well as or better than an expert manual rater. In contrast to a previously published semi‐automatic approach,9 ELSI‐Net exceeds the inter‐rater agreement in terms of DSC with respect to both raters (compare ‘ELSI’ and ‘MT’ in Figure 2A). It is therefore arguably more objective than both the semi‐automatic segmentation method and a single expert rater. A higher Cohen d value with respect to age‐related differences between young and older adults shows that ELSI‐Net could detect these differences reliably and with increased sensitivity compared to a manual segmentation approach. ELSI‐Net was also able to replicate previously observed age‐related increases in rostral and middle LC contrast using the same dataset.9 ELSI‐Net may potentially be deployed across sites to further reduce rater bias of LC analyses and increase comparability of studies using similar subjects and coherent FLASH MRI protocols, for example, to determine normative feature ranges of a healthy LC given a specific age. Indeed, acquisition site–related influences on the LC features were found independently of the segmentation approach and have to be considered.

In a clinical cohort of individuals with AD dementia from DELCODE, ELSI‐Net could effectively segment the LC without any fine‐tuning, solely being trained on the healthy aging dataset. For the most part (including the MCI subject group) a satisfactory agreement measured by DSC compared to the experts’ rating was achieved. The AD dementia group was a noticeable exception, although it was the smallest group with only 11 participants and further analyses are required in larger cohorts of individuals with AD dementia to comprehensively determine its segmentation accuracy.

The overall anatomical plausibility of the automatically obtained LC masks and a performance comparable to that of experts are indicated by the very high agreement between the LC template of the ELSI‐Net DELCODE masks and the template generated from the manual ratings, and also by the agreement with a meta mask comprising a number of previously published atlases.20 ELSI‐Net's template shows higher agreement with the meta mask than any of the published atlases that were used for the meta mask creation. This indicates that ELSI‐Net can generate anatomically precise LC segmentations in the context of the described FLASH MRI protocol, removing the need for semi‐automatic or manual segmentation. With ELSI‐Net we found significant differences between healthy controls and subjects with AD dementia with respect to LC signal volume and length but not median LC contrast, for which a difference was observed with the expert ratings. This could indicate a deviation of segmentation style between ELSI‐Net and manual LC segmentation. It is conceivable that ELSI‐Net is influenced differently by the reduction in LC MRI contrast present in AD dementia subjects than expert raters, who rely more on anatomical prior knowledge. ELSI‐Net was only trained on LC segmentations from healthy young and older adults. Because it has never seen the variance introduced by AD dementia during training, the lower LC contrast or additional data characteristics unknown to us may yield smaller LCs in ELSI‐Net's assessment in AD. This would explain the lower DSC agreement between ELSI‐Net and manual raters observed in the AD dementia group as well as the deviations seen in LC CRs between ELSI‐Net and manual rating on DELCODE. Therefore, LC signal volume and length estimates using ELSI‐Net might be more accurate in clinical cohorts due to reduced human bias that manual raters may be prone to.
The examples shown in Figure 3 indicate the shape and size variance in ELSI‐Net's segmentation, which appears to be more directly influenced by local image properties. Hence we presume ELSI‐Net is unlikely to merely produce a learned average LC mask. It should be noted that ELSI‐Net may not only rely on intensity, but on additional characteristics such as the surrounding anatomy and shape of the LC during the segmentation procedure.

We further investigated the influence of image quality on our (automatic) LC analysis in multiple ways. We found significant correlations between image quality and agreement on CR‐based features (as quantified by absolute difference between ELSI‐Net and the experts’ rating in particular for maximum intensity CRs). However, no such correlation was observed with respect to ELSI‐Net's LC signal volume measure, where we observed most pronounced differences between healthy controls and AD dementia.

We found significant acquisition site–related effects on the image quality and CR features in general. No influence of site was found on the agreement between ELSI‐Net and the experts’ rating (for both masks and CR features) or on ELSI‐Net's measurements of LC signal volume and length, indicating robust performance across multiple sites.

As an additional validation step, we further assessed how LC segmentations generated by ELSI‐Net are related to previously reported associations with AD pathology. In a subset of subjects with known CSF status from DELCODE, we found reduced LC signal volume was significantly associated with increased tau and amyloid pathology in agreement with previous findings.10, 34, 35, 36

Limitations

4.1

An important caveat in our analyses is the rather small number of AD dementia subjects (n=11) in our clinical cohort of 188 participants. Further evaluations on more datasets with larger groups of AD dementia subjects, ideally with amyloid and tau pathology biomarkers, are required to ascertain the performance of ELSI‐Net more conclusively in this population. Another interesting aspect that was left unexplored is the robustness of ELSI‐Net with respect to acquisition parameters and differing MRI protocols. To this end, more extensive evaluation on a variety of datasets comprising diverse anisotropic resolutions and differing LC MRI sequences, such as turbo spin echo and magnetization transfer–based acquisitions, needs to be carried out, and ELSI‐Net's performance on slab acquisitions needs to be investigated. We aim to conduct these experiments, publish them, and release ELSI‐Net as an easy‐to‐use Docker container in the near future. Of course, a necessary requirement for the application of ELSI‐Net is the acquisition of specific sequences with LC contrast in general. Recently, automatic methods were applied to LC segmentation in acquisitions without LC contrast as an alternative to atlas‐based approaches.37 While inherently lacking precision and the possibility to extract LC contrast, signal volume, or length features, they may allow functional MRI or diffusion MRI analyses in datasets in which LC MRI was not acquired.

Finally, the image quality assessed here was based on the quality of the whole‐brain acquisition and was not LC specific, which may not reflect the quality of the visualization of the LC and its immediate vicinity.

CONCLUSION

5

In this work, we evaluate an improved version of a previously proposed, fully automatic approach to LC segmentation and feature extraction termed ELSI‐Net. Evaluation on LC imaging data acquired from young and older adults and from subjects across the AD dementia continuum shows that ELSI‐Net reliably generates anatomically plausible results with excellent agreement with established LC atlases given a consistent FLASH LC MRI protocol. We found increased objectivity with ELSI‐Net compared to single expert raters and a semi‐automatic LC segmentation method. LC features extracted by ELSI‐Net demonstrate high sensitivity, replicating previously shown subject group differences in healthy aging and AD dementia. We saw correlations of LC signal volume measured by ELSI‐Net with tau and amyloid pathology and robust performance with respect to data acquired across multiple sites.

ELSI‐Net provides a means to automatically segment the LC with high accuracy particularly in aging cohorts in the context of the FLASH LC MRI protocol. Further analyses are required to determine its effectiveness in segmenting the LC in different LC MRI contrasts; longitudinal datasets; and additional clinical cohorts, for example including subjects with Parkinson's disease, depression, and further neurological disorders.

CONFLICT OF INTEREST STATEMENT

The authors declare no potential conflicts of interest. Author disclosures are available in the supporting information.

Supporting information


Bibliography


1. Theofilas P, Ehrenberg AJ, Dunlop S, et al. Locus coeruleus volume and cell population changes during Alzheimer's disease progression: a stereological study in human postmortem brains with potential implication for early‐stage biomarker discovery. Alzheimer's & Dementia. 2017;13(3):236‐246. http://www.sciencedirect.com/science/article/pii/S1552526016326796. doi:10.1016/j.jalz.2016.06.2362. PMID: 27513978; PMCID: PMC5298942.
2. Braak H, Del Tredici K. Where, when, and in what form does sporadic Alzheimer's disease begin? Curr Opin Neurol. 2012;25(6):708‐714. doi:10.1097/WCO.0b013e32835a3432. PMID: 23160422.
3. Mather M, Harley CW. The Locus Coeruleus: essential for Maintaining Cognitive Function and the Aging Brain. Trends Cogn Sci. 2016;20(3):214‐226. https://www.cell.com/trends/cognitive-sciences/abstract/S1364-6613(16)00018-8. doi:10.1016/j.tics.2016.01.001. PMID: 26895736; PMCID: PMC4761411.
4. Betts MJ, Kirilina E, Otaduy MCG, et al. Locus coeruleus imaging as a biomarker for noradrenergic dysfunction in neurodegenerative diseases. Brain. 2019;142(9):2558‐2571. doi:10.1093/brain/awz193. PMID: 31327002; PMCID: PMC6736046.
5. Krohn F, Lancini E, Ludwig M, et al. Noradrenergic neuromodulation in ageing and disease. Neurosci Biobehav Rev. 2023;152:105311. https://www.sciencedirect.com/science/article/pii/S0149763423002804. doi:10.1016/j.neubiorev.2023.105311. PMID: 37437752.
6. Galgani A, Giorgi FS. Exploring the Role of Locus Coeruleus in Alzheimer's Disease: a Comprehensive Update on MRI Studies and Implications. Curr Neurol Neurosci Rep. 2023;23(12):925‐936. doi:10.1007/s11910-023-01324-9. PMID: 38064152; PMCID: PMC10724305.
7. Trujillo P, Aumann MA, Claassen DO. Neuromelanin‐sensitive MRI as a promising biomarker of catecholamine function. Brain. 2023:awad300. doi:10.1093/brain/awad300. PMID: 37669320; PMCID: PMC10834262.
8. Keren NI, Lozar CT, Harris KC, Morgan PS, Eckert MA. In vivo mapping of the human locus coeruleus. Neuroimage. 2009;47(4):1261‐1267. https://linkinghub.elsevier.com/retrieve/pii/S1053811909006235. doi:10.1016/j.neuroimage.2009.06.012. PMID: 19524044; PMCID: PMC3671394.