A first-of-its-kind two-body statistical shape model of the arthropathic shoulder: enhancing biomechanics and surgical planning

Justin Blackman; Joshua W. Giles

PMC · DOI:10.1186/s13018-025-05855-4·June 3, 2025

A first-of-its-kind two-body statistical shape model of the arthropathic shoulder: enhancing biomechanics and surgical planning

Justin Blackman, Joshua W. Giles

PDF

Open Access

TL;DR

This study creates a new statistical model of the shoulder that captures how arthropathic bones change together, improving biomechanical analysis and surgical planning.

Contribution

The first two-body statistical shape model of the arthropathic shoulder using clinical data, capturing coupled bone variations.

Findings

01

The combined model captured 43.2% of individual shape variability between the scapula and proximal humerus.

02

The model outperformed single-body models in generating realistic joint-level anatomy with reduced prediction bias.

03

Leave-one-out cross-validation errors were comparable to prior studies, validating the model's accuracy.

Abstract

Statistical Shape Models are machine learning tools in computational orthopedics that enable the study of anatomical variability and the creation of synthetic models for pathogenetic analysis and surgical planning. Current models of the glenohumeral joint either describe individual bones or are limited to non-pathologic datasets, failing to capture coupled shape variation in arthropathic anatomy. We aimed to develop a novel combined scapula-proximal-humerus model applicable to clinical populations. Preoperative computed tomography scans from 45 Reverse Total Shoulder Arthroplasty patients were used to generate three-dimensional models of the scapula and proximal humerus. Correspondence point clouds were combined into a two-body shape model using Principal Component Analysis. Individual scapula-only and proximal-humerus-only shape models were also created for comparison. The models were…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Species1

Homo sapiens(human · species)

Diseases2

erosions humeral head

Figures20

Click any figure to enlarge with its caption.

Generalisation Ability across Number of Principal Components for each SSM. (A) The distribution of distances between model outputs and the target shape point clouds (every odd Number of Principal Components shown), box plots denote observation quartiles and extremes and white circles denote distribution averages; and (B) The incremental decrease in mean average target shape point cloud distance for each additional Principal Component included in the SSM

Specificity across Number of Principal Components for each SSM. (A) The distribution of distances between model outputs and the closest training set shape point clouds (every odd Number of Principal Components shown), box plots denote observation quartiles and extremes and white circles denote distribution averages; and (B) The incremental increase in mean average target shape point cloud distance for each additional Principal Component included in the SSM

Compactness of each SSM. The number of principal components required to capture at least 90% of the total data set variation for each shape is designated by the vertical dashed lines and color-coordinated x-axis tick labels

Primary 6 Modes of Variation of the Scapula SSM, Lateral Oblique View. The extent of point displacement along the mode of variation is encoded in marker size (small to large) and colour (blue through yellow to red)Fig. 14Primary 6 Modes of Variation of the Proximal-Humerus SSM, Medial View. The extent of point displacement along the mode of variation is encoded in marker size (small to large) and colour (blue through yellow to red)Fig. 15Primary 6 Modes of Variation of the Combined SSM, Lateral Oblique View. The extent of point displacement along the mode of variation is encoded in marker size

Scapula-only SSM training set average point cloud-lateral, anterior, posterior, and superior viewsFig. 6Proximal-humerus-only SSM training set average point cloud-medial, anterior, posterior, and superior viewsFig. 7Combined two-body SSM training set average point cloud-anterior, posterior, and superior views

Decomposition of the modes of variation of the combined SSM into single-body SSM principal components

Single-body variation and cumulative coupled variation in each combined SSM principal component

Relative magnitude of covariance between combined SSM model parameter single-body contribution decompositions

Leave-One-Out Cross-Validation for Missing Scapulae and Proximal Humeri. X’s denote test observations, box plots denote observation quartiles and extremes, and white circles denote observation averages

Scapula landmarks — lateral, anterior, posterior, and superior viewsFig. 2Humerus landmarks — medial, anterior, posterior, and superior viewsTable 4Scapula landmark descriptionsLandmarksAntero-Superior Glenoid RimAnterior AcromionInferior Infraspinatus FossaPostero-Superior Glenoid RimLateral AcromionPostero-Inferior Base of SpineAntero-Inferior Glenoid RimMedial AcromionMedio-Superior Subscapular FossaPostero-Inferior Glenoid RimPosterior AcromionMedial Trapezius Surface of SpineAnterior CoracoidSuperior AngleTrigonum SpinaeLateral CoracoidInferior AngleMid-Medial BorderPosterior CoracoidNadi

Combined SSM and One-Body SSM Model Generated Shape Plausibility. The width of the shaded area depicts the density of shape instances with that plausibility, and the dashed lines represent plausibility quartiles for each sample group

Scapular volumes input to ShapeWorks Studio for point correspondenceFig. 4Proximal humeral volumes input to ShapeWorks studio for point correspondence

Leave-One-Out Cross-Validation Errors for SSMs and Training Set Means. X’s denote test observations, box plots denote observation quartiles and extremes, and white circles denote observation averages

Generalisation Ability across Number of Training Samples for each SSM. (A) The distribution of distances between model outputs and the target shape point clouds (every odd Training Set Size shown), box plots denote observation quartiles and extremes and white circles denote distribution averages; and (B) The incremental decrease in mean average target shape point cloud distance for each additional training set sample

Funding1

—Michaeal Smith Health Research BC, Canada

Keywords

Statistical modelingShoulder arthroplastySurgical planningShape analysisComputational orthopedicsBiomechanicsAI-CDSS

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsShoulder Injury and Treatment · Mechanics and Biomechanics Studies · Musculoskeletal pain and rehabilitation

Full text

Background

Statistical shape models (“SSMs”) [1] are machine learning applications to bony anatomy used to analyze the variability in anatomical geometry across a population. These models provide a compact representation of bone morphology and can be used in orthopaedics descriptively, for example to investigate the anatomical contributions to joint pathogenesis [2–4]. Alternatively, SSM can be used generatively, for example to produce large sets of realistic synthetic shapes for computational analysis of surgical intervention for joint reconstruction [5, 6]. Previous research has demonstrated the necessity of shoulder arthroplasty technique adjustment in light of variation in glenohumeral morphology [7, 8]; SSMs empower the underlying biomechanical studies that inform the optimal adjustment of prosthesis selection and placement for an individual patient. SSMs leverage artificial intelligence to identify complex patterns in bone morphology across populations, enabling patient categorization in novel and potentially more powerful ways than traditional methods, which are inherently constrained by human-driven, low-dimensional assessments.

Single-body SSMs for individual anatomical structures, such as the scapula [9–11] and the humerus [12, 13], among others, are common in the literature. These models effectively describe modes of variation, achieving good generalizability and strong correspondence with clinically relevant anatomical measurements. However single-body SSMs, even if using two models to describe both the scapula and humerus [14, 15], are inherently limited in their scope as these models focus on local morphology and fail to analyze anatomical relationships (dependence) at the joint level [16]. Independent Scapula and Humerus SSMs that were developed in parallel have demonstrated significant correlation between the two bodies [14], but do not quantify how these correlations couple along unified modes of variation, for which a combined two-body SSM is necessary.

Multi-body SSMs have been developed to analyze relationships between multiple anatomical structures [17, 18], including the scapula and humerus, some with an emphasis on relative positioning between bones [19, 20]. However, these models were predominantly trained on non-pathologic scans and are thus limited in their applicability to clinical populations undergoing arthroplasty procedures. SSMs of arthropathic specimens are exceedingly rare, with one previous scapula model constructed using clinical patient datasets [9], and no such humeral models to date. No two-body SSM that captures the simultaneous coupled variation of the scapula and proximal humerus in populations requiring Reverse Total Shoulder Arthroplasty (“RTSA”) has been published to date.

The primary objectives of this study were twofold. First, to develop and validate a combined SSM of the scapula and proximal humerus that captures coupled variation of the two bodies and is useful in surgical planning, and describing pathologic shoulder anatomy. Second, to develop an analysis framework of the combined two-body scapula-proximal-humerus SSM that intuitively characterizes its utility in descriptive and generative capacities compared to independent one-body SSMs. This work provides a primer on statistical shape modeling and the extension into multi-domain shape models.

Methods

All numerical methods and visualizations were implemented in Python 3.9.13 [21–25].

Subject 3D volume representations

Preoperative CT scans were obtained from a heterogeneous patient cohort ( $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ ) who subsequently underwent RTSA. Subject scans were reviewed to ensure the inclusion of the distal deltoid insertion at the deltoid tuberosity of the humerus. This landmark was used to define the distal termination point for the proximal humerus, ensuring anatomical uniformity across study subjects. Demographic data for the study population is provided in Table 1. Table 1. Demographic data of all subjects in the training setSex/SideQuantityAge (Mean ± SD)Male2271.8 ± 5.69Female2372.2 ± 6.75Left1870.3 ± 7.56Right2773.1 ± 4.67All4571.9 ± 6.27

The proximal humerus, rather than the entire humerus, was selected for this study based on its clinical relevance given that the full humerus is not included in a conventional pre-operative arthroplasty imaging series of the shoulder [26, 27]. Thus, clinical populations typically have preoperative computed tomography (“CT”) scans [28] that capture the proximal humerus, but not the distal portion. As a result, focusing on the proximal humerus aligns with real-world clinical workflows and ensures the model reflects the data most commonly encountered in practice. Furthermore, the shape of the entire humerus can often be inferred from the proximal humeral morphology, as it carries much of the biomechanical and anatomical information needed for joint-level analysis [29, 30]. Including the deltoid insertion as a distal landmark ensures the incorporation of critical biomechanical details relevant to shoulder function and surgical planning.

Three-dimensional volumetric models of the scapulae and proximal humeri were generated for all subjects through manual segmentation using the commercial software, Mimics version 23 (Materialise, Leuven, Belgium). The segmentation outputs were remeshed to achieve uniform 1 mm edge-length triangular surface meshes using commercial software, 3-matic version 15 (Materialise, Leuven, Belgium). Additionally, to standardize the data, left-sided scapulae and proximal humeri were reflected across the sagittal plane, resulting in a dataset consisting exclusively of right-sided appearing bony anatomy. Radiographic morphological characteristics of interest for the study cohort are provided in Table 2. Table 2. Morphologic data across subjects in the training setDimensionMeanSDScapula Height (mm) [9]15411.8Scapula Width (mm) [9]1047.63Scapula Aspect Ratio (Width/Height)0.6730.0410Glenoid Height (mm) [9]39.54.11Glenoid Width (mm) [9]29.13.48Acromion Length (mm) [9]45.76.45Lateral Acromion to Glenoid Center (mm) [9]28.45.05Coracoid Tip to Glenoid Center (mm) [9]16.34.19Posterior-Inferior Acromion to Glenoid Center (mm) [9]40.14.55Superior-Anterior Acromion to Glenoid Center (mm) [9]6.265.46Fulcrum Axis (°) [9]94.13.12Glenoid Inclination Angle (°) [9]97.64.15Glenoid Version Angle (°) [9]94.17.57Acromial Tilt Angle (°) [9]32.43.58Critical Shoulder Angle (°) [9]29.45.02Superior-Inferior Glenoid-Acromion Angle (°) [9]52.47.70Superior-Inferior Glenoid-Acromion-Coracoid Angle (°) [9]95.44.57Proximal Humeral Length (mm)15410.3Humeral Shaft Diameter (mm) [12]22.42.50Humeral Head Radius of Rotation (mm) [31]23.32.20Humeral Head Inclination Angle (°) [32]1347.60Humeral Head Medial Offset (mm) [33]0.2613.83Humerus Greater Tuberosity Angle (°) [34]64.15.00Posterior Static Subluxation of the Humerus (mm) [31]6.322.56

Centred and aligned point correspondences

ShapeWorks Studio [35] software was used to develop point correspondences across the subject scapulae and proximal humeri; the set of scapulae and set of proximal humeri were processed separately to allow for different optimization parameters based on the distinct needs of each bone set (Table 3). Table 3. Parameters used for the ShapeWorks Studio point correspondence optimizationsOptimization ParameterScapulaeHumeriGrooming AlignmentLandmarksLandmarksNumber of Landmarks253Number of Particles10241024Initial Relative Weighting0.30.05Relative Weighting61Starting Regularization5001000Ending Regularization110Iterations per Split5001000Optimization Iterations5001000Geodesic Distance?YesNoGeodesic Remesh100%N/ANormals?YesNoNormals Strength7N/AUse Initial Landmarks?YesYesNarrow Band44

The 3D volume representations were first groomed in ShapeWorks Studio [36] by centering to a common origin, and then aligned using manually placed landmarks (Figs. 1 and 2, Tables 4 and 5).Fig. 1. Scapula landmarks — lateral, anterior, posterior, and superior viewsFig. 2Humerus landmarks — medial, anterior, posterior, and superior viewsTable 4Scapula landmark descriptionsLandmarksAntero-Superior Glenoid RimAnterior AcromionInferior Infraspinatus FossaPostero-Superior Glenoid RimLateral AcromionPostero-Inferior Base of SpineAntero-Inferior Glenoid RimMedial AcromionMedio-Superior Subscapular FossaPostero-Inferior Glenoid RimPosterior AcromionMedial Trapezius Surface of SpineAnterior CoracoidSuperior AngleTrigonum SpinaeLateral CoracoidInferior AngleMid-Medial BorderPosterior CoracoidNadir of Suprascapular NotchLateral Projection of Axillary BorderMedial Suprascapular NotchLateral Superior Scapular NotchApex of Coracoid ConcavitySpinoglenoid NotchTable 5Humerus landmark descriptionsLandmarksAntero-Lateral Lesser TuberosityAntero-Medial Greater TuberosityDistal Greater Tuberosity Ridge

The same landmarks were also used for correspondence point initialization. The difference in the number of landmarks used for the scapulae compared to the proximal humeri is due to the relative complexity of the scapular shape, which required substantially more landmarks to achieve acceptable correspondence point dispersion across the shape. Correspondence point cloud placement optimizations resulted in $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ corresponding scapula point-clouds, each with 1600 points defined by three-dimensional coordinate sets $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{({x}_{i } , {y}_{i} , {z}_{i})| i\,\epsilon\,{\mathbb{N}}\le 1600\}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ corresponding proximal-humerus point clouds, each with 1536 points defined by three-dimensional coordinate sets $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{({x}_{i }, {y}_{i} , {z}_{i})| i\,\epsilon\,{\mathbb{N}}\le 1536\}$$\end{document}$ .

Matrix representation of correspondence point clouds

The scapula point clouds were represented as vectors through simple concatenation of the point coordinates, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$S = \{ {s}_{j }= [{x}_{1},{y}_{1},{z}_{1},{x}_{2},{y}_{2},{z}_{2}, ... , {x}_{m}, {y}_{m},{z}_{m}{]}^{T}|m =1600, j\,\epsilon\,{\mathbb{N}}\le 45 \}$$\end{document}$ . Similarly, the proximal-humerus point clouds were represented as the set of vectors $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H = \{ {h}_{j }= [{x}_{1},{y}_{1},{z}_{1},{x}_{2},{y}_{2},{z}_{2}, ... , {x}_{m}, {y}_{m},{z}_{m}{]}^{T}|m =1536, j\,\epsilon\,{\mathbb{N}}\le 45 \}$$\end{document}$ . Matching scapula and proximal-humerus pairs were also combined through simple concatenation to produce a set of two-body corresponding point-cloud vectors, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$C = \{ {c}_{j }= [{s}_{j}, {h}_{j}{]}^{T}|j\,\epsilon\,{\mathbb{N}}\le 45 \}$$\end{document}$ , with each $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${c}_{j}$$\end{document}$ a $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$3m=3(1600+1536) =9408$$\end{document}$ element vector. Each set $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$S$$\end{document}$ , $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H$$\end{document}$ , and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$C$$\end{document}$ was then represented as a matrix through columnar concatenation of the set’s constituent vector elements, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$X \to X = [ {x}_{1}, {x}_{2}, ... , {x}_{n} ]$$\end{document}$ .

(ℓ2-norm) principal component analysis

Principal Component Analysis (“PCA”) [37] proceeded on each of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$S$$\end{document}$ , $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H$$\end{document}$ , and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\varvec{C}}$$\end{document}$ (or variants thereof when withholding subjects from the training sets in accordance with the following sections) in the usual fashion with the computation of the set’s covariance matrix, and the subsequent eigenvector decomposition thereof. The data on which the PCA was performed was not preprocessed by scaling, to maintain overall size variation information in the model. The data was centred such that each scapula and each proximal humerus has its centroid at the origin; centering consists of a three dimensional translation that does not affect shape and ensures that PCA captures shape and not position (or “pose” [19]) variation. A training set of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n$$\end{document}$ samples will yield only $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n-1$$\end{document}$ non-trivial eigenvectors, which can be explained by understanding the training set mean as the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n$$\end{document}$ -th degree of freedom in the system. Analogously, the span of two points in three-dimensional space can be defined by a (one-dimensional) vector and an offset, the span of three points in three-dimensional space can be defined by a (two-dimensional) plane and an offset, and so on. PCA results in sorted matrices of eigenvectors, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{S}$$\end{document}$ , $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{H}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{C}$$\end{document}$ , where each column in a matrix forms an eigenvector (or “principal component” or “mode of variation”) with corresponding eigenvalue greater than the corresponding eigenvalues of all eigenvectors to the right of it, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{X} = [{\varphi }_{1} ,{\varphi }_{2} , ... , {\varphi }_{n-1} ]$$\end{document}$ , where $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\varphi }_{i}\leftrightarrow {\lambda }_{i} , {\lambda }_{i}\ge {\lambda }_{i+1}$$\end{document}$ .

The matrices of sorted eigenvectors $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{S}$$\end{document}$ , $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{H}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{C}$$\end{document}$ paired with their accompanying training set means $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{{\varvec{x}} }_{{\varvec{S}}}}$$\end{document}$ , $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{{\varvec{x}} }_{{\varvec{H}}}}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{{\varvec{x}} }_{{\varvec{C}}}}$$\end{document}$ comprise three distinct SSMs, describing shape variations in scapulae, proximal humeri, and both scapulae and proximal humeri simultaneously, respectively. That is to say, given an eigenvector matrix $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Phi$$\end{document}$ , its corresponding eigenvalues $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\lambda }_{i}$$\end{document}$ , and the training data matrix $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$X$$\end{document}$ , we can produce new shapes that are similar to the original set using $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$y =\overline{x }$$\end{document}$ + $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Phi {\varvec{b}}$$\end{document}$ , where $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{x } =\left\{[ \overline{{x }_{1}},\overline{{x }_{2}}, ... ,\overline{{x }_{3m}}{]}^{T} |\overline{{x }_{i}}=\frac{1}{n}{\sum }_{j=1}^{n}{X}_{i,j}\right\}$$\end{document}$ , and where we can assume $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{i} \sim N(0,{\lambda }_{i})$$\end{document}$ such that $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\sum }_{i=1}^{n-1}\frac{{{b}_{i}}^{2}}{{\lambda }_{i}}\le {\chi }_{\alpha , n-1}^{2}$$\end{document}$ for desired certainty level α [38].

Distance measures

A suitable metric with which to measure the difference between two point clouds is necessary in order to assess the performance of the generated SSMs. For example, there is a need to quantify the error of a generated model output, itself a point cloud or equivalently a vector, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${o}= [{x}_{o,1},{y}_{o,1},{z}_{o,1},{x}_{o,2},{y}_{o,2},{z}_{o,2}, ... , {x}_{o,m}, {y}_{o,m},{z}_{o,m}{]}^{T}$$\end{document}$ , and the target point cloud attempted at being represented by the SSM, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${t}= [{x}_{t,1},{y}_{t,1},{z}_{t,1},{x}_{t,2},{y}_{t,2},{z}_{t,2}, ... , {x}_{t,m}, {y}_{t,m},{z}_{t,m}{]}^{T}$$\end{document}$ (i.e. error) denoted as $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d(o,t)$$\end{document}$ .

The Root Mean Square (RMS) difference between vectors is often selected as the distance measure in the evaluation of SSMs [9, 11, 14, 18, 29], both for consistency with the interpretation of ℓ2-norm PCA as a minimization of a sum of squared errors of vector elements, and the relative ease of calculation:

\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$RMS({\varvec{o}},{\varvec{t}})=\sqrt{\frac{1}{3m}{\sum }_{i=1}^{3m}({o}_{i}-{t}_{i}{)}^{2}}=\frac{1}{\sqrt{3m}}\sqrt{{\sum }_{i=1}^{m}\left(({x}_{o,i}-{x}_{t,i}{)}^{2}+({y}_{o,i}-{y}_{t,i}{)}^{2}+({z}_{o,i}-{z}_{t,i}{)}^{2}\right)}.$$\end{document}

However, this metric does not have a direct intuitive interpretation as to how the value translates to errors in model representations in 3D space. For this reason, the average Euclidean point distance between point clouds was instead selected as a distance measure:

\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d({\varvec{o}},{\varvec{t}})= \frac{1}{m}{\sum }_{i=1}^{m}\sqrt{({x}_{o,i}-{x}_{t,i}{)}^{2}+({y}_{o,i}-{y}_{t,i}{)}^{2}+({z}_{o,i}-{z}_{t,i}{)}^{2}}.$$\end{document}

Care must be taken when comparing results reporting $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d(o,t)$$\end{document}$ with those reporting $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$RMS(o,t)$$\end{document}$ , as typically $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d(o,t) >RMS(o,t)$$\end{document}$ given Jensen’s inequality [39] for concave $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$f(x) = \sqrt{x}$$\end{document}$ .

The Hausdorff metric [40, 41], the longest Euclidean distance between nearest neighbours across two point clouds, was selected as a measure of worst-case distance as is convention in the literature of computer vision and SSMs [9, 11, 15, 42, 43]. Considering each point cloud vector a set of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$m$$\end{document}$ points, and each point itself a vector of 3D coordinates, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s=\{ s=[{x}_{i }, {y}_{i}, {z}_{i}]| i\,\epsilon\,{\mathbb{N}}\le m\}$$\end{document}$ , the following (bidirectional) Hausdorff metric is used [44]:

\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${d}_{H}({\varvec{o}},{\varvec{t}})=max\left[ma{x}_{o \,\epsilon\, {\varvec{o}}}\left\{mi{n}_{t \,\epsilon\, {\varvec{t}} } d(o,t)\right\} , {max}_{t \,\epsilon\, {\varvec{t}}}\left\{mi{n}_{o \,\epsilon\, {\varvec{o}} } d(o,t)\right\}\right]$$\end{document}

Model descriptive ground truth accuracy

The Ground Truth Accuracy [45] of the generated SSMs was quantified via leave-one-out cross-validation. This test assesses the ability of the SSMs to describe a single shape outside of their training sets, and was repeated to iteratively exclude and subsequently model each of the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ subjects from the data sets. Thus, the results for each SSM form a distribution of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ test observations.

For each test, each of the three SSMs is regenerated using a modified training set, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${X}_{i}$$\end{document}$ , that excludes the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i$$\end{document}$ -th subject $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${x}_{i}$$\end{document}$ . We generate a parameter vector of best fit using all $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\left(n-1\right)-1=43$$\end{document}$ resulting non-trivial eigenvectors as principal components, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$b ={{\Phi }^{+}(x}_{i}-\overline{{x }_{i}})$$\end{document}$ , where $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{x }_{i}}$$\end{document}$ is the average shape of the training set $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${X}_{i}$$\end{document}$ . Here $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }^{+}$$\end{document}$ is the transpose of the eigenvector matrix (as $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Phi {\Phi }^{T}={I}_{n}$$\end{document}$ by virtue of its semi-unitary semi-orthogonality and dimensions resulting in left-invertibility [46]). To ensure plausibility of the SSM representation in alignment with the assumption of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{i} \sim N(0,{\lambda }_{i})$$\end{document}$ , the resulting $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$b$$\end{document}$ values are regulated, first by scaling individual elements such that $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\left|{b}_{i}\right|\le 3\sqrt{{\lambda }_{i}}$$\end{document}$ , and subsequently by applying the uniform scaling factor $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s$$\end{document}$ closest to $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1$$\end{document}$ such that $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\sum }_{i=1}^{n-1}\frac{{({s \times b}_{i})}^{2}}{{\lambda }_{i}}\le {\chi }_{0.997, n-1}^{2}$$\end{document}$ , resulting in $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\widehat{b}$$\end{document}$ that resides within the Mahalanobis hyperellipsoid [47] defining the region of acceptable plausibility [38]. The determination of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s$$\end{document}$ was implemented as a constrained quadratic programming [48] optimization problem minimizing $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(s-1{)}^{2}$$\end{document}$ . The SSM representation $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${y}_{i}$$\end{document}$ of the target shape $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${x}_{i}$$\end{document}$ , is then calculated $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${y}_{i} = \overline{{x }_{i}}$$\end{document}$ + $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Phi \widehat{{\varvec{b}}}$$\end{document}$ for which the average Euclidean point distance $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d({y}_{i},{x}_{i})$$\end{document}$ and Hausdorff metric $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${d}_{H}({y}_{i},{x}_{i})$$\end{document}$ are measured. The Euclidean point distance and Hausdorff metric between the training set average $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{x }_{i}}$$\end{document}$ and the target left-out sample $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${x}_{i}$$\end{document}$ are also calculated to compare the extent of SSM prediction accuracy that is attributable to the inclusion of the simple mean versus the inclusion of the weighted principal components.

Model generalisation ability

The concept of model Ground Truth Accuracy is generalised by assessing the ability of an SSM to describe a single shape outside of its training set while simultaneously restricting either the number of samples included in the training set or the number of principal components included in the SSM. At the extreme, when including a full training set of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n-1$$\end{document}$ samples and all available $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n-2$$\end{document}$ non-trivial principal components (given that one training set sample is left out), the measures of Generalisation Ability reduce to the model Ground Truth Accuracy; however by restricting these model design criteria, we gain insight into the sufficiency of model training set sizes and the model tolerances for dimensional reduction.

Model generalisation ability: number of training samples

For each SSM, Generalisation Ability for number of training samples [45] is assessed through leave-one-out cross-validation calculating the average Euclidean point distance between the left-out target and the model-generated shape, repeated across training sample sizes $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k=\{\text{2,3}, ..., n-1\}$$\end{document}$ , and iterated over the entire data set for each training sample size; each cross validation instance trains the SSMs on $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k$$\end{document}$ randomly selected samples using all $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k-1$$\end{document}$ non-trivial principal components. Thus, the result for each SSM at each training set size is a distribution of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ test observations.

Model generalisation ability: number of principal components

For each SSM, Generalisation Ability for number of principal components [45] is assessed through leave-one-out cross-validation calculating the average Euclidean point distance between the left-out target and the model-generated shape, repeated across number of eigenvectors included as principal components $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k=\{\text{1,2},3, ..., n-2\}$$\end{document}$ , and iterated over the entire data set for each number of principal components. Thus, the result for each SSM at each number of principal components is a distribution of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ test observations.

Model specificity

The propensity of an SSM to produce only realistic shapes that are similar to the original training set is termed model Specificity [45]. Each SSM is trained on the entire data set of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ samples, then a large set of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$M=\text{10,000}$$\end{document}$ generated shapes are created by generating parameter vectors $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$b$$\end{document}$ with randomly sampled $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{i} \sim N(0,{\lambda }_{i})$$\end{document}$ and iterating the entire process across a varying number of principal components included in the SSM, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k=\{\text{1,2},3, ..., n-1\}$$\end{document}$ . The results for each SSM at each number of principal components are distributions of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$M$$\end{document}$ observations of the average Euclidean point distances and Hausdorff metrics between the generated shapes and the closest shapes within the training set.

Model compactness

SSM Compactness explores the number of eigenvectors required as principal components included in an SSM in order to describe a desired amount of variation observed within the training set [45]. Each constituent principal component has a corresponding eigenvalue $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\lambda }_{i}$$\end{document}$ , itself a quantification of the observed training set variation explained by the principal component. For a given number of principal components, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k$$\end{document}$ , we define Compactness of an SSM described by eigenvector matrix $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{X}$$\end{document}$ as the proportion of total training set variance described by the first $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k$$\end{document}$ principal components: $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c(\Phi ,k)=\frac{{\sum }_{i=1}^{k}{\lambda }_{i}}{{\sum }_{i=1}^{n-1}{\lambda }_{i}}$$\end{document}$ . Compactness plots are produced by varying $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k\in {\mathbb{N}}\le n-1$$\end{document}$ .

SSM modes of variation

The range of shapes occurring along one of the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n-1$$\end{document}$ modes of variation of an SSM, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\varphi }_{i}$$\end{document}$ , can be investigated by starting with the average shape across the training set $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{\varvec{x}} }$$\end{document}$ , and generating model outputs $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{y}}}_{i,j} = \overline{{\varvec{x}} }$$\end{document}$ + $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\varphi }}_{i}{b}_{i,j}$$\end{document}$ while sweeping across the feasible range of the corresponding model parameter, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-3\sqrt{{\lambda }_{i}}\le {b}_{i,j}\le 3\sqrt{{\lambda }_{i}}$$\end{document}$ .

This analysis provides insight into the shape variations that are characteristic of the modelled bone across the study population, the relative importance of each shape variation in describing the heterogeneity of the study population, and a point of comparison between this work and previous scapular and proximal-humeral SSMs.

Quantification of coupled scapulohumeral variation

An understanding of the characteristic variations described by the single-body SSMs can be applied to the Combined scapula and proximal-humerus SSM by expressing the principal components of the Combined SSM as linear combinations of the principal components of the single-body SSMs. This decomposition also facilitates addressing the utility of a combined two-body model as opposed to using two independent single-body models when generating synthetic scapulohumeral shapes.

Given that the scapula and humerus data set vectors are of the form $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[{x}_{1},{y}_{1},{z}_{1},{x}_{2},{y}_{2},{z}_{2}, ... , {x}_{m}, {y}_{m},{z}_{m}{]}^{T}$$\end{document}$ and the combined data set vectors are in turn of the form $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$[{s}_{j }, {h}_{j}{]}^{T}$$\end{document}$ , it follows that eigenvectors of the combined SSM are of the form $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\left[{\phi }_{S},{\phi }_{H}\right]}^{T}$$\end{document}$ such that the combined SSM eigenvector matrix $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{C}$$\end{document}$ consists of an upper portion $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{C,S}$$\end{document}$ relating to the scapula and a lower portion $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\Phi }_{C,H}$$\end{document}$ relating to the humerus, which can be decomposed separately.

For each constituent eigenvector in $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\Phi }}_{C,S}$$\end{document}$ (denoted $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\varphi }}_{C,S,j}$$\end{document}$ ) we seek a linear combination of constituent eigenvectors in $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\Phi }}_{S}$$\end{document}$ defined as vector $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{\gamma}}}_{j}$$\end{document}$ , such that $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\varphi }}_{C,S,j}={{\boldsymbol{\Phi }}_{S}}^{T}{{\varvec{\gamma}}}_{j} \Rightarrow {{\varvec{\gamma}}}_{j} ={{\boldsymbol{\Phi }}_{S}}^{+}{\boldsymbol{\varphi }}_{C,S,j}$$\end{document}$ , where $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\boldsymbol{\Phi }}_{S}}^{+}$$\end{document}$ is simply the eigenvector matrix (as $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\boldsymbol{\Phi }{\boldsymbol{\Phi }}^{{\varvec{T}}}={{\varvec{I}}}_{n}$$\end{document}$ by virtue of its semi-unitary semi-orthogonality and dimensions resulting in left-invertibility).

The single-body eigenvector linear combination weights $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\gamma }_{j}$$\end{document}$ are converted into directionless (absolute) proportions of contribution magnitude to the resulting combined model eigenvector $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\varphi }}_{C,S,j}$$\end{document}$ by $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\alpha }}_{S,j}=\left[{\alpha }_{S,i}=\frac{\left|{\gamma }_{{j}_{i}}\right|}{{\sum }_{k=1}^{n-1}\left|{\gamma }_{{j}_{k}}\right|}\right]$$\end{document}$ where $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\gamma }_{j}}_{i}$$\end{document}$ denotes the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i$$\end{document}$ -th element of the vector $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{\gamma}}}_{j}$$\end{document}$ . The resulting $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\boldsymbol{\alpha }}_{S,j}}^{T}$$\end{document}$ vectors are row-concatenated to form weighting matrix $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{S}$$\end{document}$ which can subsequently be visualised through plotting as a heatmap; the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i,j$$\end{document}$ -th element of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{S}$$\end{document}$ corresponds to the proportion of contribution of the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$j$$\end{document}$ -th scapula single-body SSM eigenvector to the scapular component of the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i$$\end{document}$ -th Combined SSM eigenvector.

Noting that the eigenvalues resulting from PCA encode the proportion of total training set variance explained by the accompanying eigenvector by $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{{\lambda }_{i}}{{\sum }_{j=1}^{n-1}{\lambda }_{j}}$$\end{document}$ [37], and that the variance of a linear combination of orthogonal elements $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Y = {\alpha }_{1}{X}_{1}+ ... +{\alpha }_{m}{X}_{m}$$\end{document}$ is $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\alpha }_{1}}^{2}VAR[{X}_{1}]+ ... + {{\alpha }_{m}}^{2}VAR[{X}_{m}]$$\end{document}$ [49], we can calculate the proportion of scapula dataset variance described by the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$j$$\end{document}$ -th principal component of the Combined SSM as $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${v}_{C,S,j}={{{\varvec{\gamma}}}_{j}}^{*2}{{\varvec{\lambda}}}_{s}$$\end{document}$ where $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{{\varvec{\gamma}}}_{j}}^{*2}$$\end{document}$ denotes item-wise squared vector, and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{\lambda}}}_{s}$$\end{document}$ is a vector of the proportions of total scapula training set variance described by each Scapula SSM principal component.

The constituent eigenvectors of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\Phi }}_{C,H}$$\end{document}$ are decomposed into linear combinations of the constituent eigenvectors of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\Phi }}_{H}$$\end{document}$ , processed into $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{H}$$\end{document}$ , and used to calculate $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${v}_{C,H,j}$$\end{document}$ , analogously.

The coupled variations for single body shapes that coincide in one principal component of the Combined SSM, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\varphi }}_{C,i}$$\end{document}$ , are then given by $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${v}_{J,j}= min\left({v}_{C,S,j} , {v}_{C,H,j}\right)$$\end{document}$ .

Prediction of missing counterpart

The consideration of a two-body Combined SSM rather than two independent single-body SSMs naturally leads to the question of model prediction accuracy in contexts where only one body is provided (e.g. in medical imaging) and the shape of the matching second body must be predicted. With the two independent single-body SSMs, no coupled variation between the two bodies is modeled, and so the missing body can only be estimated as the training set mean, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$E({{\varvec{x}}}_{2,i}|{{\varvec{x}}}_{1,i}) =\overline{{{\varvec{x}} }_{2}}$$\end{document}$ [50].

However, the two-body Combined SSM can be assessed for coupled variation between the two bodies; if the coupled variation is present, it may be used to improve the prediction accuracy beyond that of the training set mean. Specifically, recalling that Combined SSM model parameters are calculated as $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{i} ={\boldsymbol{\Phi }}_{C}^{T}({{\varvec{x}}}_{i}-\overline{{\varvec{x}} })$$\end{document}$ and noting that $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\Phi }}_{C}$$\end{document}$ , $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{x}}}_{i}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{\varvec{x}} }$$\end{document}$ can each be decomposed into constituent scapular and proximal humeral components, results in the following two-body decomposition for Combined SSM Model parameters:

\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{{\varvec{i}}}={{\varvec{b}}}_{S,i} +{{\varvec{b}}}_{H,i} =={\boldsymbol{\Phi }}_{C,S}^{T}({{\varvec{x}}}_{S,i}-\overline{{{\varvec{x}} }_{S}})+{\boldsymbol{\Phi }}_{C,H}^{T}({{\varvec{x}}}_{H,i}-\overline{{{\varvec{x}} }_{H}}).$$\end{document}

Linear covariation between $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{S}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{H}$$\end{document}$ can be assessed through the covariance matrix (the derivation of which is omitted for brevity), $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Cov({{\varvec{b}}}_{S},{{\varvec{b}}}_{H})=\left({\boldsymbol{\Phi }}_{C,S}^{T}{ \boldsymbol{\Sigma }}_{S,H}{\boldsymbol{\Phi }}_{C,H}\right)$$\end{document}$ , where $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Cov({{\varvec{x}}}_{S},{{\varvec{x}}}_{H})=\left[\begin{array}{cc}{\boldsymbol{\Sigma }}_{S,S}& {\boldsymbol{\Sigma }}_{S,H}\\ {\boldsymbol{\Sigma }}_{H,S}& {\boldsymbol{\Sigma }}_{H,H}\end{array}\right]$$\end{document}$ [51]. The relative magnitude of covariance between the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i$$\end{document}$ -th scapular model component contribution and the $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$j$$\end{document}$ -th proximal-humeral model component contribution to the Combined SSM model parameters across the training set, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{B}$$\end{document}$ , can be visualised by a heatmap of the absolute values of the covariance, standardised to the maximum absolute element in the covariance matrix.

In light of the above decomposition, the naive estimate for the missing shape (i.e. the training set mean shape), $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$E({{\varvec{x}}}_{2,i}|{{\varvec{x}}}_{1,i}) =E({{\varvec{x}}}_{2,i}) = \overline{{{\varvec{x}} }_{2}}$$\end{document}$ can be understood as assuming $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{2,i}\approx 0$$\end{document}$ , since $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{x}}}_{2,i} ={\boldsymbol{\Phi }}_{C,2}\left({{\varvec{b}}}_{i} -{{\boldsymbol{\Phi }}_{C,1}}^{T}({{\varvec{x}}}_{1,i}-\overline{{{\varvec{x}} }_{1}})\right)+\overline{{{\varvec{x}} }_{2}}$$\end{document}$ and therefore $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{i}-{{\varvec{b}}}_{1,i}=0$$\end{document}$ . With a Combined SSM, in the event of correlation between $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{1,i}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{2,i}$$\end{document}$ , $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{2,i}$$\end{document}$ can instead be predicted from $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{1,i}$$\end{document}$ , for example through a form of regression, as $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$E({{\varvec{x}}}_{2,i}|{{\varvec{x}}}_{1,i}) =\overline{{{\varvec{x}} }_{2}}$$\end{document}$ + $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\boldsymbol{\Phi }}_{C,2}E({{\varvec{b}}}_{2,i}|{{\varvec{b}}}_{1,i})$$\end{document}$ .

Leave-one-out cross-validation is used to quantify the relative accuracy in predicting the shape of missing bodies between the independent single-body SSMs and the Combined SSM. The SSMs are first trained on $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n-1$$\end{document}$ paired scapula-proximal-humerus samples, and then predict the shape of a missing scapula based on the corresponding proximal humerus, and then vice versa. The independent single-body SSMs predict the missing bodies using the training set means, and the Combined SSM predicts the missing body by calculating $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{{\varvec{b}}}_{1,i}= \boldsymbol{\Phi }}_{C,1}^{T}({{\varvec{x}}}_{1,i}-\overline{{{\varvec{x}} }_{1}})$$\end{document}$ , using a Random Forest Regressor to predict $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{2,i}$$\end{document}$ from $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{1,i}$$\end{document}$ , filtering the predicted $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{b}}}_{2,i}$$\end{document}$ into $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\widehat{{{\varvec{b}}}_{2,i}}$$\end{document}$ using the same procedure as for the Ground Truth and Generalisation tests, and then generating a shape prediction $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{y}}}_{2,i} = \overline{{{\varvec{x}} }_{2,i}}+ {\boldsymbol{\Phi }}_{C,2} \widehat{{{\varvec{b}}}_{2,i}}$$\end{document}$ . The results for the two methods form distributions of $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=45$$\end{document}$ average Euclidean point distance and Hausdorff metric observations for the scapula and proximal humerus.

De Novo generation of synthetic populations

Synthetic populations for computational analyses must consist of individuals that are sufficiently realistic, while collectively capturing the entire range of possible variations expected within the emulated population. In other words, an unbiased generated population would produce a uniform distribution when evaluated through the population’s cumulative distribution function (if known). This is the property that is exploited in Inverse Transform Sampling [52], which is effectively the opposite process of how we proceed. We define the plausibility of a shape as the probability of observing a shape that is more extreme than it, calculated as 1 minus the cumulative distribution function value for the shape.

When generating synthetic shapes from an SSM the model parameters are often constrained, such as $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\left|{b}_{i}\right|\le 3\sqrt{{\lambda }_{i}}$$\end{document}$ , due to the underlying assumption of independent $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{i} \sim N(0,{\lambda }_{i})$$\end{document}$ [45]. As a result of this assumption, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sum \frac{{{b}_{i}}^{2}}{{\lambda }_{i}}\le {\chi }_{\alpha , n-1}^{2}$$\end{document}$ [47] and therefore the probability of obtaining a set of model parameters $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$b$$\end{document}$ less than a particular $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\widehat{b}$$\end{document}$ is given by the cumulative distribution function $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${F}_{{\chi }^{2}}\left(\sum \frac{{{\widehat{b}}_{i}}^{2}}{{\lambda }_{i}},n-1\right)$$\end{document}$ [38]. To assess for relative bias in the generation of synthetic shapes when using two independent single-body SSMs compared to using a single Combined SSM, a large set of scapula and proximal humerus shapes is generated by sampling from $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{S,i} \sim N(0,{\lambda }_{S,i})$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{H,i} \sim N\left(0,{\lambda }_{H,i}\right)$$\end{document}$ $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$M=\text{10,000}$$\end{document}$ times, fitting Combined SSM model parameters $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{C,i}$$\end{document}$ to each generated shape pair, and computing the plausibility ( $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1-{F}_{{\chi }^{2}}$$\end{document}$ ) for each shape pair. The plausibility is an inverse measure of a generated shape’s deviation from the training set mean. Similarly, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$M=\text{10,000}$$\end{document}$ $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{C,i}$$\end{document}$ are sampled directly from the Combined SSM model parameter distribution $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{C,i} \sim N(0,{\lambda }_{C,i})$$\end{document}$ and the resulting plausibilities are calculated. Since the plausibilities of generated shapes would be uniformly distributed between 0 and 1 if the shape generating mechanism is unbiased, the plausibilities are visualized and assessed using the Kolmogorov–Smirnov (“KS”) Test [53] for the hypothesis of non-uniformity.

Results

Subject 3D volume representations

The 45 scapular and proximal-humeral samples resulting from segmentation, remeshing, and reflection are depicted in Figs. 3 and 4.Fig. 3. Scapular volumes input to ShapeWorks Studio for point correspondenceFig. 4Proximal humeral volumes input to ShapeWorks studio for point correspondence

Generated statistical shape models

Visualisations of the training set averages $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{\varvec{x}} }$$\end{document}$ for the Scapula SSM (Fig. 5), Proximal-Humerus SSM (Fig. 6), and Combined SSM (Fig. 7), follow.Fig. 5. Scapula-only SSM training set average point cloud-lateral, anterior, posterior, and superior viewsFig. 6Proximal-humerus-only SSM training set average point cloud-medial, anterior, posterior, and superior viewsFig. 7Combined two-body SSM training set average point cloud-anterior, posterior, and superior views

Model descriptive ground truth accuracy

The results of the leave-out-one cross-validation tests are plotted in Fig. 8. Ground Truth Accuracy is markedly improved when using an SSM compared to using the mean shape of the training set. The Scapula and Combined SSMs have consistently higher errors than the Proximal-Humerus SSM.Fig. 8. Leave-One-Out Cross-Validation Errors for SSMs and Training Set Means. X’s denote test observations, box plots denote observation quartiles and extremes, and white circles denote observation averages

Model generalisation ability: number of training samples

The Generalisation of each SSM across training set size is provided in Fig. 9. These results consist of error distributions with decreasing means and ranges plus a random error signal introduced by the random sampling with which the training sets were selected. The Combined SSM shows superior performance initially but converges with the Scapula SSM, whereas the Proximal-Humerus SSM consistently achieves the lowest point distances. The mean average point distances for all three SSMs exhibit similar convergence patterns as functions of training set size.Fig. 9. Generalisation Ability across Number of Training Samples for each SSM. (A) The distribution of distances between model outputs and the target shape point clouds (every odd Training Set Size shown), box plots denote observation quartiles and extremes and white circles denote distribution averages; and (B) The incremental decrease in mean average target shape point cloud distance for each additional training set sample

Model generalisation ability: number of principal components

The Generalisation of each SSM across the number of included principal components is provided in Fig. 10. Similar to the previous section, these results consist of error distributions with decreasing means and ranges. The Scapula and Combined SSM perform comparably, and the Proximal-Humerus SSM has consistently lower point distances. The mean average point distances for all three SSMs exhibit similar convergence patterns as functions of training set size, with the Proximal-Humerus SSM converging first, followed by the Scapula SSM, and finally the Combined SSM.Fig. 10. Generalisation Ability across Number of Principal Components for each SSM. (A) The distribution of distances between model outputs and the target shape point clouds (every odd Number of Principal Components shown), box plots denote observation quartiles and extremes and white circles denote distribution averages; and (B) The incremental decrease in mean average target shape point cloud distance for each additional Principal Component included in the SSM

Model specificity

The Specificity of each SSM across the number of included principal components, with $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$M=\text{10,000}$$\end{document}$ randomly generated shape instances for each test, is provided in Fig. 11. The SSMs display first-order dynamics, with generated shapes initially closely resembling the training sets and gradually converging at greater average distances. The Combined SSM performs similarly to the Scapula SSM but with a reduced range, whereas the Proximal-Humerus SSM shows significantly lower mean average point distances and a narrower distance range compared to the other two.Fig. 11. Specificity across Number of Principal Components for each SSM. (A) The distribution of distances between model outputs and the closest training set shape point clouds (every odd Number of Principal Components shown), box plots denote observation quartiles and extremes and white circles denote distribution averages; and (B) The incremental increase in mean average target shape point cloud distance for each additional Principal Component included in the SSM

Model compactness

The total training set variation explained by number principal components, for each SSM, is provided in Fig. 12. The Proximal-Humerus SSM is the most compact, followed by the Scapula SSM, with the Combined SSM being the least compact.Fig. 12. Compactness of each SSM. The number of principal components required to capture at least 90% of the total data set variation for each shape is designated by the vertical dashed lines and color-coordinated x-axis tick labels

SSM modes of variation

The shape variations contributed by the first 6 principal components (“modes”) of the Scapula SSM are depicted in Fig. 13 by visualising the extent of point displacement caused by adding three standard deviations along each mode of variation, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$3\sqrt{{\lambda }_{i}}{ \boldsymbol{\varphi }}_{i}$$\end{document}$ , to the training set average point cloud $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\overline{{\varvec{x}} }$$\end{document}$ . The same is depicted for the Proximal-Humerus SSM in Fig. 14, and for the Combined SSM in Fig. 15.Fig. 13. Primary 6 Modes of Variation of the Scapula SSM, Lateral Oblique View. The extent of point displacement along the mode of variation is encoded in marker size (small to large) and colour (blue through yellow to red)Fig. 14. Primary 6 Modes of Variation of the Proximal-Humerus SSM, Medial View. The extent of point displacement along the mode of variation is encoded in marker size (small to large) and colour (blue through yellow to red)Fig. 15. Primary 6 Modes of Variation of the Combined SSM, Lateral Oblique View. The extent of point displacement along the mode of variation is encoded in marker size (small to large) and colour (blue through yellow to red)

The range of radiographic morphologic characteristics of scapular anatomy [9] observed when varying the Scapula SSM across three standard deviations in the first three modes is listed in Table 6. The same is depicted for the Proximal-Humerus SSM [31–34] in Table 7, and for the Combined SSM in Table 8. Table 6. Morphologic characteristics across scapula SSM modesDimensionMeanMode 1 ± 3SDRange (Min—Max)Mode 2 ± 3SDRange (Min—Max)Mode 3 ± 3SDRange (Min—Max)Scapula Height (mm) [9]15462.4 (124—186)22.3 (143—166)12.9 (148—161)Scapula Width (mm) [9]10440.6 (83.2—124)11.7 (97.8—109)6.83 (100—107)Scapula Aspect Ratio (Width/Height)0.6720.008 (0.666—0.674)0.174 (0.591—0.764)0.101 (0.623—0.724)Glenoid Height (mm) [9]39.619.0 (30.1–49.0)6.94 (36.1–43.0)1.70 (38.7—40.4)Glenoid Width (mm) [9]29.116.2 (21.0—37.3)2.74 (27.8—30.5)0.388 (29.0—29.3)Acromion Length (mm) [9]45.325.8 (32.5—58.3)19.2 (37.1—56.3)4.11 (44.1—48.2)Lateral Acromion to Glenoid Center (mm) [9]28.511.3 (22.9—34.1)8.01 (24.3—32.3)2.63 (27.2—29.9)Coracoid Tip to Glenoid Center (mm) [9]16.35.77 (13.4—19.2)6.74 (13.0—19.7)5.27 (13.6—18.9)Posterior-Inferior Acromion to Glenoid Center (mm) [9]40.122.4 (29.1—51.5)0.556 (39.6—40.1)12.8 (33.7—46.5)Superior-Anterior Acromion to Glenoid Center (mm) [9]4.940.658 (4.68—5.34)13.1 (4.94–18.0)11.3 (4.94—16.3)Fulcrum Axis (°) [9]94.10.255 (94—94.2)5.65 (91.4—97.1)1.02 (93.5—94.5)Glenoid Inclination Angle (°) [9]97.62.88 (95.8—98.7)1.42 (97.2—98.7)0.305 (97.5—97.8)Glenoid Version Angle (°) [9]94.45.33 (90.9—96.3)24.1 (81.5—106)9.21 (89.7—98.9)Acromial Tilt Angle (°) [9]31.81.64 (30.7—32.3)13.0 (27.7—40.7)10.7 (27.0—37.8)Critical Shoulder Angle (°) [9]29.40.888 (28.8—29.7)3.85 (27.8—31.7)2.29 (28.3—30.6)Superior-Inferior Glenoid-Acromion Angle (°) [9]52.92.80 (52.0—54.8)27.2 (36.4—63.7)19.5 (43.1—62.6)Superior-Inferior Glenoid-Acromion-Coracoid Angle (°) [9]95.31.17 (94.5—95.7)14.3 (88.6—103)8.75 (90.4—99.2)Table 7. Morphologic characteristics across proximal-humerus SSM modesDimensionMeanMode 1 ± 3SDRange (Min—Max)Mode 2 ± 3SDRange (Min—Max)Mode 3 ± 3SDRange (Min—Max)Proximal Humeral Length (mm)15225.3 (140—165)1.90 (152—154)7.36 (149—156)Humeral Shaft Diameter (mm) [12]22.213.0 (15.8—28.7)1.74 (21.3—23.1)2.51 (21.0—23.5)Humeral Head Radius of Rotation (mm) [31]22.99.43 (18—27.4)0.636 (22.7—23.4)3.71 (21.7—25.4)Humeral Head Inclination Angle (°) [32]13935.6 (105—140)1.47 (138—139)23.1 (116—139)Humeral Head Medial Offset (mm) [33]1.752.96 (0.175—3.14)0.672 (1.35—2.02)2.71 (0.831—3.54)Humerus Greater Tuberosity Angle (°) [34]67.51.16 (66.3—67.5)5.45 (63.1—68.5)21.9 (45.5—67.5)Table 8. Morphologic characteristics across combined SSM modesDimensionMeanMode 1 ± 3SDRange (Min—Max)Mode 2 ± 3SDRange (Min—Max)Mode 3 ± 3SDRange (Min—Max)Scapula Height (mm) [9]15462.8 (123—186)7.65 (150—158)19.9 (144—164)Scapula Width (mm) [9]10440.0 (83.6—124)12.1 (97.5—110)8.64 (99.3—108)Scapula Aspect Ratio (Width/Height)0.6720.0138 (0.664—0.678)0.112 (0.617—0.729)0.143 (0.604—0.747)Glenoid Height (mm) [9]39.619.1 (30.0—49.1)2.84 (38.1—41.0)6.42 (36.3—42.7)Glenoid Width (mm) [9]29.116.4 (20.9—37.3)1.80 (28.2—30.0)0.341 (29.0—29.4)Acromion Length (mm) [9]45.326.2 (32.3—58.5)9.73 (41.2—51.0)15.0 (38.4—53.4)Lateral Acromion to Glenoid Center (mm) [9]28.511.3 (22.9—34.2)2.26 (27.2—29.5)8.02 (24.3—32.4)Coracoid Tip to Glenoid Center (mm) [9]16.35.96 (13.3—19.3)3.85 (14.4—18.2)5.42 (13.6—19.0)Posterior-Inferior Acromion to Glenoid Center (mm) [9]40.122.1 (29.2—51.3)3.33 (38.2—41.6)3.47 (38.2—41.7)Superior-Anterior Acromion to Glenoid Center (mm) [9]4.940.441 (4.79—5.23)9.80 (4.41—14.2)9.85 (3.26—13.1)Fulcrum Axis (°) [9]94.10.209 (94.0—94.2)5.74 (91.3—97.1)0.753 (93.8—94.5)Glenoid Inclination Angle (°) [9]97.62.70 (95.9—98.6)2.02 (96.8—98.8)0.439 (97.5—98)Glenoid Version Angle (°) [9]94.45.26 (91.0—96.2)9.41 (89.5—98.9)25.0 (81.5—106)Acromial Tilt Angle (°) [9]31.81.29 (30.9—32.2)7.04 (29.1—36.2)11.1 (27.6—38.7)Critical Shoulder Angle (°) [9]29.40.492 (29.0—29.5)4.93 (27.1—32.0)0.978 (29—29.9)Superior-Inferior Glenoid-Acromion Angle (°) [9]52.91.48 (52.4—53.9)19.8 (42—61.8)18.0 (42.4—60.4)Superior-Inferior Glenoid-Acromion-Coracoid Angle (°) [9]95.30.190 (95.2—95.3)12.8 (89.0—102)6.97 (92.0—99.0)Proximal Humeral Length (mm)15210.3 (148—158)20.0 (143—163)11.2 (147—158)Humeral Shaft Diameter (mm) [12]22.27.84 (18.3—26.1)10.3 (17.1—27.4)2.39 (21—23.4)Humeral Head Radius of Rotation (mm) [31]22.93.99 (21.3—25.3)7.78 (19—26.8)2.06 (21.7—23.8)Humeral Head Inclination Angle (°) [32]13940.4 (98.8—139)10.8 (131—142)29.0 (110—139)Humeral Head Medial Offset (mm) [33]1.751.90 (0.167—2.07)1.54 (0.925—2.46)4.18 (0.906—5.09)Humerus Greater Tuberosity Angle (°) [34]67.58.96 (64.0—73.0)5.14 (62.5—67.6)13.5 (53.9—67.5)

Quantification of coupled scapulohumeral variation

The decomposition of the Combined SSM scapular and proximal-humeral eigenvector components into relative magnitude of Scapula SSM and Proximal-Humerus SSM eigenvector contributions, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{S}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{H}$$\end{document}$ , is provided in Fig. 16. The scapular component decomposition of the Combined SSM more closely approximates an identity matrix compared to the proximal humeral component decomposition. The later Combined SSM component decompositions become progressively diffuse, with a greater number of constituents each contributing less.Fig. 16. Decomposition of the modes of variation of the combined SSM into single-body SSM principal components

The proportion of single-body shape variation captured in each Combined SSM Principal Component, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{v}}}_{C,S,j}$$\end{document}$ and $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{v}}}_{C,H,j}$$\end{document}$ are provided in Fig. 17, along with the cumulative sum of the coupled variation of the two bodies that jointly occurs along Combined SSM Principal Components, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{v}}}_{J,j}$$\end{document}$ .Fig. 17. Single-body variation and cumulative coupled variation in each combined SSM principal component

Prediction of missing counterpart

The relative magnitude of elements of the covariance matrix of the scapular and proximal-humeral contributions to Combined SSM fitted model parameters across the study data set, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{B}$$\end{document}$ , is visualized with the heatmap in Fig. 18; the matrix is sparse with significant magnitudes only occurring in the earlier rows and columns.Fig. 18. Relative magnitude of covariance between combined SSM model parameter single-body contribution decompositions

The results of the missing body prediction test are plotted in Fig. 19. The use of the SSM lowers the median Scapula prediction error magnitude but increases the worst-case errors, compared to the training set average. The Proximal-Humerus SSM decreases the error range compared to the training set average.Fig. 19. Leave-One-Out Cross-Validation for Missing Scapulae and Proximal Humeri. X’s denote test observations, box plots denote observation quartiles and extremes, and white circles denote observation averages

De Novogeneration of synthetic populations

The distribution of plausibilities (p-values) resulting from the shapes generated by the Combined SSM compared to the two independent one-body SSMs is graphed in Fig. 20. The plausibilities generated by the two independent one-body SSMs exhibit two distinct peaks, one at high plausibility and the other at low plausibility.Fig. 20. Combined SSM and One-Body SSM Model Generated Shape Plausibility. The width of the shaded area depicts the density of shape instances with that plausibility, and the dashed lines represent plausibility quartiles for each sample group

The KS Test for non-uniformity of the shape plausibilities (where p < .05 suggests a sampling from a uniform distribution) results in p $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$< .001$$\end{document}$ for the shapes generated by the Combined SSM and p = $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$.585$$\end{document}$ for the shapes generated from the two independent one-body SSMs.

Discussion

The analysis in the following subsections validates the independent single-body SSMs, as their performance aligns with previously reported models. Although no prior two-body SSM exists for direct comparison, the Combined SSM exhibits intuitive, explainable behavior relative to the independent single-body SSMs. Similarly, the primary modes of variation identified by the single-body SSMs are consistent with those documented in the literature. Notably, Mode 3 of the Combined SSM captures the pathological variation of the joint by encoding the covariation between glenoid and humeral head erosion. In fact, the Combined SSM demonstrates that 43.2% of the total scapular and proximal-humeral shape variation occurs in tandem. This coupled variation enables the Combined SSM to improve missing counterpart predictions, achieving over a 10% reduction in median error and a more than 9% reduction in error interquartile range for missing scapulae and proximal humeri, respectively. Conversely, the use of two independent single-body models for de novo synthetic population generation introduces bias, producing both implausible (mismatched) shape pairs and insufficient diversity within the generated population. We envision the novel combined SSM contributing to surgical decision-making and planning by: enhancing evidence-based selection of optimal RTSA prosthesis configuration and placement, providing a new quantitative measure of the severity of arthropathy, guiding complex reconstruction in the setting of comminuted scapular or humeral fractures, and automating the radiographic identification of glenohumeral arthropathy.

Subject 3D volume representations

Notable variability is observed in the extent of glenoid erosion among the scapulae and head erosion in the humeri. These findings align with the anticipated heterogeneity of the clinical population, which comprises shoulder arthritis patients undergoing RTSA. The varying degrees of erosion reflect the diversity in disease progression and severity across the study cohort, reinforcing the importance of capturing such anatomical variation for accurate statistical shape modeling.

Generated statistical shape models

The mean shapes of both the scapula and humerus demonstrate acceptable correspondence point dispersion, effectively capturing the typical morphology of bony anatomy. This uniform distribution of points ensures an accurate representation of shape variations within the study population. There are no apparent incongruities in the point clouds, which would manifest as sharp disruptions or irregularities in the otherwise smooth 3D mesh formed between the points. The absence of such inconsistencies is suggestive of successful segmentation and correspondence optimization processes.

Model descriptive ground truth accuracy

The Hausdorff metric is consistently greater than the average point distance across all cases, as expected since it serves as a worst-case distance measure. This metric highlights the outermost variations in shape representation, complementing the average point distance for assessing model accuracy.

All three SSMs significantly outperform the Training Set Mean predictions in terms of both error central tendency and dispersion. For example, the Combined SSM achieves an average point distance with median 1.13 mm (IQR 0.239 mm), compared to the Training Set Mean, which exhibits a median of 3.34 mm (IQR 1.32 mm). The difference in average point distance distributions between the SSM and the Training Set Mean is a direct measure of the descriptive benefit of the shape variation captured by the principal components of the SSMs.

The average point distance errors and Hausdorff metrics observed for the Scapula and Proximal-Humerus SSMs are within the range reported in the literature (Scapula mean RMS errors reported between 1.7 and 2.5 mm, maximum Hausdorff metrics reported between 5.5 and 18 mm; Humerus mean RMS errors reported between 0.48 and 2.1 mm, maximum Hausdorff Metrics reported between 4 and 5.5 mm) [11–15, 29]. The Proximal-Humerus SSM yields reduced errors compared to the Scapula SSM as expected given the higher complexity of scapular morphology. The Proximal-Humerus SSM achieves a Hausdorff metric distribution with median 2.71 mm (IQR 0.452 mm), while the Scapula SSM distribution has median 5.05 mm (IQR 1.42 mm).

The Combined SSM performs comparably to the Scapula SSM but demonstrates slightly higher central tendency errors with reduced dispersion. For example, the Combined SSM has an average point distance distribution with median 1.13 mm (IQR 0.239 mm), whereas the Scapula SSM achieves an error distribution with median 1.08 mm (IQR 0.359 mm).

Interestingly, an alternative and simpler independent two-body shape model, defined by the concatenation of the two independent single-body models, would yield error metrics that are averages of the two constituent model results, weighted by the number of points in their respective point clouds. In this case, the independent two-body model achieves an average point distance distribution with a median of 0.861 mm (IQR 0.553 mm). This outcome arises directly from the significant disparity in shape complexity between the scapula and proximal humerus, as well as the reduced constraints of the independent model, which does not need to account for coupled variations between the two bodies. In descriptive applications where the shapes of both bones are already provided, the increased complexity of the Combined SSM offers no additional value, as the coupled shape variations are intrinsic to the presented bodies and need not be predicted.

Model generalisation ability: number of training samples

As was the case for Descriptive Ground Truth Accuracy, the Proximal-Humerus SSM outperforms the Scapula SSM in terms of absolute error by virtue of the simpler bone shape being modelled. However, the single-body SSMs perform similarly in terms of Generalization Ability by samples in reaching a decrease in mean average point distance of less than 0.1 mm per additional training set sample after only 10 inputs (Proximal-Humerus SSM) and 11 inputs (Scapula SSM). The convergence of Scapular SSM Generalisation across the number of training samples has previously been reported to occur within 23 inputs [9]; there is no published Generalisation ability data for proximal humerus or combined two-body models for comparison purposes. The Combined SSM results are more sensitive to random variation and reach the same diminishing return per additional sample after 18 inputs. At lower training set sizes, where the description accuracy disparity between the Proximal Humerus and Scapula is widest, the Combined SSM outperforms the Scapula SSM.

Model generalisation ability: number of principal components

Generalisation Ability when varying the number of principal components included in each SSM is similar to that when varying the number of training samples except for the increased efficiency when using all training samples and varying the included number of modes. Also note that the Combined SSM performance moves in tandem with that of the Scapula SSM, rather than converging with the Scapula SSM, as was the case for Generalisation Ability when varying the number of training samples. In this case the Proximal-Humeral SSM reaches diminishing returns with less than 0.1 mm decrease in average point distance mean after 6 modes, the Scapula SSM after 7 modes, and the Combined SSM after 12 modes. The convergence of Scapular SSM Generalisation across the number of principal components has previously been reported to occur within 5 modes [9]; there is no published Generalisation ability data for proximal humerus or combined two-body models for comparison purposes.

Model specificity

The comparison of model Specificities is similar to that of Generalisation abilities; the Proximal-Humerus SSM, by virtue of the simpler modeled shape, exhibits lesser training point distances than the more complex Scapula SSM, and the Combined SSM Specificity lies in between. Scapula SSM Specificity with 7 principal components, measured as mean RMS point distance to closest training set member, has previously been reported to be 3.79 mm [9]; in comparison, we report a mean average Euclidean training set point distance of 2.46 mm. There is no published Specificity data for proximal humerus or combined two-body models for comparison purposes.

Model compactness

At least 90% of cumulative shape variation of the Proximal-Humerus and Scapula SSMs are described by the first 6 and 10 modes of variation respectively. The Proximal-Humerus SSM is less compact than those reported in the literature (requiring 4 [13], and 5 [12] modes to capture 90% cumulative shape variation), most likely owing to the pathologic shape variation that is additionally described by the Proximal-Humerus SSM whereas the others describe healthy populations. Note that typical full humerus SSMs [14, 15, 29] exhibit high compactness by virtue of the shape variance being dominated by humeral length, which is expressed in the first mode of variation, but the proximal humerus does not exhibit the same degree of length variation. The Scapula SSM compactness is within the range of that described in the literature with the number of components required to capture at least 90% cumulative shape variation reported between 5 and 11 modes [9, 14, 15, 29]. There is no published compactness data available for direct comparison with the Combined SSM. However, it is reasonable for the Combined SSM to exhibit lower compactness than both the Scapula and Proximal-Humerus SSMs, requiring a number of principal components to describe at least 90% of the cumulative shape variation that falls between that of the Scapula SSM (10) and the sum of those required by the Proximal-Humerus and Scapula SSMs (6 + 10 = 16). This is due to the relative shape complexity of the combination of both anatomical structures together, and the expectation of some degree of coupled variation between the two shapes.

SSM modes of variation

The primary modes of variation have direct anatomical measurement interpretations, indicating that the models effectively capture meaningful features and that patients undergoing RTSA have measurable characteristic scapulohumeral bony anatomy shape variation.

The primary modes of variation observed in the Scapula SSM are; Mode 1: Overall Scaling, Mode 2: Aspect Ratio (medial–lateral width with commensurately decreased superior-inferior height) and Acromial Slope [3], Mode 3: Posterior Glenoid Erosion negatively correlated with Lateral Acromial Tip Projection, Mode 4: Acromial Tilt [3] and Posterior Coracoid Tip Shift, Mode 5: Inferior Glenoid Erosion negatively correlated with Posterior Acromial Shift, and Mode 6: Posterior Glenoid Erosion with Superior Spine Shift. The primary modes of variation observed in the Scapula SSM, compared to an excerpt from the literature, are presented in Table 9. Table 9. Comparison of scapula ssm primary modes of variation**SourceThis StudySharif-Ahmadian et al. **[9]Soltanmohammadi et al. [14]Casier et al. [10]PopulationRTSA PatientsWalch B2 and B3HealthyHealthyMode 1Overall ScalingHomogeneous SizeOverall ScalingSizeMode 2Aspect RatioScapular BreadthRotation of Medial and Inferior BordersCoraco-acromial RotationMode 3Glenoid Erosion,Acromial TipGlenoid ErosionSuperior-Inferior Length,Coracoacromial ElongationAcromial ShapeMode 4Acromial Tilt,Coracoid Tip ShiftAcromial Tip Shift,Coracoid ShiftInfero-lateral Elongation,Coraco-acromial TiltSpine ShapeMode 5Glenoid Erosion,Acromial ShiftAcromial ShiftCoracoid Length,Acromial TiltAcromial OverhangMode 6Glenoid Erosion,Spine ShiftAcromial ShiftMedial Border Tilt,Coracoid Tilt

The primary modes of variation observed in the Proximal-Humerus SSM are: Mode 1: Overall Scaling, Mode 2: Lesser Tuberosity Prominence, Mode 3: Inferior Head Erosion negatively correlated with Greater Tuberosity Prominence, Mode 4: Medial Offset, Mode 5: Lateral Bowing, and Mode 6: Posterior Offset. The primary modes of variation observed in the Proximal-Humeral SSM, compared to an excerpt from the literature, are presented in Table 10. Table 10. Comparison of proximal-humerus ssm primary modes of variationSourceThis StudySintini et al. [12]**Casier et al. **[10]**Sade **[13]PopulationRTSA PatientsHealthyHealthyHealthyHumerus PortionProximalProximalFullProximalMode 1Overall ScalingUniform ScalingOverall ScalingScalingMode 2Lesser TuberosityHead Shape, Inclination, Medial Offset Axial Elongation,Head Medial RotationLesser TuberosityMode 3Head Erosion,Greater TuberosityGreater Tuberosity,Surgical NeckGirthGreater Tuberosity,Medial Neck CurvatureMode 4Medial OffsetGreater Tuberosity,Surgical NeckRetroversionMode 5Lateral BowingHead Shape,Neck OrientationLateral Bowing of ShaftMode 6Posterior Offset Posterior Offset, Greater Tuberosity, Surgical Neck

The primary modes of variation observed in the Combined SSM are: Mode 1: Overall Scale of both the Scapula and Proximal Humerus and Humeral Head Inclination, Mode 2: Scapular Fulcrum Axis, Acromial Anterior Projection and Humeral Shaft Diameter and Head Radius of Rotation, Mode 3: Scapular Aspect Ratio, Posterior Glenoid Erosion and Version, Acromial Tilt, Inferior Angle Prominence, with Humeral Head Inclination, Erosion, Medial Offset, and Greater Tuberosity Angle.

Quantification of coupled scapulohumeral variation

The scapular contributions to Combined SSM eigenvectors, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{S}$$\end{document}$ , are more ordered and less dispersed than those of the proximal humerus, $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{H}$$\end{document}$ . This is because the training set scapulae exhibit greater absolute variation than the proximal humeri. As a result, the scapular variation predominates and dictates the ordering of the Combined SSM eigenvectors, with the proximal humerus variation incorporated into each Combined SSM eigenvector based on its coupling with the scapular variation. However, the Combined SSM eigenvectors do not match the Scapula SSM eigenvectors in lockstep (and so $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\varvec{A}}}_{S}$$\end{document}$ does not form a perfect diagonal matrix) as each Scapula SSM eigenvector can be coupled to multiple Proximal-Humerus SSM eigenvectors independently; a separate Combined SSM eigenvector is required to describe each independent Scapula-Proximal-Humerus eigenvector coupling. For example, the second Combined SSM eigenvector (represented by the second row of both heatmaps Fig. 16) comprises the second Scapular SSM mode of variation coupled with the first mode of variation of the Proximal-Humeral SSM. The third Combined SSM eigenvector (i.e. the third row of both heatmaps Fig. 16) also includes the second mode of variation from the Scapular SSM but is coupled with the second mode of variation of the Proximal-Humeral SSM.

This smearing of scapular variation that is captured in one principal component of the Scapula SSM across multiple principal components of the Combined SSM is what makes the Combined SSM comparatively less compact. The coupled shape variation between the scapula and proximal humerus is significant, with up to 6.21% of total study population variation coupled in a single Combined SSM eigenvector (Mode 3) and 43.2% of total variation coupled across all Combined SSM modes of variation. This significant coupling underscores the necessity for a combined two-body SSM in generative applications, as it reveals the violation of the assumption of shape independence that is inherent in the use of separate one-body SSMs.

Prediction of missing counterpart

The covariance matrix between scapular and proximal-humeral contributions to Combined SSM model parameters is clearly sparse, but does indicate linear correlation between the earlier scapular contributions across most proximal-humeral contributions, and vice versa.

The relationship between scapular and proximal-humeral model parameters in the Combined SSM is successfully exploited in the missing body predictions with scapular average point distance error distribution median of 3.83 mm (IQR: 2.07 mm) compared to the Training Set Mean’s median of 4.26 mm (IQR: 1.64 mm) — a reduction in median error by more than 10%. Recall that Ground Truth Accuracy (Fig. 6) is a measure of an SSM’s accuracy in describing a shape that it is presented with, serving as a measure of the best possible prediction achievable (median Scapula SSM average point error of 1.08 mm). Under this interpretation, the Combined SSM improves upon the naive Training Set Mean prediction by 13.8% of the total possible prediction error reduction. This is a surprising display of bootstrapping whereby the relatively simple proximal humerus meaningfully enhances the prediction of the significantly more complex scapula.

The proximal humeri, by virtue of their relative simplicity and lesser variation, are generally more similar to the mean shape across the training set than were the scapulae. For this reason, the Combined SSM does not meaningfully alter median average prediction error from that of the Training Set Mean. However, the Combined SSM does reduce the prediction error interquartile range by over 9% with median average point distance error of 2.24 mm (IQR: 1.13 mm) compared to 2.21 mm (IQR: 1.24 mm) for the Training Set Mean.

To provide clinical context for the missing body prediction errors, arthroplasty glenoid implant pegs can have a diameter of approximately 3 mm [54], and glenoid implants are usually placed within 3 mm of the intended location when using patient-specific targeting guides [54, 55]. Implants are considered malpositioned if placed more than 4 mm from the intended location [56]. Therefore, use of the Combined SSM for missing body prediction reduces median error to a level comparable with acceptable glenoid implant placement accuracy.

De Novo generation of synthetic populations

Noting that the p-value, or plausibility, distribution should be uniform for an unbiased shape synthesis process, we see that the distribution formed by the Combined SSM-generated shapes is approximately uniform with near perfect quartiles. Oppositely, the p-value distribution formed by shapes generated by combining the one-body SSMs is biased with excessive high p-value (highly plausible) shapes, resulting in median p-value of 0.55 and 75-th percentile of .82, but also demonstrating a large proportion of implausible shapes (i.e. low p-values).

The intuitive physical interpretation of the skewed distribution is that generating scapula-proximal-humerus shape pairs using two independent SSMs produces too many synthetic shapes that are too similar to the mean shape of the training set. In comparison, the Combined SSM generates a set of synthetic shapes that is more evenly distributed across the entire range of possibilities. More diverse synthetic shapes are more likely under the Combined SSM because its individual eigenvectors encode (coupled) shape variation of both bodies such that a single extreme model parameter $\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${b}_{C,i}$$\end{document}$ begets significant variation in both the scapula and the proximal-humerus shapes. Randomly drawing a single, extreme, model parameter for the Combined SSM is more probable than drawing two separate extreme parameters—one for each of the independent SSMs—as would be necessary to generate a comparable synthetic shape pair under the independent SSM framework. This result is striking because intuitively we would expect two independent one-body models to produce mismatched, and thus low-plausibility, scapula-proximal-humerus pairs; while there is clearly a cluster of low-plausibility shapes that create a relative low-plausibility peak in the distribution for the two SSM-synthesized shapes, it is the aforementioned high-plausibility effect that dominates.

The Kolmogorov–Smirnov Test results quantifiably support the conclusion that the Combined SSM shape plausibilities are correctly distributed, and that those from the two independent one-body models are not. This means that the use of two independent one-body SSMs in generative applications, such as for computational studies of virtual surgeries [6, 57, 58], would introduce population bias that violates the underpinning assumptions of the work — their outputs are either too similar to the training set mean, or otherwise too implausible due to counterpart mismatch (ignored coupled variation), and therefore do not accurately capture the expected variation of shapes across a real clinical population.

Prospective influence on surgical decision-making and planning

The combined two-body SSM technology exemplified in this study facilitates enhanced evidence-based RTSA design parameter selection and prosthesis placement. Similarly, it informs the design of patient-specific instrumentation to improve component positioning and fixation [59]. Traditional biomechanical analysis of the shoulder has been limited by either small sample sizes—due to the labor-intensive nature of manual segmentation—or the need to assume a population-averaged humeral shape when analyzing scapular anatomy. These constraints introduce bias and reduce the fidelity of joint-level analysis, particularly in patients with atypical or pathologic anatomy. In contrast, the combined two-body SSM developed in this study enables comprehensive and anatomically grounded biomechanical evaluation through the synthesis of coupled scapular and humeral geometries. This model supports next-generation computational analyses capable of informing optimal glenosphere sizing, humeral offset and inclination angle, component lateralization, and implant positioning strategies for RTSA. Such simulations can be scaled to large synthetic cohorts, enabling robust design evaluations that reflect real-world anatomic variation [6, 9].

The developed SSM also offers a novel method to quantify disease severity, offering an objective basis for triage and supporting the identification of surgical indications in clinical decision-making. Mode 3 of the combined two-body SSM encodes the coupled pattern of glenoid erosion and humeral head erosion commonly observed in arthropathic shoulders. Leveraging this, a future study could investigate the appropriateness of using the z-score (the number of standard deviations from the mean) of a shape pair along Mode 3 of the SSM as a novel quantification of disease severity that compactly describes multiple simultaneous dimensions of pathologic shape variation. This score serves as a novel unified metric that simultaneously describes glenoid retroversion, glenoid erosion, and humeral head erosion; in the absence of this composite score, multiple individual measurements are needed, complicating both interpretation and the ability to systematically rank subjects along this multidimensional variation. This score could be integrated into artificial intelligence-powered clinical decision support systems to assist in triaging patients, informing thresholds for surgical intervention, or tracking radiographic disease progression. Moreover, the model’s ability to quantify anatomy along a continuum of degeneration may serve as a foundation for predictive algorithms assessing future joint deterioration or likelihood of requiring arthroplasty.

Another area of significant potential application for this combined two-body SSM is in guiding complex reconstruction in the setting of comminuted fractures. In cases of severe trauma, with comminuted proximal humeral or scapular fractures, the reconstruction of the original anatomy is often impeded by incomplete bony structures. The combined two-body SSM provides a powerful tool for generating plausible patient-specific pre-injury geometry by statistically inferring missing anatomy based on intact bone structures. This functionality is particularly useful for designing patient-specific fixation strategies or implants in cases where inference of exact original anatomy is not feasible. The model's coupled variation ensures anatomical coherence between reconstructed segments, thereby improving the accuracy and functional outcomes of complex shoulder reconstructions. Similar AI-assisted preoperative planning in the context of total hip arthroplasty has already demonstrated superior lower-limb length restoration compared to traditional methods [60].

In addition to its utility in planning and reconstruction, the combined two-body SSM may also support automated radiographic identification of glenohumeral arthropathy. By capturing the hallmark features of posterior glenoid wear and associated humeral head erosion within its coupled modes of variation, the model offers a data-driven basis for classifying patients according to established arthritic subtypes. Integrated into a radiographic workflow, this could enable the automated flagging of arthropathic morphology on standard shoulder imaging, reducing diagnostic variability and expediting the clinical decision-making process. Such automation aligns with recent advances in musculoskeletal imaging, where deep learning tools have demonstrated performance comparable to orthopaedic surgeons and radiologists [61] in detecting fractures [62–64], dislocations [64], osteoarthritis [64], and rotator cuff degeneration [65, 66].

Limitations

This study is limited by the small sample size of only 45 patients. However, the analysis of the model’s Generalization ability across different training set sizes revealed no significant improvement in performance beyond 18 training samples for any SSM. The use of manual segmentation and measurement of underlying CT imaging data by a single investigator, though reviewed by a second investigator, introduces potential for human error and variability.

Conclusions

This study presents the first statistical shape models of the proximal humerus and the combined two-body glenohumeral joint derived from a clinically pathologic population. The interrelationship between scapula and proximal humerus geometry is effectively captured by the Combined SSM, enabling enhanced missing body predictions in scenarios where complete imaging is unavailable. Notably, Mode 3 of the Combined model uniquely characterizes pathologic glenohumeral variation, offering potential applications in early automated disease detection, disease progression prediction, pathogenetic analysis, and the generation of pathologic synthetic populations. The necessity of combined two-body models for generating unbiased synthetic glenohumeral populations is clearly demonstrated, positioning this model as a valuable tool for advancing computational studies in shoulder biomechanics and surgical planning.

Bibliography18

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Cootes TF, Taylor CJ. Statistical models of appearance for medical image analysis and computer vision. In: Medical Imaging 2001: Image Processing. SPIE; 2001. p. 236–48. Available from: https://www.spiedigitallibrary.org/conference-proceedings-of-spie/4322/0000/Statistical-models-of-appearance-for-medical-image-analysis-and-computer/. Cited 2024 Dec 20.
2Silvestros P, Athwal GS, Giles JW. Scapular morphology variation affects reverse total shoulder arthroplasty biomechanics. A predictive simulation study using statistical and musculoskeletal shoulder models. J Orthopaedic Res. 2024;42(7):1383–98.10.1002/jor.2580138341683 · doi ↗ · pubmed ↗
3Pearl ML. Proximal humeral anatomy in shoulder arthroplasty: Implications for prosthetic design and surgical technique. J Shoulder Elbow Surg. 2005;14(1 Suppl S):99S-104S.10.1016/j.jse.2004.09.02515726095 · doi ↗ · pubmed ↗
4Salhi A. Towards a combined statistical shape and musculoskeletal modeling framework for pediatric shoulder joint. Ecole nationale supérieure Mines-Télécom Atlantique; 2019. Available from: https://theses.hal.science/tel-02414431. Cited 2024 Dec 19.
5Sade P. Statistical Shape Modeling to Quantify Variation in the Proximal Humeral Anatomy. Electronic Theses and Dissertations. 2017 Jan 1; Available from: https://digitalcommons.du.edu/etd/1289.
6Inyang AO, Fouefack JR, Sivarasu S, Roche S, Borotikar B, Burdin V, et al. Assessment of 3D morphological characteristics of the shoulder bones using statistical shape modeling: Prospective application to handedness. In: 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2017. p. 1629–32. Available from: https://ieeexplore.ieee.org/abstract/document/8037151. Cited 2024 Dec 19.10.1109/EMBC.2017.803715129060195 · doi ↗ · pubmed ↗
7Fouefack JR, Dardenne G, Borotikar B, Mutsvangwa T, Burdin V. 3D reconstruction of joints from partial data using multi-object-based model: Towards a patient-specific knee implant design. In: Proceedings of The 20th Annual Meeting of the International Society for Computer Assisted Orthopaedic Surgery. 2022. p. 69–72. Available from: https://easychair.org/publications/paper/lxrz. Cited 2024 Dec 19.
8Fouefack JR. Towards a framework for multi class statistical modelling of shape, intensity, and kinematics in medical images. Ecole nationale supérieure Mines-Télécom Atlantique ; University of Cape Town. Faculty of Health Sciences; 2021. Available from: https://theses.hal.science/tel-03506272. Cited 2024 Dec 20.