Robust automated method of spatial resolution measurement in radiotherapy CT simulation images

Pavel Govyadinov; Rick. R. Layman; Tucker Netherton; Raymond Mumme; Aaron. K. Jones; Laurence. E. Court; Moiz Ahmad

PMC · DOI:10.1002/acm2.70006·February 13, 2025

Robust automated method of spatial resolution measurement in radiotherapy CT simulation images

Pavel Govyadinov, Rick. R. Layman, Tucker Netherton, Raymond Mumme, Aaron. K. Jones, Laurence. E. Court, Moiz Ahmad

PDF

Open Access

TL;DR

This paper introduces an improved automated method for measuring spatial resolution in CT images used for radiotherapy planning, making it more reliable across different conditions.

Contribution

The novel contribution is a modified algorithm that improves robustness to patient positioning, CT protocols, and artifacts.

Findings

01

The modified algorithm showed lower RMS error (3.8%) compared to the reference algorithm (10.3%) in cylindrical phantom datasets.

02

In anthropomorphic phantoms, the modified algorithm had smaller percent differences from ground truth (4.9% and 3.9%) compared to the reference algorithm (8.4% and 30.0%).

03

The modified algorithm estimated clinical dataset values closer to ground truth (0.28 mm−1) than the reference algorithm.

Abstract

Variation in imaging protocol, patient positioning, and the presence of artifacts can vary image quality in CT images used for radiotherapy planning. Automated methods for spatial resolution (SR) estimation exist but require further investigation and validation for wider adoption. To validated previously existing algorithm for SR estimation and introduce improvements that make it robust to patient positioning, CT protocol, site, and artifacts. A reference algorithm based on the previous gold standard was recreated and modified to improve robustness. The algorithms were tested on three different datasets: (1) a cylindrical ACR CT QC phantom scanned using a Siemens SOMATOM Definition Edge scanner and reconstructed using 61 different kernels, (2) a set of anthropomorphic phantoms scanned with the presence of artifacts common to clinical acquisitions such as blankets and immobilization…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Species1

Homo sapiens(human · species)

Diseases1

HN

Figures7

Click any figure to enlarge with its caption.

Visual description of the algorithm. Each individual ESF curve is tail‐replaced (a), centered (b, c). The curves that pass the filters are then binned and conditioned to generate the oESF curve (d), a derivative LSF (e) and an fft is applied in order to generate the RI (f). The ESFs and all further derivatives are for extracted from a spine patient in the clinical data presented in this paper.

Figure showing the immobilization device problem and our implemented solution. When an immobilization device is present (a), many samples pass through the mask elements which is a part of segmentation (b). To improve the accuracy, we limit the sampling area in the HN protocol to the lateral regions of the head and around the ear (c), where the mask makes less contact with the skin and the mask cells are larger. The arrow in (c) designates samples in the ear region that have passed the in‐plane filter and will be processed by the pre‐centering and post‐centering filters.

Accuracy of the modified and reference resolution measurement algorithms over a range of image reconstruction kernels with varying spatial resolution (modulation transfer function f50 value). Mean error (solid) and ±2STD (dashed) lines are also plotted. The typical f50 for simulation performed with field of view of 50 cm is labeled on the plot in green.

Box plots showing the results of the patient data ran on nine HN scan patients (a) and 10 pelvis/spine scan patients (b). With all the parameters matching between each patient in each group and the respective anthropomorphic phantom, the ground truth (GT) was taken to be at the value computed using the Friedman phantom; represented in orange in each plot.

A visual representation of the patient testing sets for HN (a) and pelvis/spine scans (b). The volumetric data as well as the mesh generated using Otsu thresholding and marching cubes (in white) is visualized for both patient sets. Head and Neck meshes are shown post‐table crop, but prior to any additional cropping, and faces have been blurred due to privacy concerns. A table is provided underneath each set denoting the spatial resolution f50 value generated using the reference algorithm (top row) and the modified algorithm (bottom row).

A comparison of the oESF from the modified and reference algorithms along with the first 1 million samples from the individual ESF points before binning, extracted from a head neck patient. Both oESF's have been conditioned. The reference algorithm tends to accept a large number of curves that pass through the mask, passing the clothing filter, and distorting the final oESF leading to inaccuracies in the SR readings in HN patients.

An example of a mesh with (a) and without (b) smoothing both with insets of approximately the same area around the ear. The smoothing operation reduces the size of the individual mesh elements by ≈15%, resulting in a proportional increase of the number of samples. This also results in a smaller inter‐sample distance, but minimizes the number of jagged cubic faces resulting in overall improvement in sample quality.

Tables3

TABLE 1. Table describing the filtering performed on each individual ESF curve.

Pre‐centering
In air artifact media filtering	$\sum_{x_{0}}^{x_{a_{s} .}} Δ E_{m} (x) < - 100$	Detects extraneous objects in the air tail of each individual curve
Ancillary region filtering	$\sum_{x_{a_{e}}}^{x_{N}} Δ E_{m} (x) < - 300$	Detects air‐tissue‐air and similar transitions as well as noisy ESF curves in sharper kernel reconstructions
Transition region check	$(\begin{matrix} \begin{matrix} x_{n} > N * 0.2 \\ x_{n} < N * 0.8 \end{matrix} \\ E_{m} (x_{n}) < E_{m} (x_{n + 1}) \end{matrix}) w h e r e x_{n} \in X_{t r a n s i t i o n}$	Checks whether the transition region is roughly in the center of the entire ESF based on the number of samples N and the transition region is strictly increasing.
Clothing*	$E_{m} (x_{a i r}) < - 950$	Detects clothing (in close contact with skin??)
In‐plane normal*	$\cos^{- 1} (\frac{\vec{x}}{\| x \|} \times \vec{xy}) < 10^{\circ}$	Detects whether the Bresenham sample passes between slices

TABLE 2. A table describing the each ESF inclusion filter performed on each trimmed and centered ESF curve.

Post‐centering
Bounding box	$\begin{matrix} {X_{m}}^{'} \in [- 20, 20] \\ E_{m}^{'} \in [- 600, 600] \end{matrix}$	Checks whether the a,b,c,d coefficients have been calculated correctly
Max‐Min curve definition	$\max (E_{m}^{'}) - \min (E_{m}^{'}) \in$ [900, 1200]	Checks whether the air‐to‐skin intensity difference is within the expected range.

TABLE 3. An in‐depth analysis of the effects of each filter on the f50 estimation.

Keywords

CTimage reconstructionimage quality assessmentspatial resolutionquality control

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Radiotherapy Techniques · Radiation Dose and Imaging · Medical Imaging Techniques and Applications

Full text

INTRODUCTION

1

Computed tomography (CT) image quality characteristics, such as spatial resolution (SR), noise, and contrast, are generally characterized using phantoms as part of quality control (QC). Clinical image quality is inferred from the QC results. However, both the imaged subject and imaging parameters in phantom‐based image quality tests may differ from clinical imaging in ways that can affect clinical applications, such as auto‐contouring in radiation therapy.1, 2 Variability in imaging protocols further confound the issue resulting in statistically significant variations in image quality among imaging facilities.3 Moreover, many clinics may not have the phantoms, personnel, and resources to perform QC measurements of certain image quality metrics, exacerbated by a lack of a unified image acquisition standard.4, 5, 6 There is considerable interest in remote automated treatment planning especially to assist radiation oncology in low and middle‐income countries (LMICs). In such setting, quality control on acquired CT simulation images at a remote LMIC site is a significant and challenging problem. For instance, automated treatment planning may use automated anatomical segmentation, and the quality of the segmentation may depend on the quality of CT simulation images.1, 7 Therefore, there is a need to develop automated QC and standards to assure consistent and high‐quality clinical images required for clinical applications such as auto‐contouring in radiation therapy. Currently there are a multitude of automated methods available for calculating SR with high precision using cylindrical phantoms8, 9, 10 and a typical QC process for CT scanners involves regularly performing phantom tests to ascertain if the device is performing within a tolerance range. Such QC, however, assumes ideal imaging conditions that may not account for subject‐specific variations. Nor is it practical to test a multitude of clinically used imaging protocols with phantom tests.

A method of measuring the SR metric of modulation transfer function (MTF) directly in clinical CT images has been previously presented11 and extended to radiologist preference12 and mathematical observer.13 The method measures the sharpness of the skin‐air interface within CT images. Although successfully applied to diagnostic CT examinations, the unique challenges associated with radiotherapy CT simulation has not been reported. Application to CT simulation of radiotherapy poses unique challenges. Specifically, the presence of immobilization devices common to radiotherapy may interfere with the assessment of the skin‐air interface and produce erroneous measurements.

The purpose of this study was to improve and validate automated techniques for evaluating the SR of clinical CT imaging, paying special attention to robustness over patient setup commonly used in radiotherapy CT simulations. For instance, clinical data are subject to a multitude of artifacts not present in ideal, uniform phantoms, such as from clothing, immobilization devices, and pacemakers. An update of a previously published reference method of Sanders et al. was implemented and tested in various phantoms and clinical radiotherapy CT simulation image sets. This work provides several important algorithmic contributions that robustly filter out invalid samples of the patient skin to air edge‐spread function. These contributions are shown to improve resolution measurement accuracy over the reference method in challenging clinical conditions, including attenuating external markers and immobilization devices. In addition, the robustness of the method across widely varying image reconstruction filters was demonstrated on a phantom.

MATERIALS AND METHODS

2

A brief review of the reference method of Sanders et al. is provided. Given a set of CT images, the reference method first segments the patient from the external space (air, CT table) using Otsu thresholding.14 The volume is meshed using a tetrahedral volumetric mesh and the air–skin interface is designated as the tetrahedrons on the surface of the mesh. The edge spread function (ESF) of the skin‐air interface is sampled at a multitude of points along the surface mesh. A tail‐replacement procedure is used to correct varying skin image intensity with depth; the resulting ESFs are normalized, centered, and aggregated into a global ESF. A regularization procedure smooths the ESF and eliminates spurious signals. Finally, a Fourier transform of the regularized global ESF produces a global modulation transfer function (MTF) characterizing the CT scan resolution.

The reference method was implemented with key modifications. The meshing strategy was updated to use marching cubes15 to generate a triangular mesh and extract the air‐to‐skin interface. We implemented additional computationally low‐cost filters to remove invalid individual ESF curves. We detected extraneous objects such as immobilization masks in the ESF tail. Additionally, during the logistic function fit we implemented simple thresholding‐based detection to remove the incorrect ESF curves. Additionally, we trim the tails directly in the globally aggregate oversampled ESF curves since the tails hold no information about the transition through air‐to‐skin.

Algorithm description

2.1

Given a set of CT images, the algorithm extracts the air–skin interface mesh using the marching cubes on an Otsu mask generated from the CT dataset. Due to the nature of marching cubes, each element of the resulting mesh is smaller or equal to the size of the voxel. This mesh is then over‐sampled in head and neck (HN) patients to increase the number of ESF samples. Perpendicular Bresenham lines16 are generated from the center of each triangle mesh component. Image pixel values are sampled along the Bresenham lines without the use of interpolation to generate a set of individual edge spread function (ESF) curves. The distance of each pixel in the ESF from the skin‐air interface was calculated using a kdtree, identical to Sanders et al.

Each ESF curve undergoes a tail replacement and centering operations, and a set of filters is applied to remove invalid ESF curves. The remaining ESF curves are aggregated using binning and regularized to generate a single oversampled ESF (oESF). From the oESF, the algorithm computes a line spread function and modulation transfer function (MTF). Finally, we return the calculated f50 value of the MTF. The pseudo‐code for the algorithm is provided in Algorithm 1 and a visual example is presented in Figure 1:

Visual description of the algorithm. Each individual ESF curve is tail‐replaced (a), centered (b, c). The curves that pass the filters are then binned and conditioned to generate the oESF curve (d), a derivative LSF (e) and an fft is applied in order to generate the RI (f). The ESFs and all further derivatives are for extracted from a spine patient in the clinical data presented in this paper.

ALGORITHM 1Modified spatial resolution approximation

Input: A stack of CT images in DICOM format C with voxel spacing s_x_, s_y,_ s_z_:M = GenerateOtsuMask(C)(generated a masked volume)Surface = MarchingCubes(M)(generate a surface mesh) for every triangle t in Surface do l = BresenhamLine (normal to t)(generate Bresenham line)ESF = SampleVolume(C, l)(sample I intersect C)ESF’ = precentering_filters(ESF)(apply pre‐centering filters)ESF_centered_ = center(ESF^'^)ESF’centered = postcentering_filters(ESF_centered_)(apply post‐centering filters) end oESF = binning(0.1*s_x_)(generate oESF)oESF’ = isotonicRegression(oESF)(apply smoothing)LSF = ∆oESF’(calculate LSF)MTF = fft(LSF)(calculate MTF) Return calculated MTF(f50)(calculated f50)John Wiley & Sons, Ltd.

Mesh generation

2.2

Mesh generation is done using pyvista, a pythonic Visualization Toolkit implementation.17 The meshes are smoothed to deal with larger voxel sizes. Smoothing is performed using the Taubin algorithm to remove high curvature variations while preserving the total volume.18 We perform 100 iterations of the algorithm with a band pass filter coefficient of 0.05.

In all cases, the dataset is truncated. We crop 10 mm at both longitudinal ends of the image volume to avoid calculation errors from insufficient out of axial plane sampling. Additionally, we crop the dataset to ensure that the individual curves do not intersect with the table. This is done by calculating the center point of the reconstructed volume and projecting a horizontal plane approximately 50 mm away from the center point toward the table. The volume is cropped below that plane. The direction of all the crop operations and the normal of the plane is defined by the image coordinate system defined in the DICOM image metadata. (We noted the crop plane distance as approximately 50, since the actual distance away from the centerline was 50 pixels multiplied by the x, y pixel spacing parameter, which was 500‐mm/512 pixel = 0.977 mm/pixel in nearly all our clinical scans).

Robust filtering

2.3

In addition to the filters presented in Sanders et al., our method uses additional filters to remove invalid individual ESF curves prior to aggregation. These filtering routines are broken up into two stages: pre‐centering and post‐centering.

An individual ESF curve Em(x) is composed of a set of N samples X={x0…xN}. During the tail‐replacement stage of the algorithm, the set X is decomposed into three sets: Xair={x0.…xa0},Xtransition={xas…xae},Xtissue={xae…xN}. Where x0,xas,xae,xN, are the first, first transitional, last transitional and last samples in the ESF curve, respectively. Xair,Xtransition,Xtissue are identified using a transformation of ΔEm to a function Dm (Equation 1).

[eqn]

where ∆E_m_ (x) is the first derivative of the m‐th ESF curve, and E_m_ (x) is an ESF curve in Hounsfield Units (HU) as a function of distance. The transformation of ∆E_m_ to D_m_ makes ∆E_m_ sparse by suppressing small values; this helps with robustness of identifying the transition region. The threshold value of 20 HU was found experimentally and is valid under the assumption that the first derivatives in the air and tissue tails are generally stable.

While this assumption holds for the majority of the ESF curves, an individual ESF curve may include transition between multiple materials, such as mask, blankets, clothing, or ancillary regions. For such cases, we implement additional filters that check air, tissue, and transition regions for incongruencies. We also account for errors in surface–air interface segmentation. All pre‐centering filters and their use cases are identified in Table 1. All other curves proceed to the tail‐replacement stage of the algorithm.

The centering stage of the algorithm is implemented using the scipy optimize 19 package and verifies correct centering and tail replacements. To bring all ESF curves into the same coordinate space, each set of samples is fitted to a logistic regression function (Equation 2)

[eqn]

where a,b,c,d are the fitted coefficients initialized to [1000, 0, 1.0, −1000], respectively. The approximated coefficients, a,b, and d are then applied to center the curve about the origin (Equation 3).

[eqn]

where Xm′, Em′ are the location and amplitude values of the transformed m‐th ESF curve. We then test that the curves all exist within a common bounding box, and are not deformed. Full descriptions of the post‐centering filters are defined in Table 2.

Aggregation and post‐aggregation

2.4

Once the centering subroutine completes, the algorithm generates the oversampled ESF (oESF) curve from an aggregation of individual ESF curves. The points from every individual centered ESF are binned together, with an aggregate bin value equal to the mean of ESF samples in the bin. The bin size is set to 10% of the pixel size in the x, y direction. Additionally, we isolate the center 70% of the curve containing the transition region and eliminate the air and tissue ends prior to the LSF transform. The resulting oESF is then conditioned using isotonic regression20 and transformed into a Line Spread Function by taking a derivative. Finally, a Fast Fourier transform is performed in order to generate the Modulation Transfer Function and attain the estimate of the f_50_ SR metric.

Special algorithm for head and neck CT

2.5

Additional modifications were implemented for HN images where the patient is immobilized using a thermoplastic mask. The sampling area is limited by removing any mesh elements posterior and anterior of the ear region (determined to be roughly in the center of the CT image stack). Specifically, we truncate the image volume to a 75 voxel wide slab in the anterior‐posterior direction (Figure 2c). Finally, we oversample the clipped region by a factor of 2 to extract the maximum number of ESF curves from the limited region. This extra operation is done only in the case of the HN image acquisitions.

Figure showing the immobilization device problem and our implemented solution. When an immobilization device is present (a), many samples pass through the mask elements which is a part of segmentation (b). To improve the accuracy, we limit the sampling area in the HN protocol to the lateral regions of the head and around the ear (c), where the mask makes less contact with the skin and the mask cells are larger. The arrow in (c) designates samples in the ear region that have passed the in‐plane filter and will be processed by the pre‐centering and post‐centering filters.

Uniform phantom

2.6

A uniform cylindrical phantom, ACR CT QC phantom: Module 3 – uniformity, was scanned on the Siemens SOMATOM Definition Edge (Siemens Healthineers, Forchheim, Germany) scanner with the following acquisition parameters: 120 kV, 240 mA, 38.4 mm, 500 ms for the Abdomen(protocol) and 120 kV, 400 mA, 38.4 mm, 500 ms for the AbdSeq protocols for values of kV, mA, beam width and rotation time, respectively. Reconstructed images of the phantom were generated with filtered back‐projection algorithms, using 61 different convolution filters and creating 122 data points. These kernels represent the entire range of filters on the system ranging from the smoothest [Br32] to the sharpest [Bl57]. For a full table of filters please see Appendix A

The SR of the resulting reconstructions was evaluated specifically for ACR phantom analysis using: (1) our experimental method and (2) the reference method of Friedman et al. Both methods produce an MTF, and the MTF produced by the Friedman method was taken as ground truth. The MTF‐50 value, the spatial frequency at which the MTF equals 0.5, was derived from both methods The accuracy of the MTF‐50 value was derived from our experimental method and compared to the ground truth.

Anthropomorphic phantom

2.7

The experimental method was tested on CT scans of two different customized anthropomorphic phantoms with various devices typically used in radiotherapy CT simulation. These two anthropomorphic phantoms represented two different body regions in an average size adult male: HN, and abdomen & pelvis. The phantoms were manufactured by The Phantom Laboratory (Salem, NY, USA) and scanned using a Brilliance Big Bore Scanner (Phillips CT, Amsterdam, Netherlands). Since our method focuses on the interface between the phantom surface and air, our methods do not depend on the internal contents of the phantoms.

Additional devices were placed external to the phantom to represent typical patient setup in CT simulation. These setups represent challenges to our method which attempts to measure the air–skin interface. For the HN phantom, a custom thermoplastic immobilization mask was placed on the phantom that conformed to the head and shoulder (Figure 2). For the abdomen and pelvis phantom, localization fiducial markers were placed anteriorly and laterally centered on the phantom with a cotton towel folded over onto itself (i.e., two layers) over the groin (Figure 2a). All anthropomorphic phantoms were set up on a flat carbon‐fiber tabletop used in radiotherapy CT simulations.

For both of these scenarios, corresponding scans of a CTDI phantom, a 32‐cm diameter cylinder of acrylic, were acquired using identical imaging parameters (120 kV, 383 mA, 12 mm, 500 ms, UB kernel for kV, mA, beam width, rotation time reconstruction kernel, respectively) as in the anthropomorphic phantom scans. The MTF‐50 value of the CTDI phantom image was calculated using the reference Friedman method and taken as ground truth.

The SR in each of the anthropomorphic phantom scans was calculated using both the Sanders method and our updated method. The accuracy of both of these methods was calculated and compared to the ground truth.

Clinical data

2.8

Clinical images of patients undergoing standard of care radiotherapy CT simulation were acquired and used to test our method. Images from nine randomly selected patients receiving CT simulation prior to a course of radiation therapy for HN radiotherapy were accessed. Images from eight randomly selected patients receiving CT simulation for either pelvic radiotherapy and two patients receiving lumbar spine radiotherapy were accessed. The pelvic and spine cases used equivalent setup, immobilization, imaged anatomy, and imaging protocol; therefore, we have grouped these cases together into a cohort of 10 pelvis/spine cases). The image data from these CT simulation procedures were used retrospectively for this research study. The retrospective data collection was approved by our Institutional Review Board and patient informed consent was waived. Patient datasets had diverse demographic, positioning and height and included juvenile patients to test the robustness of the algorithm. All HN patients had a thermoplastic immobilization mask of some kind during acquisition.

Both the experimental and reference methods were used to measure SR (MTF‐50) in these data sets. The MTF‐50 acquired using the Friedman algorithm in the 32‐cm phantom during the corresponding protocol was taken as the ground truth.

Filter sensitivity study

2.9

In order to ascertain the sensitivity of our modified algorithm to parameter changes, we performed a study of the sensitivity of our measurement in clinical cases to the various ESF filters used in our method. We examined how many ESF curves each filter removes, and the effect of these filters on the final f50 measurement. Since the filters mimic the swiss cheese model where each consecutive filter is likely to catch any ESF curves the previous filters missed, we adopted an on‐off, one‐all strategy. To do this, we performed the f50 estimation with only a single filter at a time, leaving all the curves that other filters would otherwise catch and eliminate. The rest of the execution proceeded as described in Algorithm 1.

We also studied the effect of resampling the mesh on the resulting f50 approximation. Resampling the mesh has an obvious performance advantage at the cost of accuracy, and the mesh resampling coefficient is the singular tunable parameter during the execution. The adaptive resampling algorithm attempts to retain the total volume of the mesh, by reduce the number of triangle elements by a set percentage. We ran our patient datasets through the entire algorithm using resampling values of 100%–10% in increments of 10%.

RESULTS

3

Uniform and anthropomorphic phantoms

3.1

Both algorithms were accurate when ground truth MTF f50 was within the range of 0.2–0.7 mm^−1^. Twenty out of 122 reconstruction filters were deemed outliers due to their specific use. For example, some of the sharper reconstruction filters designed for highlighting potential anomalies in bone tissue tended to amplify the noise in the phantom‐air (skin–air) interface resulting in incorrect segmentation and noisy ESF curves that were mostly rejected during filtering. With the outliers removed the modified and reference algorithms had a root‐mean‐square error (RMS) of 3.8% and 10.3% with a mean error of 1.5% and 3.5%, respectively. Figure 3 shows the performance of the two algorithms for the reconstruction kernels without the outliers, note that both algorithms became somewhat less reliable when the ground truth MTF f50 was above 0.7 mm^−1^.

Accuracy of the modified and reference resolution measurement algorithms over a range of image reconstruction kernels with varying spatial resolution (modulation transfer function f50 value). Mean error (solid) and ±2STD (dashed) lines are also plotted. The typical f50 for simulation performed with field of view of 50 cm is labeled on the plot in green.

The reference algorithm performed well on pelvis/spine CT scans, showing 8.4% difference as compared to the ground truth calculated on the 32‐cm CDTI phantom. On the HN phantom, the reference algorithm had a 30.0% difference, respectively. The modified algorithm performed better, with a 4.9%, 3.9% difference in pelvis and HN, respectively.

Clinical data

3.2

Box plots in Figure 4 present how the two algorithms performed on the patient datasets in both HN and pelvis/spine groups. With the ground truth set at the Friedman‐calculated value of the cylindrical phantoms of 0.28 mm^−1^ and 0.28 mm^−1^ for the HN and pelvis/spine, respectively, the reference version underestimated MTF in the patient population, with a mean (± population standard deviation) of 0.21 ± 0.03 mm^−1^ and median of 0.20 mm^−1^ for HN and mean 0.25 ± 0.03 mm^−1^ and median of 0.26 mm^−1^ for pelvis/spine. The modified version of the algorithm showed a mean of 0.28 ± 0.02 mm^−1^ and median of 0.29 mm^−1^ for HN and 0.29 ± 0.01 mm^−1^ and median of 0.29 mm^−1^. We assume stationarity of SR in the image, and while this is not strictly true since resolution decreases with increasing distance from isocenter, the non‐stationary effect is negligible compared to our measurement uncertainty. Secondly, we assume that there is no difference in reconstructed image resolution due to material composition differences between phantom and patient. This assumption is true for linear reconstruction (filtered back‐projection), which was used in the study. This assumption may not necessarily hold for non‐linear reconstruction: iterative or deep‐learning reconstruction.

Box plots showing the results of the patient data ran on nine HN scan patients (a) and 10 pelvis/spine scan patients (b). With all the parameters matching between each patient in each group and the respective anthropomorphic phantom, the ground truth (GT) was taken to be at the value computed using the Friedman phantom; represented in orange in each plot.

Filter sensitivity study

3.3

The number of ESF curves eliminated and the resulting accuracy of the method is presented in Table 3 and presented along an example invalid ESF curve removed by each of the filters.

The results of the study of mesh resampling are shown in Figure 5.

FIGURE 5The median percentage difference of the algorithm approximation from ground truth as a function of the resampling coefficient.

DISCUSSION

4

In summary, in this work we replicated the SR estimation algorithm introduced by Sanders et al. 10 and then improved the algorithm with new surface meshing algorithms and edge spread function filters in order to improve accuracy in radiotherapy CT simulation, including HN cases. We compared the performance of our modifications as compared the original work on cylindrical and anthropomorphic phantoms as well as a clinical dataset. Finally, we performed an evaluation of the individual filters as well as mesh quality on the algorithm's accuracy. Overall, we found that while our improvements improve the results certain imaging conditions still present a challenge to both versions of the algorithm.

The presented method is expected to be helpful in auditing SR in individual CT exams for which a corresponding phantom QC test is unavailable or is inconvenient to perform. An example situation is radiotherapy treatment planning using CT simulation imaging performed at a remote location.

The following discussion describes specific aspects of the performance of the method. The initial testing on various phantom reconstructions concurs with the results of the reference work, the algorithm is indeed sensitive to reconstruction kernel, and is exemplary at estimating SR effectively with commonly used kernels. When sharper kernels (f50 > 0.60 mm^−1^) are used, the resulting noise amplification led to over‐rejection of ESF curves in the modified algorithms, and the f50 estimate becomes unreliable. Trimming the in‐air and in‐tissue tails after the oESF was generated minimized this sensitivity in the reconstructions with the very sharp kernels in the phantom datasets, but despite this the resulting f50 is unreliable due to noise and over‐filtering due to our modifications. These results agree with the hypothesized limitations of this approach stated in the reference work and although this was observed, these reconstruction kernels have a higher sharpness according to f50 than typically used in radiotherapy simulation.

The reference algorithm performed as expected on the clinical pelvis/spine datasets, with a slight tendency to underestimate the ground‐truth f‐50 value, as reported in the reference work. Furthermore, both algorithms had a smaller f‐50 as compared to the ground truth in the case of larger patients, where most of the samples fell outside of the 16‐cm radial bin (Figure 6). In the HN dataset, the modified version of the algorithm performed much better, due to the addition of the new filters, Taubin smoothing, and sampling under the mask ensuring only the best ESF curves are passed to the aggregation stage.

A visual representation of the patient testing sets for HN (a) and pelvis/spine scans (b). The volumetric data as well as the mesh generated using Otsu thresholding and marching cubes (in white) is visualized for both patient sets. Head and Neck meshes are shown post‐table crop, but prior to any additional cropping, and faces have been blurred due to privacy concerns. A table is provided underneath each set denoting the spatial resolution f50 value generated using the reference algorithm (top row) and the modified algorithm (bottom row).

Additionally, as stated in the reference work, mesh generation can also be a challenge for the algorithm. This is particularly prevalent in HN cases, where facial features and transitions from shoulders tend to be very sharp resulting in virtually no samples in those areas, and what samples are left tend to pass through the immobilization device (Figure 7). This is not an issue in pelvis/spine CT simulation scans where patient positioning tends to lead to an abundance of flat areas of skin where the sampling strategy and mesh quality does not matter and all the samples pass the ±10° filter, consequently both versions of the algorithm perform well on areas of the mesh where curvature is low, such as chest, stomach and legs. Taubin smoothing and mesh truncation was essential in reducing the number of regions in the mesh where curvature increases sharply over just a few mesh elements while preserving the volume captured by the mesh. As shown in Figure 8, the smoothing operation does reduce the size of the mesh elements by approximately 15% resulting in a proportional increase in the number of total aggregated samples. Some improvement is due to the decrease in the number of jagged edges due to the voxelization from the thresholding method resulting in an overall improvement of the individual ESF quality. In the HN datasets, due to the presence of immobilization devices, a large number of the samples collected from the head and shoulder regions of a patient were discarded due to intersection with immobilization devices. The truncation procedure focuses the algorithm on the regions where intersections are less likely to occur, leading to further improvement in accuracy for HN patients.

A comparison of the oESF from the modified and reference algorithms along with the first 1 million samples from the individual ESF points before binning, extracted from a head neck patient. Both oESF's have been conditioned. The reference algorithm tends to accept a large number of curves that pass through the mask, passing the clothing filter, and distorting the final oESF leading to inaccuracies in the SR readings in HN patients.

An example of a mesh with (a) and without (b) smoothing both with insets of approximately the same area around the ear. The smoothing operation reduces the size of the individual mesh elements by ≈15%, resulting in a proportional increase of the number of samples. This also results in a smaller inter‐sample distance, but minimizes the number of jagged cubic faces resulting in overall improvement in sample quality.

The in‐air artifact media filter performed similarly to the clothing filter but tended to identify fluctuations where the raw HU value did not exceed –925 H.U., but fluctuated in amplitude, nevertheless. The ancillary region filter would identify ESF curves that would otherwise go through onto the centering stage of the algorithm. Depending on the length of the tail, such curves also passed the centering stage since the tail replacement portion of the algorithm only looked for the ascending portion of the curve to find the edge transition. There are many such ESF curves in the HN CT scans as shown in the nose, ear and mouth regions (Figure 8). The transition region filter removes curves that have unequally short or long tail sections, preventing problems further down the pipeline during the centering stage. These curves were usually artifacts of the meshing process caused either by the HU amplitude or mesh smoothing, resulting in displacement of the air–skin interface by a few voxels. Both post‐centering filters exist primarily to deal with outliers as a result of noise from sharper reconstruction kernels, regions where the skin is thin and the ESF transitions into bone resulting in a preserved transition region, but incorrect tail replacement. These filters allow for accurate estimation of the MTF across the range of CT simulation acquisition parameters, reconstruction parameters, and patient positions encountered in the clinic.

As the mesh is down‐sampled the accuracy of the algorithm on HN CT decreases, but for scans of the pelvis and spine the accuracy remains constant. Furthermore, there is some additional error due to new behavior in the mesh. One important filter in the original work was the in‐slice filter, which ensured that each ESF existed within a single slice by making sure that the angle of the Bresenham line was ±10°. Resampling the mesh resulted in samples passing the filter but transgressing the artificial slice borders, as the triangle centers now exist very close to slice borders, resulting in some jitter in accuracy. Overall, keeping 50% of the mesh was an acceptable option for running the algorithm when computational resources are limited, but retaining 100% of the mesh guarantees the lowest error.

There were some limitations to our study. One limitation of the method is the monotonicity constraint, which may result in MTF measurement error for reconstruction filters that have truly non‐monotonic MTF such as filters with edge‐enhancement. Additionally, while our algorithm has shown good results in approximating the gold standard SR for scans of the HN, we have not thoroughly tested our sampling strategy on immobilization devices of different types or from different vendors. Certain vendors produce immobilization devices that are specifically designed to adhere to the patient's head as closely as possible, and in those cases the modified algorithm may not be as accurate due to over filtering. A more consistent strategy of sampling between the sampling cells may be more appropriate. More novel methods might be needed for solid HN immobilization devices with no mesh structure. Additionally, we have not tested the algorithm in clinical scenarios with a variety of reconstruction filters and different voxel sizes, only in test phantoms. While the performance of the algorithm was stable for a cylindrical phantom, this performance may not translate to clinical data, therefore further study is needed.

CONCLUSION

5

In this study, the experimental method of CT image SR measurement was tested, validated, and improved for use in a range of clinical radiotherapy CT simulation procedures. Both algorithms were tested against the ground truth in cylindrical and anthropomorphic phantoms as well as a variety of clinical acquisitions. The reference algorithm was improved through the implementation of additional filters that function similarly to the Swiss cheese model to filter out invalid ESF samples. Although each individual filter does not remove every invalid ESF sample, the combination of filters is effective in producing a valid set of ESF samples, and therefore an accurate MTF measurement. The method may be robust to a variety of inter‐patient differences such as positioning and presence of immobilization devices. The presented method may be used as a surrogate of a traditional resolution measurement taken in phantom.

AUTHOR CONTRIBUTIONS

The work was conceptualized by Laurence Court, Rick Layman and Ahmad Moiz. Pavel Govyadinov contributed to phantom and patient data analysis, algorithm development and testing. Rick Layman, Tucker Netherton, Ahmad Moiz and Aaron Kyle Jones performed the phantom data acquisition. Raymond Mumme performed the patient data curation. All authors contributed intellectual content the writing of the manuscript and had final approval.

CONFLICT OF INTEREST STATEMENT

The authors have no conflict to disclose.

Supporting information

Supporting Information

Bibliography20

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Huang K , Rhee DJ , Ger R , et al. Impact of slice thickness, pixel size, and CT dose on the performance of automatic contouring algorithms. J Appl Clin Med Phys. 2021;22(5):168‐174. doi:10.1002/acm 2.13207 PMC 813022333779037 · doi ↗ · pubmed ↗
2Wu X , Udupa JK , Tong Y , et al. Auto‐contouring via automatic anatomy recognition of organs at risk in head and neck cancer on CT images. In: Webster RJ , Fei B , eds. Medical Imaging 2018: Image‐Guided Procedures, Robotic Interventions, and Modeling, Volume 10576. Published online March 13, 2018. doi:10.1117/12.2293946 PMC 612285730190629 · doi ↗ · pubmed ↗
3Smith TB , Zhang S , Erkanli A , Frush D , Samei E . Variability in image quality and radiation dose within and across 97 medical facilities. J. Med Imaging. 2021;8(5). doi:10.1117/1.jmi.8.5.052105 PMC 810561333977114 · doi ↗ · pubmed ↗
4Van Dyk J , Meghzifene A . Radiation oncology quality and safety considerations in low‐resource settings: a medical physics perspective. Semin Radiat Oncol. 2017;27(2):124‐135. doi:10.1016/j.semradonc.2016.11.004 28325238 · doi ↗ · pubmed ↗
5Meghzifene A . Medical physics challenges for the implementation of quality assurance programmes in radiation oncology. Clin Oncol. 2017;29(2):116‐119. doi:10.1016/j.clon.2016.10.008 27847135 · doi ↗ · pubmed ↗
6Dobbins JT , Frush DP , Christopher , et al. Medical Imaging Safety in Global Health Radiology. Springer e Books. Published online January 1, 2019:85‐105. doi:10.1007/978-3-319-98485-8_9 · doi ↗
7Miller C , Mittelstaedt D , Black N , et al. Impact of CT reconstruction algorithm on auto‐segmentation performance. J Appl Clin Med Phys. 2019;20(9):95‐103. doi:10.1002/acm 2.12710 PMC 675374131538718 · doi ↗ · pubmed ↗
8Anam C , Fujibuchi T , Budi WS , Haryanto F , Dougherty G . An algorithm for automated modulation transfer function measurement using an edge of a PMMA phantom: impact of field of view on spatial resolution of CT images. J Appl Clin Med Phys. 2018;19(6):244‐252. doi:10.1002/acm 2.12476 30338920 PMC 6236841 · doi ↗ · pubmed ↗