An assessment of skill erosion on high‐dose rate brachytherapy treatment planning
Dominic J. DiCostanzo, Theodore T. Allen, Ahmet S. Ayan, Allison Quick, Kevin D. Evans, Emily S. Patterson

TL;DR
This study shows that medical physicists who leave brachytherapy services experience a decline in planning skills over time.
Contribution
The study quantifies skill erosion in brachytherapy treatment planning and identifies variability contributors.
Findings
On-Service physicists had lower variability in 10 of 19 planning measures compared to Off-Service physicists.
Skill erosion was evident in physicists not engaged in brachytherapy for six months or more.
Only the bladder D2cc metric showed controlled variability in both groups.
Abstract
To assess skill erosion for physicists after leaving brachytherapy service and contributors to variability during brachytherapy treatment planning. Medical physicists simulated planning for nine patients by creating treatment plans twice, 2 months apart. The physicists were stratified as “On‐Service” if they were assigned to and actively participating in the brachytherapy service (including treatment planning) during the study or “Off‐Service” if they were not assigned, were not actively engaged in the service, and had not performed treatment planning the 6 months prior to the study commencement or during the study. A mixed effects model with Bonferroni correction was used to test for statistical significance between the stratified groups and Cohen's D was used to compare the effect of skill erosion. ANOVA analysis of a crossed Gauge Repeatability and Reproducibility (Gauge R&R) study…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
FIGURE 1
FIGURE 2
FIGURE 3
FIGURE 4| Age at Treatment | EBRT + Brachytherapy | ||
|---|---|---|---|
| Median [Range] (y) | 49 [29–87] | Yes | 9 |
| No | 0 | ||
|
| |||
| C53.1 ‐ Exocervix | 3 |
| |
| C53.9 ‐ Cervix uteri | 6 | Ring + 6 cm Tandem | 9 |
|
|
| ||
| IB | 3 | 550cGy | 8 |
| IIB | 3 | 600cGy | 1 |
| IIIB | 2 | ||
| IIIC | 1 |
|
|
| ||
| Median [Range] (y) | 34 [26–57] | MS | 8 |
| DMP | 2 | ||
|
| PhD | 2 | |
| M | 8 | ||
| F | 4 |
| |
| Median [Range] | 7.3 [1.5–19] | ||
|
| |||
| On‐Service | 5 |
| |
| Off‐Service | 7 | Yes | 10 |
| No | 2 | ||
|
| |||
| Yes | 12 |
| |
| No | 0 | Median [Range] (y) | 6 (0–15) |
|
|
| ||
| Yes | 6 | QMP | 10 |
| No | 6 | Resident | 2 |
| On‐Service | Percent Study Variance (%) (95% CI) | ||||
|---|---|---|---|---|---|
| Variance Components | Tandem First Point | Ring First Point | Tandem a0 | Tandem a1 | |
|
| 2.0 (1.0–3.6) | 1.1 (0.6–1.7) | 1.2 (0.6–2.8) | 3.3 (1.7–5.1) | |
|
| 1.8 (0.9–2.7) | 1.0 (0.5–1.6) | 0.9 (0.5–1.3) | 2.6 (1.4–4.2) | |
|
| 0.9 (0.4–2.9) | 0.3 (0.0–1.0) | 0.8 (0.4–2.6) | 1.9 (0.6–3.6) | |
|
| 100.0 (99.9–100.0) | 100.0 (100.0–100.0) | 100.0 (100.0–100.0) | 100.0 (99.9–100.0) | |
| On‐Service | Percent Study Variance (%) (95% CI) | ||||
|---|---|---|---|---|---|
| Variance Components | HR‐CTV D90% | Point A Left | Point A Right | Tandem Total Time | Ring Total Time |
|
| 65.8 (40.0–87.2) | 37.2 (20.1–58.7) | 33.9 (18.1–56.2) | 43.3 (23.9–67.9) | 57.5 (33.7–77.3) |
|
| 39.0 (22.5–53.5) | 21.3 (11.4–31.8) | 16.8 (8.9–53.1) | 37.4 (20.6–53.7) | 53.6 (31.1–70.6) |
|
| 53.0 (30.2–81.5) | 30.5 (15.8–54.0) | 29.4 (15.3–53.1) | 21.8 (9.2–57.5) | 20.9 (6.6–58.0) |
|
| 75.3 (49.0–91.7) | 92.8 (81.0–98.0) | 94.1 (82.8–98.3) | 90.2 (73.4–97.1) | 81.8 (63.5–94.2) |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Radiotherapy Techniques · Radiation Dose and Imaging · Prostate Cancer Diagnosis and Treatment
INTRODUCTION
1
Specialized skills require regular use to maintain proficiency. According to the American Association of Physicists in Medicine (AAPM), the frequency at which medical physicists perform routine duties can directly impact their competency.1 Evidence elucidates that cognitive skills decay faster than physical skills.2
The majority of tasks that qualified medical physicists (QMPs) perform require cognitive skills.3 In smaller practices, the exposure to certain tasks may be limited due to the patient population treated. There may also be a reduction in exposure at larger practices due to operational requirements such as limited rotations of staff or the necessary specialization of team members. Regardless of practice environment, unless there is concerted effort to interrupt skill decay, there is a strong probability that QMPs will suffer from its effects. The probability of skill loss increases for procedures that are less common. One such area that physicists may participate less frequently in is brachytherapy. The number of brachytherapy cases is decreasing.4 As cases decline, physicians and physicists will face increased skill erosion.
Treatment planning is a cognitive skill and one of the key activities within brachytherapy services that QMPs may realize skill decay. Regardless of the contributing factors, the impact of diminished skills on patients is direct. To combat this, the AAPM has generated a framework for a competency and credentialing program including a recommendation of observational competency assessment with feedback at least four times per year for brachytherapy procedures.1
To assess the impacts of skill erosion on brachytherapy treatment planning, we performed a Gauge Repeatability and Reproducibility (Gauge R&R) study with simulated brachytherapy treatment planning tasks. Repeatability refers to intra‐observer variability or the variation of the same individual performing the same task multiple times. Whereas reproducibility refers to inter‐observer variability or the variation between individuals performing the same task. Gauge R&R also quantifies the patient‐to‐patient variability to assess its impact on the variability of the process.
METHODS AND MATERIALS
2
Patient and participant identification
2.1
This study was approved by our Institutional Review Board. Nine patients previously treated with the ring and tandem (R&T) applicator were randomly selected; none of which were treated with a hybrid applicator (i.e., needles in addition to the rigid applicator) [Table 1]. The data for each patient was anonymized, the order randomized, and the patients were anonymized for a second time creating two series of nine patients for each participant. Twelve medical physicists (QMPs and medical physics residents) participated in the study [Table 2]. All physicists were deemed competent in brachytherapy treatment planning during their tenure at our institution. Each physicist had 18 datasets to generate treatment plans on and each dataset included CT datasets for treatment planning and applicator reconstruction, MRI datasets with clinical image registration, and clinical segmentations of the high‐risk CTV (HR‐CTV)5 and organs‐at‐risk (OARs).
Generation of treatment plans
2.2
As a simulated planning study, the participants were asked to follow our institutional guidelines for manual reconstruction of the applicator and were encouraged to review the guidelines and procedures if they desired. Participants were then asked to generate treatment plans for a single fraction with a 550cGy prescription dose irrespective of the treated clinical prescription.6 The plans created were to be of sufficient quality that the participants would ask for physician review and approval in a clinical setting. The plans were generated in Eclipse treatment planning system (Brachytherapy Treatment Planning v16.1) using the GammaMedplus afterloader, a step size of 5 mm, and nominal 10 Ci source strength.
Our institution uses volumetric‐based treatment planning following consensus recommendations.5, 6, 7, 8, 9 The dose metrics used for plan evaluation included the dose covering at least 90% of the HR‐CTV (D90%) and the minimum dose to the hottest 2cc for each OAR (D2cc). Each participant was given the first nine datasets in September 2022 and asked to generate the treatment plans. The second nine datasets were provided in November 2022.
Export of quantitative data
2.3
Dosimetric metrics were extracted via automated methods and stored as comma‐separated value (CSV) files for analysis. The DICOM Radiotherapy Plan (RTPlan) and Radiotherapy Structure Set (RTStruct) files were processed using Python. From the RTPlan file, the control points were extracted for each channel of the R&T applicator that defines the (X, Y, Z) coordinates of the dwell position and cumulative dwell time. In addition, the source dwell times were calculated from the RTPlan file. The total dwell time and the number of control points each participant used were also obtained. From the RTStruct file, the number of contour points used to reconstruct each channel of the R&T and the (X, Y, Z) coordinates of each point used in the reconstruction.
Polynomial fits
2.4
Unlike anatomical segmentations the reconstructed applicator cannot have metrics like the Dice coefficient or intersection over union calculated for comparison as they cannot be converted to segmentation masks. In addition, the number of points used by the participants varied for both the tandem and ring. As a result, the contours for the applicator were fit with a polynomial until the point after the final dwell position. (Details available in the supplemental information.) The fit coefficients were then evaluated using Gauge R&R to determine the variability in applicator reconstruction.
Data analysis
2.5
We used a crossed Gauge R&R design with analysis of variance (ANOVA). The planners were stratified into two groups, “On‐Service” if they were assigned to and actively participating in the brachytherapy service (including treatment planning) during the study or “Off‐Service” if they were not assigned, were not actively engaged in the service, and had not performed treatment planning the 6 months prior to the study commencement or during the study. The standard criterion for acceptability determination is the percentage of all variation resulting from total R&R (%GRR). The system is considered acceptable if the %GRR is less than 10%. If the %GRR is between 10% and 30%, it is considered marginally acceptable; if the %GRR exceeds 30%, it is considered unacceptable.10 We assessed the variation in planned dose to three OARs and HR‐CTV, dose to Manchester System reference points (A left, A right), Internation Commission on Radiation Units and Measurement Report 38 points (rectum, and bladder), dwell time of the applicator components (ring and tandem, respectively), and the reconstruction using the fit coefficients. Minitab statistical analysis software (Minitab, LLC, Coventry, UK, https://www.minitab.com/) was used to calculate the R&R for each data type using ANOVA.
The sum of squared differences for each plan, patient, planner, and repetition were computed directly from the data provided. The sum of squared differences for the patient by planner interactions was computed as:
where SStot is the sum of squared differences for each plan. The Gauge R&R value is computed using the calculated variance for repetitions and planners:
The controllable variation of the system is given by the repeatability (σ2rep) and reproducibility (σ2planner) components of %GRR. The uncontrollable variation is given by the patient‐to‐patient variation (σ2pat) calculated by:
where MSpat is the mean square of the patient, MSpat·planner is the mean square of the patient‐planner interaction, r is the number of repetitions, and p is the number of patients.
To evaluate statistical significance between the stratified groups, a mixed effects model (MEM) was created. As the plans were generated two times, the original plan number (i.e., 1–9) was assigned to the random factors, with trial (1 or 2) and planner service status (0 for Off‐Service and 1 for On‐Service) assigned to the fixed factors. The trial number was also nested within the original plan number to include repeatability in the evaluation of statistical significance.
Physician review
2.6
One (1) board‐certified physician with expertise in gynecological brachytherapy rated the plans using a 3‐point scale commonly employed in other studies: (1) clinically acceptable, would approve for treatment, (2) minor modifications necessary, would request minor adjustments before approving for treatment, (3) clinically unacceptable, state of the current plan is unsuitable for delivery and would require multiple changes. Due to the number of plans created by the planners and the time required for review, 4 plans were randomly selected and presented from each of the 12 planners (48 of 216 plans). The physician was blinded to the identity of the treatment planners and provided only the patient identification information to facilitate access to the plans in the TPS.
RESULTS
3
Across all evaluated metrics, the group of On‐Service study participants displayed less variation than the group of Off‐Service study participants [Figures 1 and 2]. Of the 19 evaluated metrics, 10 were statistically significant, indicating that those who were On‐Service performed planning with less variability than those who were Off‐Service. Cohen's D showed that each of the metrics with statistically significant variability had a small effect with values between 0.15 and 0.30. For the On‐Service group, the only metric that showed a controlled amount of variation was the bladder reference point (26.5 %GRR) other than the reconstruction of the applicator.
Clustered column of the metrics used to evaluate reconstruction of the tandem and ring by group. Brackets above each pair display the statistical significance of the difference between the groups for the given metric. Significance set to p‐value < 0.00263 (Bonferroni correction). The dashed line at 10% GRR is the value below which a metric is deemed to be acceptable by Gauge R&R. The 30% GRR line shows the line of acceptability. Tandem 1st Position and Ring 1st Position represent the variability in the first selected point of the applicator. The “a” values represent the variability of the fit coefficients for the polynomial fit of the applicator, which acts a surrogate for variability in the location of the applicator.
A clustered column of the dosimetry related attributes evaluated. Pt_A, Bladder Pt, and Rectum Pt represent the Manchester System points as inserted in the clinical treatment plan. The locations of the points from the treated plan were extracted and applied to the plans generated during the study. The dashed line at 10 %GRR is the value below which a metric is deemed to be acceptable by Gauge R&R. The 30 %GRR line shows the line of acceptability. Brackets above each pair display the statistical significance of the difference between the groups for the given metric. Significance set to p‐value < 0.00263 (Bonferroni correction).
Applicator reconstruction
3.1
The brachytherapy treatment planning system uses the first contour point as the applicator's physical tip. Any errors in identifying the tip will directly result in shifts to all planned dwell positions and can affect the planned dose distribution in all three dimensions (cranial‐caudal, left‐right, and anterior‐posterior). In Table 3, the results indicate that the first contour point for the tandem and ring are well controlled, and the MEM indicated that service status had no effect on its selection [Figure 1].
For the QMPs who are On‐Service, the digitization of the applicator parts is well‐controlled for the tandem (range of 1.2%–3.3% GRR) and marginally controlled for the ring (range of 1.0%–22.3% GRR). Those who were Off‐Service also reconstructed the tandem in a well‐controlled manner (range of 2.5%–3.6% GRR) [Table 3]. However, the ring reconstruction was not controlled (range of 1.7%–47.4% GRR). The MEM indicates that service status has no effect on tandem reconstruction but does impact ring reconstruction [Figure 1].
Dosimetry metrics
3.2
The dosimetric metrics and reference points [Figures 2, 3, and 4] show that plan creation is highly variable regardless of service status [Table 4]. Only the bladder reference point was marginally controlled for the On‐Service group; no metrics were controlled for Off‐Service. The MEM suggests Off‐Service physicists show significantly greater variability than those On‐Service, except for the bladder and rectum reference points and the rectum D2cc metric [Figure 2].
A boxplot of the dosimetric metrics evaluated with the differences between the Off‐Service and On‐Service groups present. The shaded box indicates the first through third quartiles with the line indicating the median. The individual marks above and below indicate outliers.
A boxplot of the reference point doses evaluated with the differences between the Off‐Service and On‐Service groups present. The shaded box indicates the first through third quartiles with the line indicating the median. The individual marks above and below indicate outliers.
Variability in bladder filling impacts the dose delivered to the patient. Compared to Off‐Service physicists, the On‐Service group produced more consistent plans with lower variability, particularly for the bladder D2cc metric. Greater variability can often require more plan modifications or iterations. Since bladder volume can change rapidly, efficient and precise planning is essential to ensure the imaging used for treatment accurately reflects patient anatomy.
Clinical acceptability
3.3
A review of the clinical acceptability as rated by the physician indicates that the physicists created treatment plans appropriate for treatment most of the time, 38/48 (79%). Only 13/48 (19%) of the plans generated would require minor modifications; almost exclusively related to changes in HR‐CTV coverage. One plan was deemed clinically unacceptable due to the sigmoid dose and over coverage of the HR‐CTV. The On‐Service group had 17/20 (85%) of the plans generated deemed clinically acceptable as opposed to 21/28 (75%) of the plans by the Off‐Service group.
DISCUSSION
4
In this study, we have identified that skill erosion does occur for medical physicists in brachytherapy treatment planning. Using the crossed Gauge R&R method with ANOVA analysis, variability in multiple aspects of the treatment planning process was isolated between physicists who were stratified between On‐Service and Off‐Service groups. The results show that physicists who are On‐Service had less controllable variability than those who were Off‐Service. Even when the controllable variability exceeded the GRR tolerance (30 %GRR) for those On‐Service, the variability encountered was significantly less (10 of 19 metrics) than that of those who were Off‐Service. To our knowledge, this is the first study to investigate skill erosion in radiation therapy treatment planning. At a time when artificial intelligence is gaining popularity across all domains, physicists must be aware of the dangers of skill erosion and learn to mitigate its effects.
Consequentially, it was identified that skill erosion could be a factor in overall plan quality. While almost all dosimetric metrics showed uncontrolled variability between planners, the variability of the On‐Service group was significantly lower than the variability of the Off‐Service group except for the rectum metric and rectum and bladder reference points. In addition, when reviewing the plan scores as performed by the physician, the On‐Service group had 85% (17/20) of their plans scored as clinical compared with 75% (21/28) of the Off‐Service group.
Although the dosimetry related metrics were largely uncontrolled and highly variable, applicator reconstruction showed mixed results. The first point of the applicator reconstruction is well‐controlled for both the ring and tandem. The identification of the first point of the ring and tandem is performed by determining the change in Hounsfield Units between soft tissue and the applicator using the line profile tool within the TPS. The tandem reconstruction was the only well‐controlled process between both groups. Given that the tandem reconstruction is trivial and can be often performed using two points, this is expected. The ring reconstruction was marginally controlled within the On‐Service group, as contrasted with the Off‐Service group where it is uncontrolled. While the reconstruction of the ring is more complicated than the tandem given the physical geometry, the reconstruction process is the same. This indicates that skill erosion could be a contributing factor in the increased variability.
The decreased controllable variability indicates that the brachytherapy treatment planning process can be positively impacted through the introduction of quality improvement measures. While the selection of the first point of the applicator is well controlled for the evaluated applicator, this may not be the case for other applicators with less identifiable features (e.g., needles). Extending this work to other applicators would provide a more comprehensive understanding of the treatment planning process. In addition, while statistically significant, the variability identified in the applicator reconstruction is not likely to be clinically significant. The identified uncertainty is relatively small compared with other uncertainties in brachytherapy treatment planning, such as segmentation or biological processes like internal organ motion. However, this should not preclude a reduction of uncertainty wherever possible.
The variability in the dosimetric metrics could lead to increased iterations during the planning process. If reduced, it could lead to meaningful positive impacts on patients’ clinical care. Specifically, the D2cc metric for the bladder was the only dosimetric metric that was below the acceptability cutoff of 30%. Given the biological process of bladder filling, the faster a plan is generated influences how representative the planning image is of the patient state at treatment. More generally, the physician often caps the number of iterations in brachytherapy as they balance the clinical benefit of the treatment against various risk factors, for example, time under anesthesia. Reducing variability in the initial plan quality presented to the physician for approval could have positive implications for the safety and efficiency of the brachytherapy treatment planning process. It is important to note that the physician would have approved most of the plans and deemed them clinically acceptable. As such, the variability of the plans generated could directly impact the patient via a reduction of dose to OARs in the initial plans presented to the physician. Reducing variability in this regard would require more specific directives and overall guidance provided by the physician than is currently provided at our institution.
The findings of this study emphasize the importance of competency evaluations for QMPs involved in brachytherapy treatment planning. The identified skill erosion can have direct implications for patient care. Patients who experience variations in dose delivery could suffer from suboptimal outcomes. Additionally, we identified that the increased variability associated with Off‐Service group required modifications leading to additional plan iterations, which unnecessarily lengthens the planning process. This not only impacts patient comfort and safety but also strains resources, especially in environments with limited resources.
This study was a simulated treatment planning study, and as such did not represent the clinical pressures often associated with planning. A limitation of this study is that the plans were only repeated twice. However, statistical significance via MEM was achieved suggesting that the findings would not change if we had repeated the plans for a third time. Due to convenience, this study sampled 12 physicists at a single institution, two of which were residents. However, the residents included had completed a brachytherapy rotation and been deemed competent to independently perform brachytherapy treatment planning limiting any potential impact on the results. Further work could look at a broader demographic across multiple institutions. Finally, an attempt to quantify the rate of skill erosion that was uncovered in this study would benefit the field by allowing interventions to be generated at specific times rather than arbitrarily identified points. The current evidence does support the use of competency‐based evaluations in medical physics practice. Practice or simulation prior to clinical work for a skill that has not been dormant could mitigate its effects as suggested in previous literature.11
CONCLUSION
5
Brachytherapy treatment planning performed by medical physicists suffers from skill erosion. Indicating that the frequency of performing brachytherapy cases significantly affects a medical physicist's competency in treatment planning. There exists both controllable and uncontrollable variability in our current processes, which would benefit from quality improvement initiatives.
AUTHOR CONTRIBUTIONS
Dominic DiCostanzo: Conceptualization; methodology; software; formal analysis; investigation; data curation; writing—original draft, visualization. Theodore Allen: Methodology; formal analysis; writing—review and editing; visualization; supervision. Ahmet Ayan: Methodology; data curation; writing—review and editing; supervision. Allison Quick: Methodology; writing—review and editing; supervision. Kevin Evans: Writing—review and editing; supervision. Emily Patterson: Conceptualization; methodology; formal analysis; writing—review and editing; visualization; supervision.
CONFLICT OF INTEREST STATEMENT
No, there is no duality of interest that I should disclose, having read the above statement.
ETHICS STATEMENT
This study was approved by our Institutional Review Board.
Supporting information
Supporting information
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Pavord DC , Birnbaum S , Boccuzzi D , et al. A report from the AAPM subcommittee on guidelines for competency evaluation for clinical medical physicists in radiation oncology. J Appl Clin Med Phys. 2016;17(4):3‐14. doi:10.1120/jacmp.v 17i 4.5804 PMC 569004127455473 · doi ↗ · pubmed ↗
- 2Arthur W Jr , Winston BJ , Stanush PL , Mc Nelly TL . Factors that influence skill decay and retention: a quantitative review and analysis. Human Performance. 1998;11(1):57‐101. doi:10.1207/s 15327043 hup 1101_3 · doi ↗
- 3American Association of Physicists in Medicine. Definition of A Qualified Medical Physicist. Accessed February 2, 2023, https://www.aapm.org/org/policies/details.asp?id=2551&type=PS
- 4Colson‐Fearon D , Han K , Roumeliotis MB , Viswanathan AN . Updated trends in cervical cancer brachytherapy utilization and disparities in the united states from 2004 to 2020. Int J Radiat Oncol Biol Phys. 2024;119(1):154‐162. doi:10.1016/j.ijrobp.2023.11.036 38040060 10.1016/j.ijrobp.2023.11.036 · doi ↗ · pubmed ↗
- 5Prescribing, recording, and reporting brachytherapy for cancer of the cervix. J ICRU. 2013;13(1‐2):NP. doi:10.1093/jicru/ndw 027 10.1093/jicru/ndw 02727335496 · doi ↗ · pubmed ↗
- 6Viswanathan AN , Beriwal S , De Los , Santos JF , et al. American brachytherapy society consensus guidelines for locally advanced carcinoma of the cervix. Part II: high‐dose‐rate brachytherapy. Brachytherapy. 2012;11(1):47‐52. doi:10.1016/j.brachy.2011.07.002 22265437 10.1016/j.brachy.2011.07.002PMC 3489267 · doi ↗ · pubmed ↗
- 7Viswanathan AN , Thomadsen B . American Brachytherapy Society Cervical Cancer Recommendations C, American Brachytherapy S. American Brachytherapy Society consensus guidelines for locally advanced carcinoma of the cervix. Part I: general principles. Brachytherapy. 2012;11(1):33‐46. doi:10.1016/j.brachy.2011.07.003 22265436 10.1016/j.brachy.2011.07.003 · doi ↗ · pubmed ↗
- 8Potter R , Haie‐Meder C , Van Limbergen E , et al. Recommendations from gynaecological (GYN) GEC ESTRO working group (II): concepts and terms in 3D image‐based treatment planning in cervix cancer brachytherapy‐3D dose volume parameters and aspects of 3D image‐based anatomy, radiation physics, radiobiology. Radiother Oncol. 2006;78(1):67‐77. doi:10.1016/j.radonc.2005.11.014 16403584 10.1016/j.radonc.2005.11.014 · doi ↗ · pubmed ↗
