Criterion-Related Validity and Reliability of a Measurement Tool for Medical Doctors’ Work-Related Quality of Life in Japan
Miyuki Ezura, Katsuhiko Sawada, Yusuke Takushima, Lida Teng, Ataru Igarashi

TL;DR
This study confirms the validity and reliability of a questionnaire measuring Japanese doctors' work-related quality of life.
Contribution
The study validates a revised questionnaire (WQMD-9) for assessing doctors' work-related quality of life in Japan.
Findings
The WQMD-9 questionnaire showed a strong correlation (0.7891) with a visual analogue scale, supporting its validity.
The questionnaire's reliability was confirmed with a Cronbach’s α of 0.87.
The WQMD-9 consists of nine dimensions and can be scored using a simple total scoring approach.
Abstract
Objective: This confirmatory survey aimed to verify the criterion-related validity and reliability of the final version of the Medical Doctors’ Work-Related Quality of Life Questionnaire (WQMD-9), following partial revision of its content. This study also explored the questionnaire’s structure and scoring methods. Method: From June to July 2022, the WQMD-9 was administered to 98 MDs selected to match the statistical distribution of MDs in Japan. Criterion-related validity was evaluated using a visual analogue scale (VAS) as the reference standard, and reliability was examined using inter-dimension correlations and Cronbach’s α. Results: The correlation coefficient between the VAS score and the simple sum of the WQMD-9 dimension scores was 0.7891, supporting criterion-related validity. Cronbach’s α was 0.87, indicating acceptable reliability. Conclusions: The profile-type WQMD-9 consists of…
Taxonomy
Topics: Healthcare professionals’ stress and burnout · Workplace Health and Well-being · Occupational Therapy Practice and Research
1. Introduction
The emergence of game-changing healthcare technologies in recent years has upended conventional approaches in medical device and pharmaceutical development [1,2]. These advancements are considered to deliver value not only to patients but also to healthcare providers by significantly influencing clinical practice [3,4]. The Medical Doctors’ Work-Related Quality of Life Questionnaire (WQMD-9) was developed to visualize these challenges among MDs (Figure S1). Unlike resource use metrics such as procedure duration or hospitalization period, the WQMD-9 offers a subjective assessment from an MD-centered perspective. To begin with, we performed a systematic review of existing questionnaires that assess work-related quality of life (QOL) in the context of MDs [5].
The final selection of 81 papers revealed that existing studies assessing MDs’ QOL typically employed a combination of original items and multiple standardized instruments, rather than a single unified questionnaire. For example, a study conducted in Denmark utilized the Maslach Burnout Inventory–Human Services Survey (MBI-HSS) to assess fatigue, the Warr–Cook–Wall Job Satisfaction Scale (WCW-JSS) for career satisfaction, the Perceived Stress Scale (PSS-10) for working environment, and both the Short-Form Health Survey (SF-12) and the World Health Organization Five Well-Being Index (WHO-5) for overall QOL, while work–life balance was evaluated using original items [6]. Although comprehensive assessment using multiple validated instruments is a valid approach, it presents challenges such as increased respondent burden and interpretive complexity when integrating results across different scales. In this context, the development of a single consolidated questionnaire offers significant advantages for both researchers and participants. We therefore decided to develop a new questionnaire, for which we defined the “work-related QOL of medical doctors” as factors related to a medical doctor’s work (such as clinical practice) that subjectively impact their own satisfaction, work, and life, and we constructed a profile-type QOL scale accordingly. In our previous research [7], we developed the WQMD-9, an original profile-type scale consisting of nine dimensions and five levels. We followed a standard methodology for the development of a QOL scale, beginning with qualitative interviews of 20 MDs, then validation interviews with 8 MDs, and finally a quantitative survey with 374 MDs to evaluate its validity, inter-dimension correlations, and reliability. Based on this survey’s results, the response wording was modified, and a temporal element was added to the beginning of each question to avoid ceiling effects.
The WQMD-9 was then finalized after consultation with external experts. Notably, the WQMD-9 is the first tool developed in Japan by our research team to assess the work-related QOL of MDs [8].
In this study, to examine the characteristics of the profile-type scale, we aimed to verify the criterion-related validity and reliability of the finalized WQMD-9—which has already been validated for internal consistency, content validity, construct validity, and interpretability [9]—and to conduct an explorative analysis of its construct validity through factor analysis of the questionnaire’s structure (dimensions) and consideration of scoring methods.
2. Materials and Methods
A confirmatory survey using the WQMD-9 was conducted with 98 MDs [10] recruited from an MD registry maintained by a research company, according to criteria based on the distribution proportions published by the Ministry of Health, Labour and Welfare in the 2018 National Medical Doctor Statistics [11]: hospital MDs (67%)/private practitioners (33%); internal medicine specialties (51%)/surgical specialties (29%)/others (19%); male (79%)/female (21%); and age groups 29 years and under (9%), 30–39 years (21%), 40–49 years (22%), 50–59 years (22%), 60–69 years (17%), and 70 years and above (10%). Recruitment was conducted via the internet, with enrollment for each stratum closed once the target number was reached. To evaluate criterion-related validity, we added a question on “overall work-related QOL” using a visual analogue scale (VAS) [12]. A prior systematic review [5] found that there was no single questionnaire designed to measure work-related QOL among MDs; thus, the VAS was employed for the purpose of measuring overall QOL. To examine internal consistency, we assessed inter-dimension correlations and calculated Cronbach’s α coefficient. In addition, to reconfirm the questionnaire structure identified in the previously conducted quantitative survey [7], we conducted exploratory factor analysis and examined scoring methods. However, as we do not currently intend to use the two identified factors (dimensions) separately, we also performed exploratory analyses in our survey with 98 MDs. Data analysis was then performed using JMP Version 16 statistical software (JMP Statistical Discovery LLC, Cary, NC, USA).
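The per-stratum enrollment targets implied by these proportions can be sketched as below. This is only an illustration: the source does not state how the targets were derived, so the largest-remainder rounding and the function name `allocate_quota` are assumptions introduced here. Proportions are normalized first because the published percentages do not always sum to exactly 100.

```python
from math import floor

def allocate_quota(total, proportions):
    """Split `total` respondents across strata using largest-remainder rounding.

    `proportions` maps stratum name -> published percentage (hypothetical helper,
    not the authors' documented procedure).
    """
    weight_sum = sum(proportions.values())
    exact = {name: total * pct / weight_sum for name, pct in proportions.items()}
    quota = {name: floor(value) for name, value in exact.items()}
    shortfall = total - sum(quota.values())
    # Give the leftover places to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - quota[name], reverse=True)
    for name in by_remainder[:shortfall]:
        quota[name] += 1
    return quota

workplace = allocate_quota(98, {"hospital": 67, "private_practice": 33})
sex = allocate_quota(98, {"male": 79, "female": 21})
age = allocate_quota(98, {"<=29": 9, "30-39": 21, "40-49": 22,
                          "50-59": 22, "60-69": 17, ">=70": 10})
```

Enrollment for a stratum would then close once its quota is filled, as described above.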
Following completion of both the confirmatory survey and exploratory analyses, an external expert panel review was conducted to finalize the WQMD-9.
This survey was exempt from ethics committee review, as it focused on instrument development and validation rather than collecting data on individual behaviours or perceptions. Prior to participation and via an online platform, all individuals were presented with a detailed outline of the purpose of the study, the voluntary nature of their involvement, the restricted use of data (i.e., data would be utilized exclusively for scale development), and the complete anonymization of all responses. Informed consent was obtained electronically and documented appropriately.
3. Results
The results of the confirmatory survey conducted with 98 MDs between June and July 2022 are presented below. For clarity, we refer to the questionnaire version tested in the initial quantitative survey of 374 MDs during the first stage of scale development as “pre-WQMD-9,” while the revised version evaluated in this confirmatory study is referred to as “WQMD-9.”
3.1. Criterion-Related Validity
Figure 1 shows the distribution of responses to the VAS (Q10), which was administered along with the nine-dimension WQMD-9 to examine criterion-related validity. The mean VAS score was 64.7 (median: 70.0), and values ranged from a minimum of 15 to a maximum of 100. The correlation coefficients (Spearman’s rho) between the VAS and the nine constituent dimensions of the WQMD-9 were calculated to assess criterion-related validity. Correlations were generally moderate to strong (r = −0.46 to −0.79), although the correlation with Q3 (“Collaboration”) was notably weaker than the others (r = −0.25; see Table 1).
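For readers unfamiliar with the statistic, Spearman’s rho is the Pearson correlation computed on ranks, with tied responses sharing their average rank. The following is a minimal standard-library sketch of that definition; the study itself used JMP, and the response values below are made up purely for illustration (they are not study data).

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up example: five-level dimension responses and VAS scores (0-100).
# Higher levels are paired with lower VAS scores here, giving a negative rho.
dimension_level = [1, 2, 2, 3, 4, 5]
vas_score = [90, 80, 75, 60, 40, 20]
rho = spearman_rho(dimension_level, vas_score)
```

Because five-level responses produce many ties, the average-rank handling matters more here than it would for continuous data.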
3.2. Reliability
Table 1 presents the inter-dimension correlation coefficients, which were calculated to assess the internal consistency of the questionnaire. The strongest correlation was observed between Q5 (“Working conditions”) and Q6 (“Working environment”), with a coefficient of 0.8324. The correlations between Q3 (“Collaboration”) and other dimensions were generally the weakest, with coefficients as low as 0.162. Cronbach’s α coefficient calculated from the responses to dimensions Q1–Q9 was 0.899, indicating good internal consistency of the WQMD-9.
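As a reminder of what this coefficient measures, Cronbach’s α for k items is k/(k − 1) × (1 − Σ item variances / variance of the total score). A minimal sketch of that textbook formula follows; the function name `cronbach_alpha` and the three-item response matrix are illustrative assumptions, not the study’s data or software.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha from a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Made-up responses from five respondents to three five-level items.
example_rows = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(example_rows)
```

For the WQMD-9, each row would hold the nine dimension responses Q1 to Q9 of one respondent.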
3.3. Resolution of Ceiling Effects
The response wording for Q4 (“Clinical practice”) was modified for the confirmatory survey, as the response distribution exhibited a problematic ceiling effect in the quantitative survey conducted during the first stage of scale development. Figure 2 shows a mosaic plot of the responses to Q1–9. For Q4, the confirmatory survey’s results showed increased responses for “3” and “4,” indicating that the ceiling effect was successfully resolved. The responses to the other question dimensions were generally similar to those of the initial pre-WQMD-9 survey, with no problematic ceiling effects identified.
3.4. Exploratory Analysis
3.4.1. Factor Analysis
The eigenvalues, corresponding factor contribution rates, and cumulative plots are shown in Figure 3. The exploratory analysis suggested that one to two factors would be appropriate.
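To make the quantities plotted in such a figure concrete, the sketch below computes contribution rates and cumulative rates from a list of eigenvalues, and counts eigenvalues above 1 (the Kaiser criterion, one common heuristic). The source does not state which retention rule was applied, and the eigenvalues used here are invented for illustration; they are not the values in Figure 3.

```python
def contribution_rates(eigenvalues):
    """Return (rate, cumulative_rate) pairs, largest eigenvalue first."""
    # For a correlation matrix the eigenvalues sum to the number of dimensions.
    total = sum(eigenvalues)
    rates, running = [], 0.0
    for value in sorted(eigenvalues, reverse=True):
        running += value / total
        rates.append((value / total, running))
    return rates

def kaiser_factor_count(eigenvalues):
    """Count eigenvalues greater than 1 (Kaiser criterion, assumed heuristic)."""
    return sum(1 for value in eigenvalues if value > 1.0)

# Invented eigenvalues for a nine-dimension correlation matrix (sum = 9).
example_eigenvalues = [4.9, 1.1, 0.8, 0.6, 0.5, 0.4, 0.3, 0.25, 0.15]
rates = contribution_rates(example_eigenvalues)
n_factors = kaiser_factor_count(example_eigenvalues)
```

A second eigenvalue sitting close to 1, as in this invented example, is the kind of pattern that leaves the choice between one and two factors open.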
Table 2 shows the factor loadings estimated by the maximum likelihood method when specifying a single-factor solution.
Table 3 shows the factor loadings estimated by the maximum likelihood method with Promax oblique rotation when exploring a two-factor solution. The results suggest that it may be possible to separate dimensions into a factor representing “labor” (Q5 and Q6) and a factor representing “work” (Q1 and Q2), subsequently referred to as Factor A and Factor B, respectively. This analysis showed similarities with the exploratory factor analyses conducted in our previous survey [7]; however, some dimensions loaded on both factors and could not be assigned to a single one (e.g., Q3, Q4, Q7, and Q8).
3.4.2. Scoring Method Comparison
Candidate scoring methods were compared. Table 4 displays the results of the factor analysis using standardized scores. For this analysis, the ordinal response levels were treated as equidistant.
Table 5 presents the correlation coefficients between the QOL VAS, simple sum values, and factor scores derived from the one- and two-factor models. Correlations were estimated using listwise deletion. Within the two-factor model, Factor A, which represents dimensions related to “labor”, showed higher correlation with the QOL VAS.
3.5. Summary of Results
While the VAS is not a standardized reference measure, the criterion-related validity of the WQMD-9 can be considered confirmed in relation to the work-related QOL VAS, as the correlation (r = 0.7891) between the simple sum value and the VAS score provides supportive evidence. The results of the scoring method comparison further indicate the suitability of unweighted simple summation without the need for additional item weighting, as demonstrated by the similar correlation coefficients between the VAS and the simple sum (r = 0.7891) and between the VAS and the one-factor solution (r = 0.7824).
The results presented confirm the robust psychometric properties of the revised version of the WQMD-9 tested in the confirmatory survey, with no ceiling effects in the response distributions. Furthermore, the scale demonstrated reliability in the study population, who were demographically matched with the broader population of Japanese MDs (the intended users of this questionnaire), as evidenced by Cronbach’s α coefficient of 0.899.
Following expert panel review, the WQMD-9 was finalized as a nine-dimension instrument, with each dimension measured on five levels: “Workload,” “Working hours,” “Collaboration,” “Clinical practice,” “Working conditions,” “Working environment,” “Psychology,” “Work-life balance,” and “Career” (see [7]; Figure S1 is presented in Japanese, as originally written).
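The unweighted simple-sum scoring discussed above can be sketched as follows. The dimension names come from the text; the numeric coding of the five levels as integers 1 to 5 (and hence a total range of 9 to 45) is an assumption made for this illustration, as is the function name `simple_sum_score`.

```python
DIMENSIONS = (
    "Workload", "Working hours", "Collaboration", "Clinical practice",
    "Working conditions", "Working environment", "Psychology",
    "Work-life balance", "Career",
)

def simple_sum_score(responses):
    """Sum the nine dimension levels without weighting.

    `responses` maps each dimension name to a level coded 1-5 (assumed coding).
    """
    missing = [name for name in DIMENSIONS if name not in responses]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for name in DIMENSIONS:
        if responses[name] not in (1, 2, 3, 4, 5):
            raise ValueError(f"level for {name!r} must be an integer from 1 to 5")
    return sum(responses[name] for name in DIMENSIONS)

# Example: a respondent choosing the middle level on every dimension.
mid_profile = {name: 3 for name in DIMENSIONS}
mid_total = simple_sum_score(mid_profile)
```

Keeping the per-dimension responses alongside the total preserves the profile-type character of the scale, since the sum alone hides which dimensions drive a given score.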
4. Discussion
This study was conducted to confirm the findings of a prior quantitative survey of 374 MDs; the present survey was carried out with 98 MDs to evaluate criterion-related validity and to assess the impact of differences in study populations on reliability. While a sample of 98 MDs is not statistically adequate for a rigorous standalone study, it exceeds the minimum threshold of ten times the number of questionnaire dimensions (i.e., 10 × 9 dimensions = 90) [13]. Therefore, as a confirmatory survey supporting the larger-scale survey of 374 MDs, we consider it sufficient. We compared the results with those of the previous quantitative survey [7] because, although the two study populations had similar attributes and distributions, they were not identical.
Regarding internal consistency, the inter-dimension correlation coefficients ranged from 0.162 to 0.832 in the confirmatory survey cohort (WQMD-9), compared with 0.328 to 0.699 in the quantitative survey cohort (pre-WQMD-9). The weakest correlation in the confirmatory survey was between “Collaboration” and “Career” (0.162), as in the quantitative survey (0.328). The highest correlation in the quantitative survey was between “Workload” and “Working hours” at 0.699 (0.694 in the confirmatory survey), while in the confirmatory survey, the highest correlation was between “Working environment” and “Working conditions” at 0.832 (0.468 in the quantitative survey). The temporal element added to the questions in the confirmatory survey may have influenced the correlation between “Working environment” and “Working conditions.” However, the correlations of Q4 “Clinical practice” with the other dimensions remained almost identical between the quantitative and confirmatory survey groups, suggesting that changing the response format did not affect reliability. These results suggest that the differences in study populations did not affect the reliability of the WQMD-9.
Regarding criterion-related validity, our systematic literature review [5] revealed that no existing tools exclusively measure work-related QOL among MDs. Therefore, we employed a visual analogue scale (VAS) to assess overall work-related QOL. However, while the VAS provides a simple and intuitive means of capturing subjective evaluations, it lacks the psychometric rigour, dimensional specificity, and theoretical grounding necessary for rigorous validation [14,15]. Its use in this context represents a methodological compromise necessitated by the absence of a more suitable tool, and the resulting findings should be interpreted with considerable caution.
In the quantitative survey, the validity of individual dimensions was assessed by examining the relationships between each WQMD-9 dimension and the corresponding original dimensions [7]. However, to develop a more robust and clinically applicable instrument, it will be necessary to evaluate each component separately using appropriate standardized measures. For instance, the WHOQOL [16] may be used to assess overall quality of life, the Maslach Burnout Inventory [17] can be applied as a validated tool for measuring burnout, the Karasek Job Content Questionnaire (JCQ) [18] is applicable for evaluating work environment factors, and the Effort–Reward Imbalance (ERI) Questionnaire [19] is suitable for assessing occupational stress related to perceived effort and reward. Further validation of the WQMD-9 using these established instruments is warranted to ensure its reliability and domain-specific accuracy. While the WQMD-9 still requires further validation, particularly at the dimension level, its creation represents a meaningful first step toward establishing a unified tool for assessing work-related QOL among medical doctors.
Next, we discuss the exploratory factor analysis and scoring procedures, noting that the results are preliminary and should be considered tentative. Based on the results of the exploratory factor analysis performed during the development of the pre-WQMD-9, we anticipated that it may be possible to separate the questionnaire dimensions into two factors: one factor with high loadings for Q1 “Workload” and Q2 “Working hours” and a second factor with high loadings for Q6 “Working environment”, Q5 “Working conditions”, Q8 “Work-life balance”, and Q9 “Career”. In this survey, the exploratory two-factor analysis showed results relatively similar to those of the pre-WQMD-9 survey, despite modest differences in the factor loadings of some dimensions. Considering the limitations of these exploratory analyses, a prospective survey for confirmatory factor analysis will be essential for drawing meaningful conclusions and for addressing the observed cross-loadings, if our two-factor hypothesis is to be used for this instrument in the future.
Furthermore, Q3 “Collaboration” showed the weakest correlation with the VAS and the lowest contribution to the factors in the exploratory factor analysis. We did not exclude Q3 from the WQMD-9 questionnaire, given the exploratory nature of these analyses; however, the content of Q3 may require further scrutiny. The scoring method comparison in this survey suggested that unweighted simple sum scores may be sufficient for practical applications, such as internal comparisons between interventions or timepoints, although more complex weighted scoring methods may have advantages for the external validity of these scores. Future research should focus on accumulating additional data through expanded prospective confirmatory surveys to further refine and validate the WQMD-9.
Finally, the WQMD-9 was developed for Japanese MDs and created in Japanese; as English validation has not been conducted, we address here the issue of generalizability. As with other healthcare workers, the components of MDs’ work-related QOL may vary depending on the healthcare system. In Japan [19], medical fees are the same regardless of age, experience, or skill level, and the system follows a social insurance model with free access. In the United Kingdom [19], most MDs are civil servants under a general practitioner (primary care) system funded through taxation. In the United States [20], apart from Medicare and Medicaid, the system is largely private, with MDs’ compensation determined, to a large extent, at the discretion of the provider. Even among these three countries, the differences in healthcare systems are substantial, and their influence on QOL remains unclear. Nevertheless, if the “definition of an MD” [21] under the World Medical Association’s Declaration of Geneva is consistent across countries, the elements constituting QOL would be expected to remain largely unchanged. However, if environmental differences are found to influence these elements, it may be necessary to consider approaches such as applying score weighting or developing an index-type questionnaire that allows for international comparisons.
5. Conclusions
This study demonstrated the criterion-related validity of the WQMD-9, a tool for measuring work-related QOL among MDs, and confirmed its reliability, with no notable differences from the results of a previous quantitative survey. However, the fact that criterion-related validity was examined using a VAS is an acknowledged limitation: future studies should also verify validity using established questionnaires that are suitable for each dimension. The profile-type WQMD-9 was specifically designed for Japanese MDs: by enabling more precise visualization of the benefits for this demographic, which has often been overlooked, it represents a highly meaningful achievement as a new evaluation metric.
References
- 1. Sounderajah V., Patel V., Varatharajan L., Harling L., Normahani P., Symons J., Barlow J., Darzi A., Ashrafian H. Are Disruptive Innovations Recognised in the Healthcare Literature? A Systematic Review. BMJ Innov. 2021, 7, 208–216. doi:10.1136/bmjinnov-2020-000424. PMID: 33489312. PMCID: PMC7802637.
- 2. Haug C.J., Drazen J.M. Artificial Intelligence and Machine Learning in Clinical Medicine, 2023. N. Engl. J. Med. 2023, 388, 1201–1208. doi:10.1056/NEJMra2302038. PMID: 36988595.
- 3. Abràmoff M.D., Lavin P.T., Birch M., Shah N., Folk J.C. Pivotal Trial of an Autonomous AI-Based Diagnostic System for Detection of Diabetic Retinopathy in Primary Care Offices. NPJ Digit. Med. 2018, 1, 39. doi:10.1038/s41746-018-0040-6. PMID: 31304320. PMCID: PMC6550188.
- 4. Lång K., Josefsson V., Larsson A.M., Larsson S., Högberg C., Sartor H., Hofvind S., Andersson I., Rosso A. Artificial Intelligence-Supported Screen Reading versus Standard Double Reading in the Mammography Screening with Artificial Intelligence Trial (MASAI): A Clinical Safety Analysis of a Randomised, Controlled, Non-Inferiority, Single-Blinded, Screening Accuracy Study. Lancet Oncol. 2023, 24, 936–944. doi:10.1016/S1470-2045(23)00298-X. PMID: 37541274.
- 5. Ezura M., Sawada K., Takushima Y., Igarashi A., Teng L. A Systematic Review of the Characteristics of Data Assessment Tools to Measure Medical Doctors’ Work-Related Quality of Life. J. Mark. Access Health Policy 2023, 11, 2234139. doi:10.1080/20016689.2023.2234139. PMID: 37496728. PMCID: PMC10367570.
- 6. Nørøxe K.B., Pedersen A.F., Bro F., Vedsted P. Mental well-being and job satisfaction among general practitioners: A nationwide cross-sectional survey in Denmark. BMC Fam. Pract. 2018, 19, 130. doi:10.1186/s12875-018-0809-3. PMID: 30055571. PMCID: PMC6064618.
- 7. Ezura M., Sawada K., Takushima Y., Igarashi A., Teng L. Development of a Work-Related Quality of Life Questionnaire for Medical Doctors (WQMD-9) in Japan: Questionnaire Design and Quantitative Survey. J. Mark. Access Health Policy 2025, 13, 41. doi:10.3390/jmahp13030041. PMID: 40860952. PMCID: PMC12371978.
- 8. Ezura M., Sawada K., Takushima Y., Teng L., Igarashi A. Development of a tool to measure work-related QOL for physicians. J. Health Welf. Stat. 2024, 71, 27–33. (In Japanese)
