The patient-doctor relationship questionnaire (PDRQ-9). An overview of 20 years of research and a proposal for normalisation of scores. Systematic review
Christina Maria Van Der Feltz-Cornelis, Edwin de Beurs

TL;DR
This paper reviews 20 years of research on the PDRQ-9 questionnaire, which measures patient-doctor relationship quality, and suggests a new way to standardize its scores.
Contribution
The paper proposes a novel normalization method for PDRQ-9 scores and provides validated cut-off thresholds for relationship quality.
Findings
The PDRQ-9 has a one-dimensional structure confirmed in most studies across 24 countries.
A T-score normalization method is proposed with cut-off values for challenged, average, and good patient-doctor relationships.
The questionnaire shows strong validity, correlating with patient understanding, shared decision-making, and treatment adherence.
Abstract
The Patient-Doctor Relationship Questionnaire (PDRQ-9) was developed as the first questionnaire to establish the quality of the Patient-Doctor Relationship (PDR) from the patient's perspective. It was published in 2004, has been translated and psychometrically tested in many languages, and is used widely. This study aims to explore the reliability and validity of the PDRQ-9 in various languages and countries, to report norm scores and cut-off scores for an above or below-average relationship, and to propose a novel scoring method for the PDRQ-9 based on the findings. Eligibility criteria: studies should report on the PDRQ-9 item version for inclusion. Studies were excluded if they were not peer-reviewed, did not provide outcome data or analysis for the PDRQ-9, did not use the recommended scoring method for nine items, developed another questionnaire based on the PDRQ-9, were a…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3| Study/Country | Study Population/Clinician | Factor structure indices |
|---|---|---|
| Van der Feltz-Cornelis et al. ( | 110 primary care patients in 5 Practices and 55 epilepsy inpatients in an specialised epilepsy clinic. | Principal component analysis with Varimax rotation of the 15-item version showed two factors. The Eigen value of the first factor was 7.509. This accounted for 58% explained variance. The Eigen value of the second factor was 1.198. This accounted for an additional 9%. The second factor and one item were eliminated, resulting in the nine item version PDRQ-9. |
| Porcerelli et al. ( | 180 adult primary care patients. | A principal components factor analysis with orthogonal varimax rotation was conducted. The results support a single factor solution with an eigenvalue of 7.04 accounting for 78.23% of the variance. No other component reached an eigenvalue of 1.0. Table 2 displays the distributions of each study variable. The mean values for men and women on both patient-rated and physician-rated variables did not differ statistically. Skewness and kurtosis values indicate that the study variables are relatively normally distributed. |
| Aloba et al. ( | 309 outpatients with stable schizophrenia. | Confirmatory factor analysis confirmed the unidimensionality of the factor structure of the PDRQ-9. A one factor model of the PDRQ-9 had satisfactory goodness of fit indices on Confirmatory Factor Analysis (χ2/DF = 1.72; |
| Martín-Fernández et al. ( | 451 patients from six primary care centres in Madrid. | Factor analysis found that a single factor explained 75.3% of the variance and showed very high loadings of all the items, except for the 2nd item. |
| Calderón et al. ( | 560 patients at 15 tertiary referral hospitals in Spain with colorectal (44.3%) or breast (34.3%) cancer. | Confirmatory factor analysis was conducted. Given that the item scores were ordered-categorical, with strongly negative skewed distributions and the sample size was large, as most suitable factor-analysis procedure a non-linear item factor analysis model based on an underlying-variables approach was chosen, and a polychoric correlation matrix was used. The unidimensional model fitted the data well and the values of the standard indices displayed good fit in all cases: SRMS = .02; RMSEA = .05, and CFI = .99. Furthermore, the H value was.98, signifying that the factor was well defined by the 9 items and that the solution was strong and replicable. Finally, the ECV value was.98, indicating that 98% of the common variance of the item scores could be explained by the fitted single factor. So, the solution can be considered as essentially unidimensional. |
| Fernández Castillo et al. ( | 305 patients with T2DM presenting at a general hospital. | A confirmatory factor analysis, using Maximum Likelihood method provided evidence of a good fit of the hypothesized one-factor model. All items presented an acceptable measurement weighs ( |
| Zenger et al. ( | 2,275 persons aged ≥14 who reported consulting with a primary care physician (PCP) in a representative cross-sectional German population survey. | Confirmatory factor analysis demonstrated that the hypothesized unidimensional structure of the PDRQ-9 fit the data very well [χ2 (27) = 345.860; χMIN/df = 12.810; |
| Arafat ( | 50 participants older than 14 years of the Psychiatry Out-Patient Department (OPD) of Dhaka Medical College Hospital (DMCH). | Construct validity was assessed by factor analysis. It showed high commonalities between the items before and after extraction ranging from 0.79 to 0.85. The varimax rotation showed only one component was extracted. The corrected item-total correlation was |
| Wollman et al. ( | 133 adult users of a Primary Health Care Service in Porto Alegre, State of Rio Grande do Sul, Brazil. | Confirmatory factor analysis: All items presented factor loading > 0.5 in the factor analysis. The cross-cultural adaptation of the PDRQ-9 in Brazil replicated the factorial structure found in the original study, with high internal consistency. |
| Wang Y et al. ( | 203 GH inpatients of neurology, gastroenterology, obstetrics and gynecology ward of Peking Union Medical College Hospital, Beijing. | Structural validity: The sample was randomly split in half to perform exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), with IBM SPSS 20.0 and AMOS 27 respectively. Before the EFA was conducted, data suitability and sampling adequacy were checked using the Kaiser-Meyer-Olkin (KMO) value and Bartlett's test of sphericity. During the principal components analysis, factors with an eigenvalue larger than 1 were extracted. A total factor loading of more than 60% was considered as acceptable. Secondly, a confirmatory factor analysis [estimation method = diagonal weighted least square] was carried out. |
| Li et al. ( | 505 outpatients of internet hospitals and physical hospitals in Wuhan. | χ2/df = 3.799, RMR = 0.020, RMSEA = 0.092, GFI = 0.935, CFI = 0.958, TLI = 0.944 and NFI = 0.949 in this study, which supports a one factor model. |
| Maleki et al. ( | 208 patients in primary health centres in Tehran. | Construct validity was evaluated with an Exploratory Factor and a Principle Component Analysis accompanied by a varimax rotation. |
| Deniz et al. ( | 399 Patients from the province of Yalova, Turkey. | Factor loadings range from 0.727 to 0.857. |
| Bener et al. ( | 1,031 inpatients from psychiatric clinics in Istanbul hospitals clinically diagnosed with schizophrenia ( | The exploratory and confirmatory factor analyses confirmed validity and psychometric characteristics of the Turkish version of the 9-item Patient-Doctor Relationship Questionnaire (PDRQ-9) as follows. |
| Study/Country | Study Population/Clinician | Language Translation method | Reliability (internal consistency, test-hertest, or other) | Any validity indices |
|---|---|---|---|---|
| Dutch version | ||||
| Van der Feltz-Cornelis et al. ( | 110 primary care patients in 5 Practices and 55 epilepsy inpatients in an specialised epilepsy clinic. | The questionnaire was developed in Dutch and published in English after forward and backwards translation and independent evaluation by two native speakers. English version was provided. | Internal consistency: Cronbach's | Construct – discriminant validity was supported: The mean PDRQ-9 score was 4.06 (SD.77) in the primary care setting and 3.60 (SD 1.09) in the epilepsy clinic, a significant difference (mean.46, p.001). |
| Hoytema van Konijnenburg et al. ( | 82 parents who underwent the “Amsterdam Protocol” at an outpatient paediatric department. | The primarily used Dutch version ( | Internal consistency: Cronbach's α = .89 | |
| Afaan Oromo Version | ||||
| Biyazin et al. ( | 372 surgical patients at Jimma Medical Center. | Forwards-backwards translation method. | Internal consistency: Cronbach's α = .86 | |
| Arabic version | ||||
| Hegazy et al. ( | 434 Patients in outpatient family health clinics in the Suez Governorate, Egypt. | “The adaptation to Arabic was performed according to the state-of-the-art procedure of forward-backward translation by two medical doctors and one English-Arabic bilingual translator. Two forward translations into Arabic were independently completed by 2 medical doctors, both native speakers of the Arabic language and fluent in English. The 2 Arabic versions were compared, and an updated Arabic forward version was compiled. This version was translated back into English by a professional translator with experience in medical translation. This translator had not been involved in the forward translation. The primarily published version and the back-translated version – both in English – were compared by the medical doctor and the expert translator. Thus, an optimized Arabic version was generated. Additionally, in a final reconciliation process, the final Arabic version (PDRQ-9 Arabic) was generated and approved by all parties. Preliminary changes to the original questionnaire were made after piloting the tools and were reviewed.” | Internal consistency: | |
| Bangla/Bengali version | ||||
| Arafat ( | 50 participants older than 14 years of the Psychiatry Out-Patient Department (OPD) of Dhaka Medical College Hospital (DMCH). | Bengali version | Internal consistency: Cronbach's α = .97. | |
| Chinese version | ||||
| Wang Y et al. ( | 203 GH inpatients of neurology, gastroenterology, obstetrics and gynecology, ward of Peking Union Medical College Hospital, Beijing. | “The questionnaire was translated using a forward-backward translation method including steps of initial translation, synthesis of the translations, back translation, expert committee review, and test of the prefinal version. Five psychiatrists first translated the questionnaire into the Chinese version from the English version and then synthesized into the initial version. Next, this version was back translated into English by a bilingual expert, and then compared with the original English version to identify any discrepancy in meaning. | Reliability analysis: Cronbach's a was used to evaluate the internal consistency of the Chinese version of PDRQ-9 and its subscales. The Pearson correlation coefficient between the first test and the retest was calculated to access the 7-day test-retest reliability. | Construct validity: |
| Li et al. ( | 505 outpatients of internet hospitals and physical hospitals in Wuhan. | Forward-backward translation. | Internal consistency: Cronbach's α = .921. | . |
| English version | ||||
| Egeli & MacMillan ( | 190 patients with Fibromyalgia | The primarily published English version ( | Internal consistency: Cronbach's α = .97. | |
| Porcerelli et al. ( | 180 adult primary care patients. | The primarily published English version ( | Internal consistency: Cronbach's α = .96. | Construct validity: PDRQ-9 ratings were higher from patients who were treated by faculty physicians than from those who were treated by residents ( |
| Aloba et al. ( | 309 outpatients with stable schizophrenia. | The primarily published English version ( | Internal consistency: Cronbach's α = .92. | Construct validity: was supported by modest positive correlations with the Trust in physician scale ( |
| German version | ||||
| Zenger et al. ( | 2,275 persons aged ≥14 who reported consulting with a primary care physician (PCP) in a representative cross-sectional German population survey. | German | Internal consistency: Cronbach's α of the total score = .95. All corrected item-total correlations were ≥.94. | Construct validity was supported: The Spearman rank correlation between the mean satisfaction index and the satisfaction with pain treatment of 489 participants who reported chronic pain and pain treatment was |
| Hebrew version | ||||
| Zolotov et al. ( | Ninety-five patients who were prescribed medical cannabis for cancer, chronic pain or other and received a licence from the Medical Cannabis Unit of the Israeli Ministry of Health (MOH). | Hebrew version. | Internal consistency: Cronbach's α = .803. | Construct validity: With demographic variables controlled, PDRQ-9 score was found to be significantly predicting adherence ( |
| Malay version | ||||
| Johny et al. ( | 906 Patients who have reported using traditional and complementary medicine in Sarawak, Malaysia | Backward translation method with a pilot study for validity. | Internal consistency: Cronbach's α ranged from.617 to.948, indicating adequate internal consistency and reliability. | Content validity: Two biostatisticians expert in the field examined the content validity of the instrument. |
| Maldivian version | ||||
| Hassan et al. ( | 172 patients with essential hypertension. | Forward-backward translation process. | Internal consistency: Cronbach's α = .89 in a pilot study with 30 patients. | Content validity: Face validity, by subjective assessment of whether the questionnaire |
| Persian Version | ||||
| Maleki et al. ( | 208 patients in primary health centres in Tehran. | Forward-backward translation of the English version. | Internal consistency: Cronbach's α ≥ .94. | Content validity: Face validity had an acceptable impact score. Each question had >1.5 impact value. |
| Brazilian (Portuguese) version | ||||
| Wollman et al. ( | 133 adult users of a Primary Health Care Service in Porto Alegre, State of Rio Grande do Sul, Brazil. | Portuguese version. | Internal consistency: Cronbach's α = .94 in the self-report version. | |
| Spanish version | ||||
| Martín-Fernández et al. ( | 451 patients from six primary care centres in Madrid. | Spanish 9 item version, based on Mingote 15 item translation ( | Internal consistency: Cronbach's α = .952. | Construct validity was supported: Age (OR 1.03, 95% CI: 1.02–1.05) was associated with above average PDR index. |
| Calderón et al. ( | 560 patients at 15 tertiary referral hospitals in Spain with colorectal (44.3%) or breast (34.3%) cancer. | Spanish | Internal consistency: Cronbach's α = .97. | Construct validity was supported: Evidence of convergent validity was supported by modest positive correlations with functional ( |
| Fernández Castillo et al. ( | 305 patients with T2DM presenting at a general hospital. | Spanish – Mexican version. | Internal consistency: Cronbach's α = .95. | Construct validity was supported: “This study suggests that this instrument is a valid instrument useful for measuring doctor-patient relationship in patients diagnosed with Type 2 Diabetes Mellitus.” |
| Eiroa-Orosa et al. ( | 1,562 patients in three primary care centres with gastroenterology services in Barcelona. Convenience sample. | The psychometrically tested Spanish version ( | Internal consistency: Cronbach's α = .946 | |
| Thai version | ||||
| Pitanupong et al. ( | 264 University Hospital psychiatry outpatients with depression. | Forward translation from the original English version to the Thai version. | Internal consistency: Cronbach's α = .95. | Content validity of the translated version was evaluated by five psychiatrists, resulting in a CVI score of 0.8. |
| Praha et al. ( | 159 patients with Continuous ambulatory peritoneal dialysis in Thailand. | The psychometrically tested Thai version ( | Internal consistency: Cronbach's α = .95. | |
| Sangngam et al. ( | 240 patients with asthma visiting tertiary hospitals in Thailand. | Back-translation from the original English questionnaire. | Internal consistency: Cronbach's α = .93 | |
| Turkish version | ||||
| Deniz et al. ( | 399 Patients from the province of Yalova, Turkey. | The PDRQ-9 scale that was developed in the original study ( | Internal consistency: Cronbach's α = .932 | Construct validity: “We analyzed the collected data with Smart PLS3 software and performed validity studies and other analyses.” |
| Study | Patient setting | N | M | SD |
|---|---|---|---|---|
|
|
| |||
| Van der Feltz-Cornelis et al. ( | Primary Care patients | 110 | 36.54 | 6.93 |
| Epilepsy clinic inpatients | 55 | 32.40 | 9.81 | |
| Van der Feltz-Cornelis et al. ( | Primary Care patients with somatoform disorders | 81 | 27.67 | 5.98 |
| Du Long et al. ( | Thirty-three orthopaedic surgeons and 172 patients with either knee or hip osteoarthritis in Slotervaart Hospital, Amsterdam. | 172 | Not reported | Not reported |
| Hoytema van Konijnenburg et al. ( | Parents who underwent the “Amsterdam Protocol” at an outpatient paediatric department. | 82 | 38.88 | 6.12 |
| Metz et al. ( | SDM | 94 | 28.35 | 10.35 |
| Control | 106 | 27.18 | 11.88 | |
| Hanssen et al. ( | Primary care patients >60 years with chronic MUS | 107 | 36.84 | 7.98 |
| Primary care patients >60 years with chronic medically explained symptoms | 150 | 39.86 | 7.14 | |
|
|
|
| ||
|
| ||||
|
|
| |||
| Georgopoulou et al. ( | Hospital outpatients with lupus nephritis. | 98 | 36.00 | 8.19 |
| Porcerelli et al. ( | Visiting patients from a suburban university-based primary care clinic. | 180 | 40.67 | 6.64 |
| Aloba et al. ( | Specialty mental health outpatients with stable schizophrenia. | 309 | 32.40 | 7.09 |
| Hageman et al. ( | Orthopaedic Hand and Upper Extremity Service hospital outpatients with carpal tunnel syndrome | 84 | 37.80 | 7.83 |
| Hageman et al. ( | Orthopaedic Hand and Upper Extremity Service hospital outpatients with trigger finger | 84 | 39.60 | 6.03 |
| Gonzalez et al. ( | Patients from five orthopaedic specialist offices in urban areas. | 102 | 40.83 | 7.35 |
| Magidson et al. ( | Patients undergoing HIV treatment in substance abuse centre. | 77 | 39.16 | 7.43 |
| Kirby et al. ( | Parent child pairs, child age 1–21 years, in urban primary care practice in Baltimore. | 218 | 39.8 | 7.30 |
| Alomran et al. ( | Type 2 diabetes patients attending primary health-care centers of Al-Ahsa region in Saudi Arabia. | 366 | Not reported | Not reported |
| Din et al. ( | Surgical and physiotherapy outpatients of 5 orthopedic hospitals (148) | |||
| Orthopedic surg | 73 | 29.08 | 4.21 | |
| Physiotherapists | 75 | 35.61 | 4.61 | |
| Versluijs et al. ( | Initial musculoskeletal specialist surgery outpatient evaluations | 34 | Not reported | Not reported |
| Perche et al. ( | Subjects with a clinical diagnosis of rosacea in a dermatology clinic. | 30 | Not reported | Not reported |
| Egeli & MacMillan ( | Patients with Fibromyalgia | 190 | 24.22 | 10.66 |
|
|
|
| ||
|
| ||||
|
|
| |||
| Martín-Fernández et al. ( | Patients from six primary care centres in Madrid. | 451 | 39.69 | 7.31 |
| Calderón et al. ( | 560 patients at 15 tertiary referral hospitals in Spain with colorectal (44.3%) or breast (34.3%) cancer. | 560 | 42.0 | 5.10 |
| Eiroa-Orosa et al. ( | Patients in primary care centres with gastroenterology services with congruent views with their physician | 1,002 | 41.84 | 4.25 |
| Patients with incongruent views with their physician | 560 | 41.91 | 4.21 | |
| Sarabia-Tapia et al. ( | Patients with Parkinsons disease in neurological hospital clinic. | 100 | 26.10 | 8.70 |
| Pascual – Ramos et al. ( | Outpatients with rheumatic diseases of a rheumatological clinic. | 600 | 37.34 | 8.4 |
| Peña-Valenzuela et al. ( | Patients with systemic arterial hypertension (SAH) from a family practice in Mexico. | 289 | Not reported | Not reported |
| Torres-Reyes et al. ( | Primary care setting | 115 | Not reported | Not reported |
|
| Weighted | SDpooled | ||
|
| 5.28 | |||
|
|
| |||
| Montanaro et al. ( | Patients of the Neuroscience dept. of University of Torino. | 24 | 37.30 | 7.30 |
| Buizza et al. ( | Patients with breast cancer in three oncology clinics in Northern Italy. | 164 (Question Prompt sheet) | 42.67 | 4.10 |
| Lauricella et al. ( | Patients with neuroendocrine tumors during Wave 1 of the COVID pandemic | 197 | 39.24 W1 | 6.84 |
|
|
|
| ||
|
| ||||
|
|
| |||
| Stefanescu et al. ( | Children and adolescents from Romania with Type 1 diabetes. | 55 | 18.96 (pre ACT intervention) | 10.34 |
|
|
| |||
| Zenger et al. ( | A representative sample of the German population was selected with the assistance of a demographic consulting company | 2,266 | 37.08 | 6.30 |
| Hefner et al. ( | Patients treated with capecitabine at the Comprehensive Cancer Center of the University of Wuerzburg, Germany. | 58 | 39.92 | 5.97 |
| Hefner et al. ( | Cancer patients undergoing chemotherapy with capecitabine at outpatient clinics, day hospitals and doctors’ offices of the Comprehensive Cancer Center of the University of Wuerzburg, Germany. | 64 | 39.87 | 5.76 |
| Engler et al. ( | 7 practices participating in a quality of care improving project named “Cardiovascular offensive”, Luebeck. | 54 | 41.85 | 3.87 |
| Markovic et al. ( | A cross-sectional study in TNB individuals, 18 or older, residing in Austria, and previously or currently undergoing medical transition, of whom 33.0% and 25.3% self-identified as trans men and trans women, respectively, and 41.8% as non-binary. | 91 | 33.55 | 8.83 |
|
|
|
| ||
|
| ||||
|
|
| |||
| Zolotov et al. ( | Adherent patients prescribed medical cannabis for cancer, chronic pain/other | 76 | 35.70 | 3.60 |
| Nonadherent | 17 | 30.30 | 4.60 | |
|
|
|
| ||
|
| ||||
|
|
| |||
| Alghabiwi et al. ( | Female primary care patients with chronic disease. | 253 | 25.92 | 5.84 |
| Al-Noumani et al. ( | Participants were recruited from outpatient specialty clinics across ten governorates in Oman. | 800 | 33.14 | 7.55 |
|
|
|
| ||
|
| ||||
|
|
| |||
| Arafat et al. ( | Participants older than 14 years of the Psychiatry Out-Patient Department (OPD) of Dhaka Medical College Hospital (DMCH). | 50 | 28.50 | 10.20 |
| Arafat et al. ( | Outpatients of Dhaka Medical College and some private practices in the city of Dhaka. | 214 | 38.43 | 8.95 |
|
|
|
| ||
|
| ||||
|
|
| |||
| Wollman et al. ( | Adult users of a Primary Health Care Service in Porto Alegre, State of Rio Grande do Sul, Brazil. | 133 | 40.05 | 6.30 |
| Wollmann et al. ( | Primary care sample in Brasil. | 6,160 | 29.79 | 7.29 |
|
|
|
| ||
|
| ||||
|
|
| |||
| Pitanupong et al. ( | University Hospital patients. | 264 | 39.20 | 5.30 |
| Pitanupong et al. ( | Patients in 4 psychiatric outpatient clinics. | 542 | ||
| Praha et al. ( | Patients with Continuous ambulatory peritoneal dialysis in Thailand. | 159 | 37.90 | 4.30 |
| Sangngam et al. ( | Patients with asthma. | 240 | 37.36 | 6.45 |
|
|
|
| ||
| 663 | 38.22 | 5.54 | ||
|
|
| |||
| Li et al. ( | Outpatients of internet hospitals and physical hospitals in Wuhan ( | |||
| Internet hospital users | 130 | 32.71 | 5.72 | |
| Regular hospital users | 375 | 34.17 | 4.85 | |
| Liang et al. ( | Older veterans | 256 | 34.30 | 5.80 |
| Qiao et al. ( | Patients from one provincial and one city-level general public hospital in Hohhot city, the capital of Inner Mongolia Autonomous Region of People's Republic of China | 713 | 31.36 | 7.56 |
| Zhou et al. ( | Online survey participants in the general population who visited a doctor in the last year. Convenience sample. ( | |||
| Before COVID-19 pandemic. | 1,903 | 34.74 | 6.03 | |
| During the COVID-19 pandemic | 37,62 | 4.59 | ||
| Wu et al. ( | Chinese general hospitals outpatients | |||
| High somatic symptom severity | 248 | 38.03 | 6.25 | |
| Low somatic symptom severity | 236 | 37.01 | 6.91 | |
| Xiong et al. ( | Outpatients from ten outpatient departments at Chinese general hospitals ( | |||
| Low distress | 202 | 38.8 | 5.70 | |
| Somatically distressed | 89 | 38.8 | 5.70 | |
| Psychologically distressed | 51 | 35.1 | 7.20 | |
| Mixed distressed | 149 | 36.0 | 7.20 | |
| Wang et al. ( | General Hospital inpatients of neurology, gastroenterology, obstetrics and gynaecology ward of Peking Union Medical College Hospital, Beijing. | 203 | 33.45 | 4.44 |
|
|
|
| ||
|
| ||||
|
|
| |||
| Maleki et al. ( | Patients in primary health centres in Tehran. | 208 | 32.49 | 9.90 |
| Torabipour et al. ( | Patients in community health centres in Ahvaz city | 200 | 28.50 | 7.91 |
| Ghorrabi et al. ( | Cardiology patients | 35.17 | 7.84 | |
| Gynecology patients | 34.39 | 4.62 | ||
| Ophthalmology patients | 36.29 | 6.03 | ||
| ENT patients | 33.00 | 3.31 | ||
| Orthopedic patients | 31.17 | 7.52 | ||
| Internal patients | 34.30 | 7.15 | ||
| Surgical patients | 32.17 | 7.93 | ||
| Dermatology patients | 34.35 | 6.38 | ||
| Patients in other specialised clinics | 33.81 | 6.98 | ||
| Total group | 275 | 33.92 | 6.80 | |
|
|
|
| ||
|
| ||||
|
|
| |||
| Johny et al. ( | Patients who used traditional and complementary medicine | 906 | 30.60 | 5.99 |
|
|
| |||
| Biyazin et al. ( | Surgical patients at Jimma Medical Center. | 372 | ||
|
|
| |||
| Hassan et al. ( | Patients with essential hypertension | 172 | Not provided | Not provided |
| Item | Mweighted | SDpooled | % Not at all | % Somewhat | % Appropriate | % Mostly | % Totally |
|---|---|---|---|---|---|---|---|
| Item 1 | 3.75 | 0.91 | 2.4 | 6.8 | 44.1 | 23.4 | 23.2 |
| Item 2 | 3.47 | 1.02 | 7.0 | 11.4 | 41.5 | 21.0 | 19.0 |
| Item 3 | 3.77 | 0.93 | 2.7 | 6.7 | 41.1 | 24.7 | 25.2 |
| Item 4 | 3.67 | 0.92 | 2.6 | 6.5 | 43.9 | 23.8 | 23.2 |
| Item 5 | 3.70 | 0.93 | 3.0 | 6.8 | 42.3 | 23.0 | 24.8 |
| Item 6 | 3.63 | 0.93 | 3.4 | 6.9 | 46.6 | 21.3 | 21.9 |
| Item 7 | 3.71 | 0.93 | 2.6 | 6.0 | 44.7 | 22.7 | 24.0 |
| Item 8 | 3.74 | 0.88 | 2.4 | 5.9 | 41.1 | 24.4 | 26.1 |
| Item 9 | 3.54 | 0.97 | 6.0 | 10.0 | 40.9 | 20.3 | 22.6 |
| Raw score | Interpretation | |
|---|---|---|
| RS < 23 | T < 30 | Severely challenged |
| 23 ≤ RS ≤ 30 | 30 ≤ T ≤ 39 | Moderately challenged |
| 31 ≤ RS ≤ 34 | 40 ≤ T ≤ 44 | Mildly challenged |
| 35 ≤ RS ≤ 42 | 45≤ T ≤ 56 | Average |
| 42 < RS | 56 < T | Good |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsPatient-Provider Communication in Healthcare · Patient Satisfaction in Healthcare · Cancer survivorship and care
Introduction
The patient-doctor relationship (PDR) has been chiefly explored from the doctor's perspective, that is the primary care practitioner (PCP) or the medical specialist in the general hospital setting. Questionnaires to assess the patient's view on the patient-doctor relationship (PDR) were not available until 2004 when a study describing the development and validation of the Patient-Doctor Relationship Questionnaire (PDRQ-9) was published (1). It explores aspects of the patient-doctor relationship in 9 items. A description of the PDRQ-9 and proposed scoring method can be found in the Supplementary Materials.
This questionnaire was the first one developed to explore the PDR from the patient's perspective in a comprehensive way, as confirmed in a 2012 systematic review of instruments assessing the patient-doctor relationship. It explored aspects such as communication, the consultation process, and trust (2). A difference between the PDRQ-9 and patient satisfaction questionnaires was the decision not to focus on quality of care, technical items, or specific communication patterns. Instead, it focuses on the PCP's helping attitude as experienced by the patient, which was in line with the view that a helping alliance was at the core of a patient's relationship with their doctor. It was initially developed in Dutch, taking the Helping Alliance Questionnaire, that had been translated and psychometrically tested in Dutch, and assesses the therapeutic alliance in psychotherapy, as a starting point (3–5) and adapting it to address specific aspects relevant to a relationship between a patient and their medical doctor. This resulted in a short questionnaire containing nine questions: the PDRQ-9. The psychometric properties of the PDRQ-9 were assessed in collaboration with 110 general practice patients and 55 patients in an Epilepsy Clinic. The first publication on the Dutch PDRQ-9 provided an English translation of the questionnaire, including its instructions, items and response options. This English version was provided after forward and backward translation and discussion, independently, with two native English speakers. A study conducted in the USA, Canada and the UK, two other USA studies, and a study conducted in Nigeria reported on the validity of the English PDRQ-9. Two of those studies also performed confirmatory factor analysis that confirmed the one-factor structure of the English PDRQ-9, as shown in Table 1 (6, 7). This well-studied English version was the starting point for all translations in different languages.
Rationale
Additional research into the psychometric properties of the PDRQ-9 was recommended to investigate its reliability and validity in larger primary care or general hospital samples (1) and compare it to a measure assessing the patient-doctor relationship from the doctor's standpoint, such as the 10-item Difficult Doctor-Patient Relationship Questionnaire (DDPRQ-10) (19). Another recommendation was to explore whether the PDRQ-9 test scores correlate with clinically important outcomes such as higher clinical follow-up rates, patient compliance with treatment recommendations, illness outcomes, or functional status (1). Since its publication, this questionnaire has been adopted widely in the literature on the PDR. It was cited over 1,500 times in Google Scholar and has been translated, psychometrically tested and used globally for research-related evaluations. This abundance of publications on the PDRQ-9 enables us to explore some research questions regarding the abovementioned psychometric properties, and the cross-cultural validity of the PDRQ-9. Cross-cultural validity refers to whether measures that were originally generated in a single culture are applicable, meaningful, and thus equivalent in another culture (20). It has been mainly applied in psychological studies which need to adapt self-reported health status measures for use in languages other than the source language (21).
We aim to provide an overview of the developments with the PDQR-9 over the last 20 years, to summarise the findings, and to illustrate for what purpose the PDRQ-9 can be used. Furthermore, we propose a method for normalising scores and signposting areas of research that might be relevant to explore in the future.
Objectives
1.Explore reliability and validity according to studies of the PDRQ-9 in different languages and countries.2.Explore studies reporting norm and cut-off scores for an above or below-average PDR.3.Propose a novel scoring method for the PDRQ-9 based on the findings.
Methods
Eligibility criteria
Studies presenting data on the PDRQ-9 that used 9-item versions of the PDRQ-9 and the scoring method of those items were eligible for inclusion. Studies were excluded if they were not peer-reviewed, did not provide outcome data for the PDRQ-9, did not use the recommended scoring method or 9 items, developed another questionnaire based on the PDRQ-9, were a protocol publication, or were retracted.
Information sources
We conducted a search for articles in Pubmed, Medline, Web of Science, and Google Scholar, with the last date at the end of October 2025.
Search strategy
We used the terms “patient-doctor relationship questionnaire” OR “PDRQ-9” OR “PDRQ9” OR “PDQR” (as some studies had reversed the letter order of the abbreviation) published between the publication date of the original study (1) in March 2004 up to November 2025. No language restrictions were used. We checked studies for references to find other studies fulfilling the inclusion criteria. In addition, we consulted a database of PDRQ-9-related studies that was put together over the years by one of the authors (CFC). The search string is mentioned at the end of the document.
Selection process
Search and screening were conducted independently and in duplicate by the two authors (CFC or EdB) and an assistant, and in case of doubts, they were discussed until consensus was obtained. We outlined the selection of studies in a Flowchart.
Data collection process
All performed data extraction, and in case of doubt, they discussed it until a consensus was reached. We queried 16 authors for additional data or clarification. Six authors (17, 22–26) responded and provided the requested information.
Data items
We extracted data from the studies exploring countries and medical settings where the studies were conducted, the type of healthcare professionals who were evaluated with the PDRQ-9, patient populations in which the PDRQ-9 was administered, the method of language translation, if the translated version was provided, any validation or reliability outcomes, scoring method, and any outstanding issues relevant for further research mentioned in the publications. We also summarised findings based on the PDRQ-9 in studies where an already psychometrically tested version was used, like Mean, SD, total scores, and cut-off levels*.*
Risk of bias
The authors (CFC, EdB) assessed the risk of bias using an adapted tool for cross-sectional studies (27), focusing primarily on selection (description target group), sampling method (representative or convenience sample), valid measurement of PDRQ-9 outcome, and selective reporting. Both authors discussed the criteria and assessed the studies in duplicate, examining cases of doubt together.
Effect measures
M and SD, cut scores, norm scores.
Synthesis methods
We present psychometric and factor structure findings and total scores, calculating weighted Means and SD__pooled_ over studies. We present an approach to convert raw sum scores to standardised, normalised scores. The most recent guideline on reporting transcultural validity states that it is not only important to consider semantically correct translation and conceptual equivalence. Testing of reliability and construct validity, psychometric analysis and field studies should also be conducted (28). We have structured the findings accordingly. Therefore, Table 2 reports on translation method, such as forward-backward translation (40) and psychometric findings, such as any reliability or validity findings; Table 1 reports on factor analysis, and Table 3 shows field studies with Means and SDs. Weighted mean item scores and SD__pooled_ are then reported in Table 4. Cut-off values for the quality of the patient-doctor relationship based on normalised T-scores are reported in Table 5. Mean scores are presented in Supplementary Table A1 of the Supplemental Materials. A crosswalk table and formulas for converting PDRQ-9 raw scores into normalised T-scores and Percentile scores is presented in Supplementary Table A2 in the Supplemental Materials.
Data preparation
Some authors sum the item responses to obtain a total score on the PDQR-9, while others calculate the mean score of the items. To present scores in the tables on the same metric, we multiplied the latter mean scores and SDs by 9 to obtain a total score. We calculated weighted Mean Scores and SD__pooled_ as an estimate of the SD in the total group, taking the N of the study into account, and produced a forest plot per study, per language group, and for the total group. As the scores on the PDRQ-9 appeared skewed, with most patients rating their doctors positively, we presented an approach to convert raw sum scores to standardised, normalised scores based on a large German representative sample dataset of the general population (11).
Reporting bias assessment
We report the risk of bias in Supplementary Table A3 in the Supplemental Materials.
Certainty assessment
We asked the authors of the three potentially representative samples for an anonymised subsample of their dataset to explore the possibility of converting raw sum scores to normalised scores and received two samples: the German (11) and the Brazilian (Portuguese) (71) one. The German sample was the only representative general population sample and was used to establish a conversion table and formula, as described below.
Norms for the PDRQ-9 with normalised T-scores
PDRQ-9 total scores were first normalised with a percentile rank order-based approach [Rankit normalisation (18, 78, 79) with the RankNorm function in the R RNOmni package version 1.0.1 (80)]. This function first transforms raw scores to percentile ranks, based on the formula PR = (CF -.5F)/N (CF = the cumulative frequency of the raw score; F = - the frequency of the Raw score; N = the size of the normative sample). Next, the percentile rank order scores were converted to standard scores (Z-scores) with the probit function, resulting in Z-scores with M = 1 and SD = 1). These Z-scores were converted to T-scores with T = 10Z + 50 to obtain a more convenient metric with a M = 50 and SD = 10 (81). According to the empirical rule for the normal distribution, 68% of the population will score in the range of T = 40–60, 95% in the range of T = 30–70, and 99.7% in the range of 20–80. Once the T scores have been established, the curvilinear relation between raw scores and normalized T-scores can be estimated with non-linear modelling (the Non-linear Least Squares nls function in the R Stats (version 3.6.2) package and the glsnls package version 1.2.0 (https://cran.r-project.org/web/packages/gslnls/gslnls.pdf). The full procedure is described elsewhere (82). A crosswalk table and formula were established to convert raw PDRQ-9 scores to normalised T scores. We followed the Prisma statement guidelines for systematic reviews and report the PRISMA checklist in the Supplementary Materials (83).
Results
The search resulted in 1,179 studies. Another 103 studies were identified via alternative searches, totalling 1,282 studies that used questionnaires and discussed the PDR. After 48 duplicates were removed, 1,281 studies remained. After screening, a total of 1,168 studies were excluded. 1,058 were excluded as they discussed the PDR but did not report on the PDRQ-9. Fifty-four not peer-reviewed studies were excluded. Fifteen studies (36, 84–97) developed a different questionnaire based on the PDRQ-9. Seventeen studies (2, 98–112) did not or insufficiently provide data on the PDRQ-9. Twelve articles were a study protocol (43, 113–122). Eleven studies (123–133) used a scoring method different from the recommended scoring method for the PDRQ-9. Finally, one publication was retracted due to not having ethics approval for the study (134). This resulted in 66 studies as indicated in the Flowchart (Figure 1).
PRISMA 2020 flow diagram for systematic reviews.
Table 2 provides details of the 25 studies that adapted the PDRQ-9 in 15 different languages and collected psychometric information from 10,321 patients in a range of healthcare settings. The studies reported predominantly on construct and content validity. Reliability was mostly explored by Cronbach's alpha.
Healthcare settings and patient populations
Regarding settings and patient populations, the psychometric studies were conducted with 4,555 primary care patients (1, 6, 8, 11, 13, 16, 17, 31, 34, 35, 37); n = 2,275 of these were from a representative general population sample (11). Eight studies were conducted in 2,338 general hospital outpatients; one in patients with cancer (9), one in patients with cancer or chronic pain receiving medical cannabis (33); one in patients with type 2 Diabetes Mellitus (10); one in patients of internet hospitals (15), one in surgical patients (30); one in orthopaedic patients (22); one in patients with Continuous Ambulatory Peritoneal Dialysis (39) and one in patients with asthma (24). Three studies were conducted with 623 outpatients of specialised mental health institutions (12), one of them specifically for patients with schizophrenia (7) and one for patients with depression (38). Two studies reported on 258 general hospital inpatients: one specifically on neurology, gastroenterology, obstetrics and gynaecology inpatients (14), and one on inpatients in a specialised epilepsy clinic (1). Furthermore, psychometric properties of the PDRQ-9 were evaluated in 82 parents of children presented at a paediatric clinic (29) and in 190 online patients in the general population with fibromyalgia (32).
Construct validity – treatment settings
Construct validity was evaluated by comparing the PDRQ-9 score in different treatment settings. Primary care patients scored significantly better than epilepsy inpatients of a specialised epilepsy clinic (mean difference = .46, p = .001) (1). Primary care patients who were treated by faculty physicians scored higher than those who were treated by resident family doctors (p = .01) (6). There was a correlation between the PDRQ-9 score and duration of treatment (rpmcc = 0.114, p = 0.044) and number of previous hospitalizations (r_pmcc_ = 0.179, p = 0.002), and a negative correlation with the Morisky Adherence Scale (r_pmcc_ = 0.174, p = 0.002) in outpatients with stable schizophrenia (7) and the PDRQ-9 score was found to be significantly predicting adherence to medical cannabis in patients with chronic pain visiting a clinic for medical cannabis prescription (p < 0.01) (33).
Construct validity – comparison with other questionnaires
Several studies compared the PDRQ-9 score with another questionnaire for the PDR, such as the Difficult Doctor-Patient Relationship Questionnaire (DDPRQ-10), a scale that assesses the doctor's view of the PDR, with which the PDRQ-9 score was correlated negatively (r = −.22, p = .003) (6); the Trust in physician scale (r = 0.413, p < 0.001), with which it correlates positively (7); and satisfaction with pain treatment (r = .51) (11). These correlations suggest moderate, acceptable construct validity.
Reliability
All studies found the questionnaire reliable, with Cronbach's alpha ranging from 0.62 to 0.97 (mean 0.91). Test-retest reliability ranged from 0.61 over two months to 0.74 after seven days.
Factor structure
Table 1 provides factor structure findings of 14 studies, including 11 of the validation studies (1, 7–17, 29, 135).
All studies confirmed the one-factor model of the PDRQ-9 except one study, which found a two-factor solution for the Chinese version. The first factor reflects empathy and trust experienced by the patient during treatment, and the second factor represents the patients' appraisal of the process and results of their treatment (14).
Field studies with total scores
Table 3 provides the total scores of the PDRQ-9 and other findings from 62 publications describing field studies conducted with psychometrically tested versions of the PDRQ-9. They involved 25,865 patients from 24 countries: Austria, Bangladesh, Brazil, Canada, China, Germany, Ethiopia, Iran, Israel, Italy, Malaysia, the Maldives, Mexico, Nigeria, Oman, Romania, Saudi-Arabia, Spain, Taiwan, Thailand, the Netherlands, Turkey, the United Kingdom and the USA. All studies conducted are shown; in case studies do not show total scores, weighted Mean and SDs are reported.
Most studies were based on convenience samples in clinical settings and one Chinese study reporting on a convenience sample derived from an online survey conducted via internet and social media (75). In addition, some studies were performed in representative samples. Four publications covered one representative general population sample in Germany (11, 61–63), one study used a systematic random sampling method resulting in 372 surgical patients in Ethiopia (30) and one Brazilian study was nested within a national cross-sectional study conducted to assess the quality of care provided by the Brazilian Primary Health Care (PHC) system, with units being selected in each municipality using a systematic sampling methodology (71).
Language versions and countries where the studies were conducted
Six studies in 957 patients were conducted with the Dutch version (1, 29, 41–44). Five studies of those reported total scores in 785 patients. Thirteen studies in 1,920 patients were conducted with the English version: eight in the USA (6, 22, 46–49, 52, 53); one in the UK; (45) in Canada, the USA and the UK (32); in Nigeria (7), Saudi Arabia (50), and India (51). Ten studies of those reported total scores in 1,490 patients. Seven studies were conducted in 3,677 patients with the Spanish version; six studies used the version psychometrically tested by Martín-Fernández et al. (2,010) (8) and were conducted in Spain (8, 9, 37) and in Mexico (23, 54, 55). One study (56) used the Mexican Spanish translation (10). Five studies of those reported total scores in 2,788 patients. Three Italian studies in 545 patients used the Italian version (57–59). The Romanian version was used in a study conducted in Romania in 55 children and adolescents (60). Eight studies encompassing 2,542 patients used the German version. Four of them reported on the same German representative general population sample rating their primary care doctor (11, 61–63); three more studies were conducted in other German samples (64–66) and one study was conducted in Austria (67). Five studies of those reported total scores in 2,542 patients. One study on the Hebrew version was performed in 93 patients in Israel (33). Two studies with the Arabic version were performed in 1,053 patients in Saudi Arabia (68) and Oman (69). The Persian version was conducted in three studies in 683 patients in Iran (16, 76, 77). Two studies reporting on the Bangla version in 264 patients were performed in Bangladesh (32, 70). The Thai version was used in four studies conducted in 1,205 patients in Thailand (14, 24, 62, 72). Two studies of those reported total scores in 423 patients. The Malay version was used in one study conducted in 906 patients in Malaysia (13). And the Maldivian version was used in the Maldives in 172 patients (9). The Chinese version was used in seven studies in 4,555 patients: five conducted in China (14, 15, 25, 26, 75), one in the Inner Mongolia Autonomous Region of People's Republic of China (74) and one in Taiwan (73). The Afaan Oroma version was used in one study in 372 patients in Ethiopia (30). Two studies on the Portuguese version were conducted in 6,921 patients in Brazil (15, 71).
For 23,343 patients from 50 publications describing 63 samples in 14 languages, the overall weighted mean for the total score on the PDRQ-9 was M_weighted_ = 34.21, SD__pooled_ = 6.70. A forest plot based on total scores involving those 63 samples is shown in Figure 2.
Forest plot with weighted means and pooled DSs for the total score per study grouped per language/country.
As can be seen in Figure 2, the vertical line in the middle is the weighted total mean of all studies. The N per language is reported based on studies reporting an M and SD. The vertical striped lines on the left and on the right represent the SD limits of the weighted total means. The plot is arranged per country and the first dot per country represents the total for that country. As can be seen in the figure, the weighted total mean for the Dutch, English, Hebrew and Chinese versions correspond with the weighted total mean of all studies. The Spanish, Italian, Bangla and Thai version have a higher weighted total mean score. The other countries have a lower weighted total mean score.
Ceiling effect
Many studies found that the PDRQ-9 presented discrimination problems in the upper part of the scale (8, 9, 11, 23, 38, 56, 61–65, 72). They found a clear left-skewed distribution, indicating that most values were concentrated on the right of the mean. This reveals a strong tendency to use the highest response options of the PDRQ-9 items and rate one's doctor positively. This led to skewed item scores and a skewed instrument's sum score. The latter is shown in Supplementary Figure A1 in the Supplementary Materials with a histogram for PDRQ-9 total scores of the German sample (11). The left side of Supplementary Figure A1 shows the distribution of PDRQ-9 total scores. Based on Rankit normalisation we converted the PDRQ-9 total score to normalised T-scores, whose distribution is shown on the right side in Supplementary Figure A1.
The relation between raw scores on the PDRQ-9 and T-scores under the normal distribution is shown in Figure 3.
Normal distribution with Z-scores, T-scores, and raw scores on the PDRQ-9 total scale.
The resulting PDRQ-9T-score metric is especially apt at distinguishing severity levels among the mildly to severely challenged PDRs in the lower realm.
Item scores
19 Studies provided mean item scores of 14,486 patients (1, 7–9, 11, 12, 15, 16, 49–51, 54, 68, 70, 71, 73, 75–77) and 6 studies provided frequencies of response item options (1, 11, 50, 51, 70, 71). An overview of weighted mean item scores and SD__pooled_ is shown in Table 4.
The table shows the skewed representation of the PDRQ-9 item scores. Higher response options (“Appropriate”, “Mostly”, and “Totally”) were more frequently chosen than lower response options (“Not at all” and “Somewhat”). Furthermore, Table 4 shows data on the frequency of endorsed response options for a total of 14,697 patients as provided by twenty-one studies. (1, 7–9, 11, 12, 15, 16, 41, 49–51, 54, 68, 70–73, 75–77).
The weighted mean scores and pooled SD's for items are shown in Supplementary Table A1.
Cut-off scores
Several studies suggested single cut-off values for a good PDR: 31 in Brazil (13, 71); >33.57 in Mexico (23); >34.5 in Ethiopia (30); and ≥39.69 in Spain (8), which is a much higher value than the others.
Other studies suggested cut-off values for distinct levels of severity, defining good and low PDR scores as 29 and ≤22 in Mexico (55); as ≥43.4 and ≤30.4 in the USA (53); as ≥36; 18–35 (moderate PDR); and ≤17 in Saudi Arabia (50) and Thailand (38, 72); as >36, 22–35 (moderate PDR) and ≤21 in Iran (77); ≥33, 21–32 (moderate) and ≤20 in the Maldives (35).
One study, conducted in Iran, suggested four levels with cut-off values: ≥38 for excellent “communication skill”, 29–38 for good, 19–28 for moderate, and 9–18 for weak “communication skill” (76).
Based on these reported cut-off scores and given the ceiling effect in the data of the studies, resulting in skewed sum scores on the instrument, distinguishing levels above average and providing a cut-off score is hard to do. Therefore, we provided norms that are especially useful for distinguishing levels below average or challenged PDRs. To this effect, we transformed raw scores from the German general population sample (11) to normalised T-scores and present these for all possible raw scores in the supplementary materials (Supplementary Table A2). This is a crosswalk table to convert raw PDRQ-9 scores to percentile rank order scores, normalised T-scores, and various levels of a challenged PDR. These T scores can also be established with a conversion formula (T = 0.61 + 1.885RS-0.03865RS^2^ + 0.0006151*RS^3^, a 3rd-grade polynomial), have a mean of M = 50.0 and SD = 10.0, and are equidistant. Scores below 30 (raw score <23) indicate an exceptionally challenged relation, reported by <2% of the general population. The Supplementary Table A2 presents a crosswalk table for PDRQ-9 raw scores to T-scores and Percentile Rank order (PR) scores.
Based on these normalised T-scores we recommend the following cut-off values as shown in Table 5: A cut-off value of T ≤ 44 for a challenged PDR, 45 ≤ T ≤ 56 for an average PDR, and T > 56 for a good PDR.
The studies give a range of cut-off scores for severely challenged PDR ranging from 17.00 to 39.69. The relation between raw scores, percentile ranks, and T-scores is also depicted in Supplementary Figure A2. T-scores are on the x-axis, percentile rank-order scores are on the Y-axis, and raw scores are shown in the figure with a white background on the sigmoid curve. Scores in the green area (RS ≥ 35) represent average to good PDRs, and scores in the red area (RS ≤ 30) indicate challenged PDRs. The figure shows how the raw scores in the white squares correspond to T-scores on the X-axis and percentile scores on the Y-axis. The red area indicates challenged PDR scores.
The scoring instructions and formula for conversion of raw scores to standardised, normalised scores is included in the Dutch and English version of the PDRQ-9 in the Supplementary Materials.
Several studies explored challenged PDRs in the following groups as indicated by lower total scores.
Medically unexplained symptoms
In terms of medically unexplained symptoms, one study in primary care found that MUS patients reported significantly higher levels of loneliness and a lower quality of the PDR than older patients with explained medical symptoms (p = 0.002) (46). When taking into account comorbid depressive and anxiety disorders, only the association between MUS and the quality of the PDR remained statistically significant. The effect size of the association was small, with a beta of 0.179–0.191. p = 0.005–0.009. A Chinese study found that patients with mixed somatic and psychological distress had the worst PDR (M = 36.0; SD = 7.3). In addition, compared with patients with somatic distress alone, those with psychological distress alone had a worse PDR (73). Lower than average scores were found in fibromyalgia patients (32), and in primary care patients with persistent unexplained symptoms in primary care with high rates of comorbid anxiety and depressive disorders (41).
Mental health conditions
Regarding mental health conditions, average PDRQ-9 scores were lower in a study in the Netherlands in the specialised mental health setting (43). The PDRQ-9 was found to be negatively correlated with the PHQ-9 (r = −0.196 to −0.309) and there were significant differences in PDRQ-9 ratings between general hospital in-patients with or without depressive symptoms (p = 0.019) in a Chinese study. This correlation was independent of age (14). Modest positive correlations were found of the PDRQ-9 with functional (p < .001) and global quality-of-life (p < .001), and negative correlations of the PDRQ-9 with psychological distress (p < .001) in oncology patients receiving treatment in specialist general hospital centres. Also, low PDRQ-9 scores for the PDR with the oncologist were associated with more anxiety (p = .006) and depression (p = .004) (9). In a general population sample, adjusted for age, the mean PDRQ-9 score of participants with a potential depressive disorder was M = 3.66 (SD = .86), and that of participants without a potential depressive disorder was M = 4.12 (SD = .66) (F = 65.8, p.001), a difference with a small effect size (Partial eta2 = .05). Experiencing somatic and depressive symptoms emerged as the most relevant patient-related predictors of the quality of the PDR with the family physician (11).
Treatment factors
In terms of factors influencing treatment, in a Dutch study, the PDRQ-9 rating of the quality of the PDR was the only variable among patients' characteristics which was significantly and positively correlated (Pearson rho = −0.49; P < 0.01) with decisional conflict (42). Two other studies corroborated that the only factor associated with decisional conflict was the PDR (46, 47). A good understanding of the patients' illness was linked to a better relationship with their doctor and greater participation in shared decision making which was associated with increased trust and adherence to medication in another study (45). A study in psychiatric patients of Hospitals in Istanbul found a correlation between irregular attendance of the mental health services, and lower PDR (135). A Spanish study found a positive association (r = 0.38); p < 0.001) between consultation time and PDRQ-9 score: for every minute increase in consultation time, the PDRQ-9 score increased in 0.04 units (p = 0.001, CI95%: 0.016–0.063) (56).
PDRQ-9 scores were positively correlated with medication adherence rates for 3 months of treatment (r = 0.52; P = 0.006). Subjects in a US study who perceived a weaker PDR (less than or equal to 36, n = 10) were less adherent over 3 months, with an average adherence rate of 45%, compared with subjects who perceived a stronger PDR (greater than or equal to 37, n = 17), with an average adherence rate of 70% (P = 0.03). Adherence did not vary by age or gender (P = 0.59 and 0.51, respectively) (53). Several other studies found a positive correlation between PDRQ-9 score and medication adherence (35, 69). Regarding online treatments, internet hospital users who received online consultations rather than face to face gave significantly lower scores than regular hospital users (P = .01) (15).
Gender
A Spanish study found that men had significantly higher PDRQ-9 scores than women (9). In Saudi Arabia, a female sample with chronic disease had an overall level of the PDR that was less than moderate. The PDR had a significant influence on the patients' self**-efficacy in self-**managing their chronic diseases (p = 0.047) (68). A study involving trans and non-binary patients found that non-binary participants showed a lower score (M = 30.79; SD = 7.46) than trans women (M = 34.78; SD = 9.17) and trans men (M = 36.10; SD = 9.45), a significant difference in means (F = 3.52, df = 2, p = 0.034) (67). The gender of the physicians was reported in none of these studies.
Age
The number of studies exploring the role of age in the PDR was limited. One study in primary care found no significant correlation of the PDR with age, health, or psychological distress (6); another study found higher age (OR 1.03, 95% CI: 1.02–1.05) was associated with above average PDR (8). However, given the OR of 1.03, this effect can be considered to be clinically irrelevant.
Discussion
Over the last 20 years, 66 studies used the PDRQ-9. Twenty-five studies translated and psychometrically tested the PDRQ-9 in 15 languages and 10,321 patients in healthcare settings ranging from primary care to outpatient and inpatient general hospital care, including specialty mental healthcare, and patient populations, providing psychometric information.
In general, the main psychometric aspects to be evaluated, as listed by the COSMIN guideline are: Content and Construct Validity, Structural Validity, Cross-cultural Validity, Responsivity, Test-retest Reliability, Internal Consistency Reliability and standard error of measurement (136). Content and Construct Validity of the PDRQ-9 is supported by the original Dutch study and the research with the English translation. Structural Validity, that is, factorial invariance between subgroups, was supported by five studies (9, 11, 14, 26, 31). Criterion validity, e.g., whether patients with lower PDRQ-9 scores have lower compliance with treatment, or less adherence to medication, is supported by sixteen studies (7, 17, 33, 35, 38, 45, 53, 55, 64, 69, 74, 96, 102, 103, 129, 135). Cross-cultural validity was the prime research question in all studies investigating translation of the PDRQ and was mostly supported by replicating the positive psychometric properties of the Dutch original. In all studies that explored the factor structure but one, the PDRQ-9 is reported to consist of one factor, and the internal consistency reliability is excellent.
In view of the requirements suggested in the COSMIN guideline (136), current gaps in the literature regarding psychometric research of the PDRQ are test-retest reliability that is only reported by five studies (1, 6, 7, 9, 14); and responsiveness, indicating if the score changes after an intervention aimed at improving the relationship, which is reported by one study (43). These gaps could be welcome topics for future research.
Although the PDRQ-9 predates the distinction between Patient-Reported Outcome Measures (PROMs) and Patient-Reported Experience Measures (PREMs), it is best considered a PREM, as it was specifically developed to assess the PDR as experienced and reported by the patient, which is a key aspect of the patient experience. It was conceptualised as a reflective measurement model in which all items are a manifestation of the same underlying construct, namely the PDR (136).
Cross cultural validity of the different language versions
Based on Tables 1–3, the following can be resumed regarding cross-cultural validity of the various language versions of the PDRQ-9.
The original Dutch language version's validity was supported by psychometric analyses, its factor structure reported, and means and SDs reported in five out of six field studies. The English version was produced by forward – backward translation and consultation of native speakers; and psychometrically tested in USA, Canadian, UK and Nigerian studies. The one-factor structure of this English version was confirmed by confirmatory factor analysis performed in two of those studies, one in the USA and one in Nigeria (6, 7). This version was used in 11 field studies, 10 of which reported means and SDs. The original published English version can therefore be considered as extensively cross culturally psychometrically tested.
The German version was developed by forward backward translation and applied in a representative sample from the general population. Internal consistency and construct validity were explored and a CFA performed, which confirmed a one-factor structure. Eight field studies were executed, four of them in the general population sample and four in convenience samples, reporting means and SDs. Furthermore, the German version was at the basis of one of the Chinese language versions that was used in two studies, after an extensive translation and psychometric testing procedure (25, 26).
The Spanish version was developed in Spain, with three studies reporting reliability, two reporting validity and factor analysis. One Mexican Spanish version was also developed and investigated with CFA. Three field studies were performed in Spain and four in Mexico, reporting means and SDs.
The Thai version was developed by forward backward translation and CFA reported. Four field studies were done with three reporting means and SD. The Bengali version was developed by forward backward translation in Bangladesh. Factor analysis confirmed one component. Two field studies reported total means and SDs. The Persian version used forward backward translation, performed factor analysis and three field studies reported means and SDs. The Brazilian Portuguese version performed forward and backward translation and expert consultation, performed factor analysis and two field studies reporting means and SDs. The Arab version was made by forward-backward translation. The study was performed in Egypt and reported Cronbach's alpha but no factor analysis (31). Two field studies reported means and SDs, one in Saudi Arabia and one in Oman (68, 69).
Three Chinese versions have been used where it was unclear how these were translated (73–75). Li et al. reported translation method, Cronbach's alpha and factor structure indices aligning with a one-factor structure (15). Wang et al. developed a Chinese version by an extensive forward backward translation process and found support for a two factor structure with highly correlated factors (14). The authors note that their study is the only one reporting two factors and suggest that this might be related to a less individualistic culture in China compared to other countries, which may make Chinese people more apt to not only consider the actions of their doctor in the appraisal of the PDR, but also the contextual circumstances provided by the hospital. This study is the only one showing a cross-cultural difference in factor structure in all the studies performed with the PDRQ-9 so far, differentiating that Chinese version from the Li study. Seven field studies reported means and SDs: all in China, one in Taiwan and one in Mongolia, two of which were developed in collaboration with a German research group (25, 26). Those authors confirmed in writing that they followed an extensive translation process and that CFA confirmed a one-dimensional factor structure and high reliability, which was however only partly published. The remaining studies (73–75) used translations with unknown factor structure.
The following language versions may need further validation and field studies to be conclusive: The Turkish version was developed by expert involvement and CFA reported; another study also reported EFA. However, no field studies reporting means and SDs were done. The Ethiopian study developed an Afaan Oromo Version by forward backward translation and explored internal consistency, but did not report factor analysis or a field study with total means and SDs (30). The Hebrew version used forward and backward translation and reported construct validity and reliability, and one field study with means and SDs. The Malay and the Maldivian versions do not provide results regarding the factor structure. Three studies report on means and SDs of an Italian version of which development, validity and factor structure is not described; one Romanian study also describes means and SDs but no other information regarding development and validity of the Romanian version was provided.
In summary, this shows extensive cross-cultural validity for the factor structure in 9 language versions, and 7 language versions that are well underway and for which more work may be needed. In that respect, it is relevant to mention that an officially translated Italian version is currently developed in a study funded by the European Union. The picture emanating from this shows good cross cultural validity. In total, studies on the PDRQ-9 encompass four out of five continents with overwhelmingly similar structure and results.
Based on our findings, it seems that one-dimensionality of the PDRQ-9 is cross-culturally supported. One Chinese study does not support that, however, the correlation between the two factors that they find is very high (14). The other Chinese studies either do report a one-factor structure (15, 25, 26), or do not report a factor structure at all (73–75).
Another aspect that warrants attention is the variability in mean scores across studies. These differences may be attributable to variations in study populations (e.g., general population vs. clinical samples), characteristics of health care systems, translation-related issues, and broader cultural factors. Given the present data, it is difficult to disentangle the relative contributions of these factors. Therefore, norming based on representative samples from the general population is recommended.
Total scores of the PDRQ-9 are provided in 50 publications describing studies conducted in 63 samples, 23,343 patients and 24 countries, using psychometrically tested versions of the PDRQ-9. The overall weighted mean for the total score on the PDRQ-9 was Mweighted = 34.21, SD_pooled = 6.70. The weighted mean average varies between countries, with scores higher than average for studies conducted in the UK and USA, the Netherlands, Germany, Spain, the Northern part of Italy, and Thailand. This may reflect good health services in those regions compared to means in Hebrew, Chinese, Persian, and Romanian studies, which corresponded more to the weighted mean of the entire set of studies. Studies with the Arab, Romanian, Brazilian (Portuguese), Mexican and Malay versions yielded lower weighted mean scores. There are several possible explanations for this.
First, although most studies were performed with thoroughly translated and psychometrically tested language versions, the weighted total means scores of the Arab version seem to be afflicted by one study that was performed before the rigorously developed Arab version was published. The mean score reported in that study was substantially lower (68). This may be related to the quality of that translation, or with the population of patients with chronic medical conditions. The same may apply to the Romanian study, as it has an unclear translation process; however, given that the Romanian version clearly shows improvement in PDRQ-9 score after an intervention actually underscores the validity of this language version (60).
Second, it could reflect variations in health care service. For example, the M and SDs in the Brazilian sample are much lower than in other studies using the PDRQ-9. Methodological factors could play a role given that in this rural sample, comprising many illiterate patients from impoverished neighbourhoods, the PDRQ-9 was administered by interview, which might make people lean more toward average answers rather than at the high end of the scale, as in other studies that used the PDRQ-9 as a self-report measure. It is good to know that the interview method of administration allows for the valid appraisal of the PDR in deprived areas, but this difference in assessment may have influenced the completion of the PDR. Furthermore, relevant cultural and socio-economic factors could be that this sample contained many people who were illiterate and from impoverished neighbourhoods. The potential influence of such aspects of the primary healthcare system under study in Brazil on the PDR is captured effectively by the PDRQ-9.
However, it could also indicate PDR differences among patient populations. The findings support the known group validity of the PDRQ-9. Average scores tend to be lower in patients with chronic medical conditions (68) including Parkinsons disease (54). For example, the lower scores for the Arab version and the Mexican version might be partially explained by that.
Medically unexplained symptoms (MUS) and comorbid depressive and anxiety disorders, including fibromyalgia patients, are also found to be associated with lower scores (25, 32, 41, 44).
Dissatisfaction with the PDR in these patient groups is a long-standing issue, with many studies initially reporting on this from the perspective of doctors coining such patients as “worried well”, “difficult”, or “heartsink” patients (137–139). Most interventions in these patients would follow a cognitive behavioural treatment approach (140) which would be effective in patients willing and able to follow this treatment. However, studies show that overall the effect size of this approach tends to be small (141) and CBT provision as primary treatment was met over the years with growing discontent of patient groups, with a more recent development being studies with patient-input leading to redefining MUS as medically not yet explained symptoms (MNYES) (142) and emphasizing the need to study aetiology of syndromes, such as for example in chronic fatigue syndrome (143). Given this, the finding that the PDRQ-9 total score is lower than average in several studies in patients with MUS underscores the validity of this questionnaire and the relevance to explore the patient perspective of the PDR in the context of research. This may be supported by the finding of the lower score in the Malay study conducted in people visiting complementary medicine providers, possibly because of dissatisfaction with their regular healthcare providers (34).
Regarding patients with mental health issues, there seems to be a clear effect on the PDR in case of depression or anxiety, especially for patients who receive treatment from medical specialists in the general hospital, for example oncology patients (9); in specialty mental health patients with schizophrenia in Nigeria (7), and in patients with somatoform disorders in the Netherlands (41).
Implications of the study
Our findings show that PDR is considered a relevant factor for patient well-being, compliance with treatment, and healthcare delivery in various countries, healthcare settings, and medical specialisations. The PDRQ-9 is especially apt in identifying a challenged PDR, which is the most relevant domain, making it a valuable tool for improving this critical aspect of care. We recommend reporting total scores rather than item scores and applying the normalised T-scores in future studies of the PDRQ-9 and in clinical practice. It could be beneficial to use the PDRQ-9 to screen for potential issues in the PDR in patient groups struggling with unexplained symptoms or mental health issues. This is more relevant as the PDRQ-9 score is associated with other factors influencing treatment, such as decisional conflict, medication adherence, and patients' self-efficacy in managing their chronic disease. The PDRQ-9 can also discern the quality of the PDR in online treatment settings (15) which indicates potential usefulness in evaluating the PDR as experienced by patients in e-Health interventions with the PDRQ-9.
Gender matters for how the PDRQ-9 is completed: Men rate their PDR significantly higher than women. In addition, non-binary participants have lower scores than trans women and trans men, who score average mean levels. However, so far, studies exploring gender have not taken the gender of doctors into account, which we recommend for future studies. Research should also explore which other doctor characteristics are associated with the PDRQ-9 score.
We presented a standardised scoring approach based on a large representative general population sample. This standardisation produces a normalised score that makes the raw scores that are cluttered due to the ceiling effect interpretable as it normalises the scale to equidistant scores. We present this as an example of how raw scores can be normalised. We suggest to apply this standardisation normalisation method of total scores, and the reported cut-scores, when using the PDRQ-9.
Furthermore, we recommend to perform general population studies to confirm norm scores in countries with mean scores diverting from the mean as displayed in Figure 2. New language versions of the PDRQ-9 should preferably be developed using the forward backwards translation method, and report on psychometric aspects including the factor structure.
Limitations of the study
Limitations of this study are that there was only one representative sample of the general population. Most studies used convenience samples, although they reported on relevant clinical samples. Few studies have reported on gender differences among patients or potential relevant physician factors affecting PDR, such as educational levels, years of experience, gender, and age.
Although the single-factor structure is supported in most primary studies included in this systematic review, a direct cross-cultural comparison or Differential Item Functioning (DIF) analysis of configural, metric, and scalar factorial Invariance among translated versions for different nations or cultures is lacking. Only one of the primary studies assessed configural, metric and scalar DIF within one dataset/country (14), but none have done this across nations or cultures. We recommend inclusion of the PDQR-9 in future cross-country surveys such as the People Living with Chronic Conditions study (PaRIS) (144, 145), the Commonwealth Fund's International Health Policy survey (146), or other international initiatives, as this would secure representative data in many countries, and provide a basis for further research into normative data and generalizable results.
Another limitation concerned the utilization of the Risk of Bias tool. Assessing cross-sectional studies involves unique Risk of Bias considerations, and a recent review found that none of the currently available tools comprehensively address all potential biases pertinent to cross-sectional studies (27). Following the recommendations from that review, we developed a tool specifically for this systematic review. Biases that we should address were representativeness of the sampling frame, non-sampling error from nonresponse or coverage, and whether the outcome, disease or characteristic under study was measured validly and reliably, which in this case was operationalised by the level of validation of the respective translated versions of the PDRQ-9 in the selected studies. This allowed us to establish that Risk of Bias was low in most studies. However, risk of bias was moderate in studies with the Italian, Thai and some Arab versions, which limits the generalisability of the results for those language versions. Nevertheless, all studies show similar results.
Furthermore, there is a lack of uniformity in how the PDRQ-9 is scored: most researchers calculate the recommended sum score, but some calculate the mean score of the nine items. We recommend reporting the sum score in the future to avoid this inconsistency. Finally, the PDRQ-9 has a substantial ceiling effect, translating into a diminished capacity to discriminate among patients who score high on the PDRQ-9. Cut-off scores, therefore, can best be made for lower scores reflecting challenged PDRs and based on normalised T-scores. We recommend applying the normalised T-scores in clinical use and future studies of the PDRQ-9.
Strengths of the study
The strengths of this study are the large number of studies, the large total sample, and the diversity of the samples regarding cultural background, with four out of five continents of the world represented. The proposed normalisation of scores is innovative, and the use of normalised sum scores enhances the interpretability of PDRQ-9 sum scores even in case of ceiling effects.
Conclusion
The PDRQ-9 is a brief tool that effectively assesses the quality of the doctor-patient relationship from the perspective of the patient. Its factor structure has been reported across cultures, it is easy to administer, and shows strong reliability. Its scores are linked to better treatment adherence and outcomes in a variety of medical settings and patient populations. It is used worldwide and clearly fulfils a need. Studies were performed in primary care and various specialist general healthcare and mental healthcare settings in up to 24 countries. Construct validity and criterion validity are well supported, with correlations between the PDRQ-9 score, a good understanding of the patient's illness, greater participation in shared decision-making, and treatment adherence. Ceiling effect was reported. However, the PDRQ-9 provides a good measure of the PDR when total scores are transformed to a metric with a normal distribution, yielding helpful information, especially when the PDR is challenged. Transformed, normalised total scores should be reported preferably over mean scores given the provided cut-off score levels and for consistency in the international literature. Given the importance of the PDR in health services, the PDRQ-9 is a highly relevant measure. Research exploring doctor's factors playing a role in the PDR is recommended.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Van der Feltz-Cornelis CM Van Oppen P Van Marwijk HW De Beurs E Van Dyck R. A patient-doctor relationship questionnaire (PDRQ-9) in primary care: development and psychometric evaluation. Gen Hosp Psychiatry. (2004) 26(2):115–20. 10.1016/j.genhosppsych.2003.08.01015038928 · doi ↗ · pubmed ↗
- 2Eveleigh RM Muskens E van Ravesteijn H van Dijk I van Rijswijk E Lucassen P. An overview of 19 instruments assessing the doctor-patient relationship: different models or concepts are used. J Clin Epidemiol. (2012) 65(1):10–5. 10.1016/j.jclinepi.2011.05.01122118265 · doi ↗ · pubmed ↗
- 3Alexander LB Luborsky L Greenberg LS Pinsof WM. The penn helping alliance scales. In: Greenberg L Pinsof W, editors. The Psychotherapeutic Process: A Research Handbook. New York: The Guilford Press (1986). p. 325–66.
- 4Horvath A Gaston L Luborsky L. The therapeutic alliance and its measures. In: Miller NE Barber JP Docherty JP, editors. Psychodynamic Treatment Research: A Handbook for Clinical Practice. New York: Basic Books (1993). p. 247–73.
- 5Van der Linden MHM. De verstandhouding tussen cliënt en therapeut: een empirisch onderzoek naar de kwaliteit van de therapeutische werkrelatie en effect bij psychodynamische psychotherapie. Academisch proefschrift, universiteit van Amsterdam. (the relationship between therapist and client: an empirical investigation of the quality of the therapeutic relationship and the effect of psychodynamic psychotherapy. Academic dissertation) (1994).
- 6Porcerelli JH Murdoch W Morris P Fowler S. The patient–doctor relationship questionnaire (PDRQ-9) in primary care: a validity study. J Clin Psychol Med Settings. (2014) 21(3):291–6. 10.1007/s 10880-014-9407-225106125 · doi ↗ · pubmed ↗
- 7Aloba O Esan O Alimi T. Psychometric qualities of the 9 item patient doctor relationship questionnaire in stable Nigerian patients with schizophrenia. J Behav Health. (2015) 4(4):112. 10.5455/jbh.188856 · doi ↗
- 8Martín-Fernández J del Cura-González MI Gómez-Gascón T Fernández-López E Pajares-Carabajal G Moreno-Jiménez B. Satisfacción del paciente con la relación con su médico de familia: un estudio con el Patient-Doctor Relationship Questionnaire [Patient satisfaction with the patient-doctor relationship measured using the questionnaire (PDRQ-9)]. Aten Primaria. (2010) 42(4):196–203. Spanish. 10.1016/j.aprim.2009.09.02620116893 PMC 7024422 · doi ↗ · pubmed ↗
