Psychometric properties and norm values of a short screening version of the profile of mood states POMS from the German general population
Katja Petrowski, Monika Bjelopavlovic, Markus Zenger, Elmar Brähler, Bjarne Schmalbach

TL;DR
This study confirms the reliability and validity of a shorter version of the POMS mood assessment tool in a German population sample.
Contribution
The study provides new normative data and confirms the structural validity of the POMS-16 in a representative German sample.
Findings
The four-factor model of the POMS-16 showed acceptable fit and good reliability.
Partial strict invariance was found between gender and age groups.
Normative percentile ranks were reported for the POMS-16.
Abstract
The current study aimed to provide further evidence of the structural validity of the 16-item-short version of the Profile of Mood States (POMS), a widely-used tool for assessing an individual’s emotional state. This is significant for various research inquiries in clinical and social psychology. In order to cross-validate previous findings, an additional evaluation of the factorial structure and the psychometric properties is necessary in a newly collected dataset. A representative sample for age and gender of N = 2503 with 1329 (53%) female, 1173 (47%) male, and 1 (< 1%) diverse, with a mean age of M = 46 (SD = 18) was collected. The model fit for the four-factor model was acceptable, with good reliability for all factors. We found evidence for (partial) strict invariance between gender and age groups. There were small to moderate group differences for the Anger and Vigor subscales…
Introduction
The Profile of Mood States (POMS) is a widely used questionnaire for assessing the mental state of various medical patients, including those with heart surgery, cataracts, epilepsy, and sleep apnea syndrome^1–4^. POMS has also been used in (psycho)oncology to evaluate various outcomes, such as quality of life, stress levels, or the impact of interventions^5^. More frequently, POMS is used to measure psychological and pharmacological quantities in clinical as well as occupational and sports medicine studies^6–8^.
The original American version of POMS^9,10^ consists of six mood dimensions, assessed with 65 items rated on a 5-point scale. The mood dimensions Anger-Hostility, Confusion-Bewilderment, Depression, Fatigue-Inertia, and Tension-Anxiety are summed, and Vigor-Activity is subtracted from that sum to obtain the Total Mood Disturbance score. Internal consistency coefficients range from 0.90 to 0.94, and retest reliabilities range from 0.65 to 0.74 in clinical samples^10^. Similar values were reported in non-clinical samples, with higher retest reliabilities over a one-week interval^11^.
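The Total Mood Disturbance rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the dictionary keys and the example scale scores are hypothetical and are not taken from the POMS manual or from the study data.

```python
from typing import Mapping

# Scale names are illustrative labels, not identifiers from the POMS manual.
NEGATIVE_SCALES = (
    "anger_hostility",
    "confusion_bewilderment",
    "depression",
    "fatigue_inertia",
    "tension_anxiety",
)


def total_mood_disturbance(scale_scores: Mapping[str, float]) -> float:
    """Sum the five negative mood scales and subtract Vigor-Activity."""
    negative_sum = sum(scale_scores[name] for name in NEGATIVE_SCALES)
    return negative_sum - scale_scores["vigor_activity"]


# Hypothetical scale scores for one respondent (not data from the study).
example_scores = {
    "anger_hostility": 10,
    "confusion_bewilderment": 8,
    "depression": 12,
    "fatigue_inertia": 9,
    "tension_anxiety": 11,
    "vigor_activity": 15,
}
tmd = total_mood_disturbance(example_scores)
print(tmd)  # 50 - 15 = 35
```

Because Vigor-Activity enters with a negative sign, a respondent with high vigor can obtain a low total score even when several negative scales are elevated.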
In other versions, POMS uses two main prompts with a 7-point response scale: 'How have you felt during the past week including today?' and 'How do you feel right now?' The ‘past week’ framing is usually preferred to the ‘right now’ framing because it measures recurring states while retaining sensitivity to interventions. Although McNair et al.^10^ replicated the original structure using the ‘right now’ framing, perceptions of intensity, seriousness, and frequency of episodes may vary with the reference period. In Anglo-Saxon contexts, scores from the ‘past week’ instruction were higher than those from multiple ‘right now’ assessments. Terry et al.^12^ recommended using the ‘right now’ prompt because recall is influenced by mood and significant events.
Although the internal structure of the POMS is well validated, the Confusion factor is regarded as a cognitive state rather than a mood state. Some adaptations have retained the Friendliness component because the POMS otherwise covers only a restricted range of pleasant mood states^13,14^.
Other factors that can affect mood state responses, such as diverse mood state descriptors, response formats, and assessment circumstances, have been discussed in previous research. In particular, the circumstances of mood state assessment, including the timing and location, are crucial measurement elements^12,15,16^. Overall, instruction type and test administration conditions both play important roles, and the invariance of mood scores needs to be ascertained when interpreting mood states^12,16,17^.
The 35-item version of the POMS (German version by Biehl et al.^18^) generally appears to be the most widely used. It consists of four scales (dejection/anxiety, fatigue, vigor, and anger), evaluated using a 5- or 7-point response scale. Satisfactory psychometric properties were reported for this version based on a student sample. However, when applied to a general population sample, the factorial structure was satisfactory only to a limited extent^19^. Therefore, Petrowski et al.^20^ conducted an item selection to obtain a robust factorial structure. Using exploratory factor analysis and model comparisons of potential item subsets^21^, a four-dimensional scale with a total of 16 items was identified, ensuring good reliability and factorial structure. Confirmatory factor analysis showed a good fit and high reliability for the subscales (0.86 to 0.91). This 16-item short version is strictly invariant across age groups, with strong and partial strict invariance by sex.
This 16-item short version was developed from an older dataset of the long version. To ensure its validity and independence from the excluded 19 items of the POMS-35, further evaluation of the factorial structure and psychometric properties is necessary using a newly collected dataset exclusively implementing the 16-item version. Thus, the current study aims to evaluate the factorial structure and psychometric properties of the 16-item short version with up-to-date norm values from a representative sample of the German general population.
Method
The present investigation was part of a representative survey of the German general population. An independent institute for opinion and social research (USUMA, Berlin, Germany) organized and carried out the data collection. Participants were required to be at least 14 years of age and to have sufficient German language skills. In addition to providing socio-demographic information, participants completed several self-report questionnaires on physical and psychological symptoms. The study participants were selected by means of a random-route sampling method with 258 sample points. Initially, 5418 households were selected, and 5389 were deemed eligible for participation. The Kish selection grid^22^ was then used to select individuals within households. In total, 2503 individuals took part in the survey (46% of those contacted). The study protocol was approved by the ethics committee of the University of Leipzig (043/20-ek) and adhered to ICH-GCP guidelines, the ICC/ESOMAR International Code of Marketing and Social Research Practice, and the Declaration of Helsinki. After being informed about the study procedures, data collection, and anonymization of personal data, all participants gave verbal informed consent, in accordance with German law.
Instruments
In the present study, the short version of the Profile of Mood States (POMS^23^) by Petrowski et al.^20^ was implemented. The short version consists of 16 items spread across 4 scales: dejection/anxiety (4 items), fatigue (4 items), vigor (4 items), and anger (4 items). Similar to the different English versions, the 16 items were answered on a 7-point Likert scale and evaluated for "the last 24 h".
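As a minimal sketch of how such an instrument could be scored, the following Python snippet sums four items per scale, which yields subscale scores between 4 and 28 when responses are coded 1 to 7. The item-to-scale assignment in `SCALE_ITEMS` is hypothetical (the text above only states that each scale has four items), and returning no score for an incomplete subscale is an assumption that mirrors the complete-data rule used for the group comparisons and norms.

```python
from typing import Dict, Optional

# HYPOTHETICAL item-to-scale assignment; the source only states 4 items per scale.
SCALE_ITEMS = {
    "dejection_anxiety": [1, 2, 3, 4],
    "fatigue": [5, 6, 7, 8],
    "anger": [9, 10, 11, 12],
    "vigor": [13, 14, 15, 16],
}


def score_poms16(responses: Dict[int, Optional[int]]) -> Dict[str, Optional[int]]:
    """Return one sum score per subscale, or None if any item is missing."""
    scores: Dict[str, Optional[int]] = {}
    for scale, items in SCALE_ITEMS.items():
        values = [responses.get(item) for item in items]
        if any(value is None for value in values):
            scores[scale] = None  # assumption: incomplete subscales are not scored
            continue
        if any(not 1 <= value <= 7 for value in values):
            raise ValueError(f"responses for {scale} must lie between 1 and 7")
        scores[scale] = sum(values)
    return scores


# Hypothetical respondent: answers 4 on every item, with item 6 missing.
example_responses: Dict[int, Optional[int]] = {item: 4 for item in range(1, 17)}
example_responses[6] = None
example_subscales = score_poms16(example_responses)
print(example_subscales)
```

Scoring each subscale separately keeps a respondent usable for the three complete subscales even when a single item is missing on the fourth.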
Statistical analysis
All analyses were performed in R using the packages dynamic, lavaan, and semTools^24–26^. The analysis script is published at https://github.com/bschmalbach/POMS_Validation/blob/main/POMS16.Rmd. For the confirmatory factor analysis, we utilized robust full-information maximum likelihood estimation (FIML). We made this decision because a sizable number of respondents had missing values (mostly single values): 143 (5.7%) individuals had at least one missing value. Thus, by using FIML, we were able to use all available information and not selectively include responses based on missingness. Furthermore, the items exhibited only a slightly positive skewness of 0.62, and thus robust maximum likelihood estimation appears acceptable. To assess the model fit, we considered the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), the Root Mean Squared Error of Approximation (RMSEA), and the Standardized Root Mean Squared Residual (SRMR). We employed cutoffs of 0.95 for CFI and TLI, and 0.08 for RMSEA and SRMR^27,28^. To supplement this analysis, we also considered the dynamic cutoff values as proposed by McNeish and Wolf^29^. To evaluate scale reliability, we examined McDonald’s ω^30^.
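To make the reliability coefficient more concrete, the sketch below computes McDonald’s ω for a single factor from standardized loadings, using the textbook formula ω = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). It assumes uncorrelated residuals and standardized indicators, it is not the routine used by semTools, and the loadings are hypothetical values chosen for illustration rather than estimates from this study.

```python
from typing import Sequence


def mcdonald_omega(standardized_loadings: Sequence[float]) -> float:
    """Omega for one factor, assuming standardized items and uncorrelated residuals."""
    loading_sum_squared = sum(standardized_loadings) ** 2
    # With standardized indicators, each residual variance equals 1 - loading^2.
    residual_sum = sum(1.0 - loading ** 2 for loading in standardized_loadings)
    return loading_sum_squared / (loading_sum_squared + residual_sum)


# Hypothetical loadings for a four-item subscale (not estimates from the study).
example_loadings = [0.80, 0.75, 0.85, 0.78]
omega = mcdonald_omega(example_loadings)
print(round(omega, 3))
```

Higher loadings raise the numerator and shrink the residual term at the same time, which is why a four-item scale can still reach good reliability when all items load strongly.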
To analyze differences between sociodemographic groups, we first tested for measurement invariance by comparing increasingly restrictive models: configural invariance (baseline model), metric invariance (equal loadings across groups), scalar invariance (equal loadings and indicator intercepts across groups), and strict invariance (equal loadings, indicator intercepts, and indicator residuals across groups). The fit should not decrease by more than 0.01 in terms of CFI and RMSEA between models^31,32^. To supplement these analyses, we report RMSEA_D_ according to Savalei and colleagues^33^. Its interpretation is equivalent to the standard RMSEA index with the customary 0.08 cutoff. For the ANOVA comparisons and normative data, only respondents with complete data on a given subscale were included in the analysis. We conducted ANOVAs to check whether there are meaningful differences in POMS scores between sociodemographic groups. This served the primary purpose of determining the necessity of dedicated norm values for each subgroup. In addition to conducting the ANOVAs, we checked for variance homogeneity and normality of residuals—both assumptions were fulfilled or only violated mildly. The normative values were then computed based on percentile ranks for each given sum score.
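The norm computation described above can be sketched as a cumulative percentage over the possible sum scores. The snippet assumes that the percentile rank of a sum score is the percentage of respondents in the norm group scoring at or below it, and that subscale scores range from 4 to 28; the function name and the example scores are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def percentile_ranks(
    sum_scores: Sequence[int], min_score: int = 4, max_score: int = 28
) -> Dict[int, float]:
    """Percentage of respondents with a sum score at or below each possible score."""
    if not sum_scores:
        raise ValueError("at least one complete sum score is required")
    counts = Counter(sum_scores)
    total = len(sum_scores)
    ranks: Dict[int, float] = {}
    cumulative = 0
    for score in range(min_score, max_score + 1):
        cumulative += counts.get(score, 0)
        ranks[score] = round(100 * cumulative / total, 1)
    return ranks


# Hypothetical sum scores of six respondents in one norm group.
example_ranks = percentile_ranks([4, 4, 5, 7, 10, 28])
print(example_ranks[4], example_ranks[7], example_ranks[28])
```

Under this definition the ranks never decrease with the sum score and reach 100 at the highest observed score in a group.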
Results
Sample
The initial study sample consisted of 2503 respondents. Of those, 1329 (53%) were female, 1173 (47%) were male, and 1 was diverse (< 1%). The mean age was 46 (SD = 18), which we split into three even age groups: ≤ 36 years (n = 824, 33%), 37–55 years (n = 808, 32%), and > 55 years (n = 871, 35%). A more detailed description is reported in Table 1.

Table 1. Sample description.

| Variable | n | % |
|---|---|---|
| Sex (total) | 2503 | |
| Male | 1173 | 46.9 |
| Female | 1329 | 53.1 |
| Diverse | 1 | 0 |
| Age groups (total) | 2503 | |
| Young (1st tertile) | 824 | 32.9 |
| Middle-aged (2nd tertile) | 808 | 32.3 |
| Old (3rd tertile) | 871 | 34.8 |
| Relationship status (total) | 2487 | |
| Married, living together | 981 | 39.4 |
| Married, separated | 69 | 2.8 |
| Unmarried | 992 | 39.9 |
| Divorced | 296 | 11.9 |
| Widowed | 149 | 6 |
| Education (total) | 2496 | |
| Not graduated (yet) | 136 | 5.4 |
| Less than 10 years | 507 | 20.3 |
| 10 years | 1090 | 43.6 |
| More than 10 years | 763 | 30.6 |
| Net household income per month, in € (total) | 2319 | |
| Up to 1,000 | 194 | 8.4 |
| Up to 1,500 | 328 | 14.2 |
| Up to 2,500 | 585 | 25.3 |
| Up to 3,500 | 536 | 23.1 |
| Up to 5,000 | 453 | 19.5 |
| More than 5,000 | 223 | 9.6 |
CFA
The 4-factor model established by Petrowski et al. (2021) exhibited acceptable fit in this sample when compared to the customary fixed cutoffs, χ^2^(98) = 819.66, p < 0.001, CFI = 0.957, TLI = 0.947, RMSEA = 0.056, SRMR = 0.040. Only the TLI falls slightly below the threshold of acceptability. The dynamic cutoffs largely replicate these findings: Level 1 misspecified models would yield SRMR = 0.070, RMSEA = 0.047, and CFI = 0.977, Level 2 models would yield SRMR = 0.077, RMSEA = 0.068, and CFI = 0.959, and Level 3 misspecification would yield SRMR = 0.084, RMSEA = 0.088, and CFI = 0.940. This places our empirically determined fit values clearly within Level 2 (with SRMR being better than expected), which corresponds to “fair” fit according to McNeish and Wolf. Reliability (ω) of the factors was good, ranging between 0.859 and 0.889.
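The comparison with the fixed cutoffs can be retraced with a short Python check. The fit values are those reported above and the cutoffs are those named in the statistical analysis section; treating a value exactly at the cutoff as acceptable is an assumption of this sketch, and the function and variable names are illustrative.

```python
from typing import Dict

# Fixed cutoffs: CFI and TLI should be at least 0.95, RMSEA and SRMR at most 0.08.
LOWER_BOUNDS = {"CFI": 0.95, "TLI": 0.95}
UPPER_BOUNDS = {"RMSEA": 0.08, "SRMR": 0.08}


def check_fixed_cutoffs(fit: Dict[str, float]) -> Dict[str, bool]:
    """Flag each fit index as acceptable (True) or not (False)."""
    verdict = {name: fit[name] >= bound for name, bound in LOWER_BOUNDS.items()}
    verdict.update({name: fit[name] <= bound for name, bound in UPPER_BOUNDS.items()})
    return verdict


# Fit indices of the four-factor model as reported in the text.
four_factor_fit = {"CFI": 0.957, "TLI": 0.947, "RMSEA": 0.056, "SRMR": 0.040}
fit_verdict = check_fixed_cutoffs(four_factor_fit)
print(fit_verdict)
```

Only the TLI entry comes out as not acceptable, in line with the description above.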
Additionally, we analyzed a unifactorial model—since the POMS is often summarized using a Total Mood Disturbance Score. Our findings show that such a model is completely unacceptable, χ^2^(104) = 6339.45, p < 0.001, CFI = 0.621, TLI = 0.563, RMSEA = 0.155, SRMR = 0.126, ω = 0.716.
Measurement invariance
Table 2 shows the results of the measurement invariance tests. For gender, there were no meaningful differences in the measurement model at any level. That is, strict invariance can be assumed. In contrast, age groups exhibited some differences with regard to the residuals. Thus, we can only assume partial strict invariance. Specifically, we freed the residuals of the two items with the largest deviations between the age groups: “full of pep” and “vigorous”. This indicates that while the vigor factor may be comparable across ages (given metric and scalar invariance), the error terms are not, which points to differing reliability between groups.

Table 2. Tests of measurement invariance.

| Grouping | Model | χ^2^ | Δχ^2^ | df | Δdf | p | CFI | ΔCFI | RMSEA | ΔRMSEA | RMSEA_D_ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gender | Baseline model | 914.50 | | 196 | | | .957 | | .056 | | |
| Gender | Metric invariance | 921.75 | 7.25 | 208 | 12 | .841 | .958 | .001 | .054 | .002 | .000 |
| Gender | Scalar invariance | 971.89 | 50.14 | 220 | 12 | < .001 | .956 | .002 | .054 | .000 | .036 |
| Gender | Strict invariance | 989.67 | 17.78 | 236 | 16 | .337 | .954 | .002 | .052 | .002 | .007 |
| Age | Baseline model | 1029.07 | | 294 | | | .958 | | .056 | | |
| Age | Metric invariance | 1067.01 | 37.91 | 318 | 24 | .035 | .958 | .000 | .055 | .001 | .015 |
| Age | Scalar invariance | 1260.18 | 193.17 | 342 | 24 | < .001 | .949 | .009 | .058 | .003 | .053 |
| Age | Strict invariance | 1524.03 | 263.85 | 374 | 32 | < .001 | .933 | .016 | .063 | .005 | .054 |
| Age | Partial strict invariance^a^ | 1415.24 | 155.06 | 370 | 28 | < .001 | .939 | .010 | .060 | .002 | .043 |

^a^The residuals of items 15 and 16 were allowed to vary between groups for this model.
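The decision rule behind these comparisons (fit should not worsen by more than 0.01 in CFI or RMSEA from one model to the next) can be sketched as follows, using the CFI and RMSEA values of the age-group models in Table 2. Rounding the differences to three decimals, to match the reported precision, and treating a change of exactly 0.01 as acceptable are assumptions of this sketch; the function and variable names are illustrative.

```python
from typing import Dict, Tuple

MAX_CHANGE = 0.01  # largest tolerated worsening in CFI or RMSEA


def invariance_holds(
    less_restrictive: Tuple[float, float],
    more_restrictive: Tuple[float, float],
    max_change: float = MAX_CHANGE,
) -> bool:
    """Compare two nested models given as (CFI, RMSEA) pairs."""
    # Differences are rounded to three decimals to avoid floating-point noise.
    cfi_drop = round(less_restrictive[0] - more_restrictive[0], 3)
    rmsea_rise = round(more_restrictive[1] - less_restrictive[1], 3)
    return cfi_drop <= max_change and rmsea_rise <= max_change


# (CFI, RMSEA) for the age-group models, taken from Table 2.
age_models: Dict[str, Tuple[float, float]] = {
    "baseline": (0.958, 0.056),
    "metric": (0.958, 0.055),
    "scalar": (0.949, 0.058),
    "strict": (0.933, 0.063),
    "partial strict": (0.939, 0.060),
}
comparisons = [
    ("baseline", "metric"),
    ("metric", "scalar"),
    ("scalar", "strict"),
    ("scalar", "partial strict"),
]
age_results = {
    f"{a} -> {b}": invariance_holds(age_models[a], age_models[b])
    for a, b in comparisons
}
print(age_results)
```

Only the step from scalar to full strict invariance exceeds the tolerated CFI change, whereas the partial strict model stays within it.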
Group differences and norm values
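As can be seen in Table 3, there were small-to-moderate differences between age groups with regard to the Anger and Vigor subscales. Specifically, younger respondents reported higher values in both subscales. Some significant yet smaller (R^2^ < 0.01) differences were observable for other comparisons as well. Even though these differences are relatively small, they demonstrate the need for dedicated normative values for each subgroup. Tables 4, 5, 6, 7 display normative values for the German general population.

Table 3. Group comparisons.

| Subscale | Comparison | F | p | R^2^ |
|---|---|---|---|---|
| Anger | Gender | 0.24 | .627 | < .001 |
| Anger | Age groups | 25.08 | < .001 | .020 |
| Fatigue | Gender | 19.97 | < .001 | .008 |
| Fatigue | Age groups | 7.45 | < .001 | .006 |
| Vigor | Gender | 2.36 | .125 | < .001 |
| Vigor | Age groups | 19.17 | < .001 | .016 |
| Dejection | Gender | 13.16 | < .001 | .005 |
| Dejection | Age groups | 5.80 | .003 | .005 |

Table 4. Normative percentile ranks, anger subscale.

| Sum score | Male ≤ 36 | Male 37–55 | Male > 55 | Female ≤ 36 | Female 37–55 | Female > 55 |
|---|---|---|---|---|---|---|
| 4 | 19.1 | 27.5 | 20.3 | 22 | 26.6 | 16.3 |
| 5 | 28.2 | 37.1 | 28.9 | 31.9 | 37.8 | 24.2 |
| 6 | 37.8 | 45.2 | 34.7 | 38.8 | 47.8 | 31.7 |
| 7 | 45.3 | 55.1 | 43.3 | 45.4 | 56.6 | 40.3 |
| 8 | 51.9 | 62.9 | 49.1 | 54.6 | 63.3 | 45.6 |
| 9 | 61.6 | 68.4 | 53.7 | 59.3 | 70.5 | 51.8 |
| 10 | 67.4 | 72.5 | 59.5 | 66.4 | 74.2 | 58.5 |
| 11 | 70.7 | 76.5 | 66.3 | 70.4 | 81 | 63.8 |
| 12 | 75.1 | 80.8 | 71.9 | 73.8 | 83.8 | 68.6 |
| 13 | 80.9 | 83.3 | 76.7 | 76.8 | 86.7 | 73.9 |
| 14 | 84 | 86.9 | 81.3 | 81.8 | 88.4 | 78.2 |
| 15 | 85.9 | 88.9 | 84.8 | 83.7 | 90 | 81.5 |
| 16 | 89.8 | 92.7 | 87.3 | 89.1 | 93 | 85.9 |
| 17 | 91.2 | 94.9 | 90.4 | 91 | 95 | 87.5 |
| 18 | 92 | 96.7 | 92.4 | 93.1 | 95.4 | 90.6 |
| 19 | 93.9 | 97 | 93.9 | 94.3 | 95.9 | 92.6 |
| 20 | 96.1 | 98 | 94.9 | 96.5 | 96.9 | 93.8 |
| 21 | 96.7 | 99 | 95.7 | 97.2 | 97.2 | 94.7 |
| 22 | 98.1 | 99.7 | 97 | 97.9 | 98 | 95.9 |
| 23 | 98.6 | 100 | 98.2 | 98.1 | 98.7 | 97.1 |
| 24 | 98.9 | 100 | 98.7 | 98.8 | 99.1 | 97.4 |
| 25 | 99.7 | 100 | 99.2 | 99.3 | 99.3 | 98.3 |
| 26 | 100 | 100 | 99.7 | 100 | 99.8 | 98.6 |
| 27 | 100 | 100 | 100 | 100 | 100 | 99.3 |
| 28 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 5. Normative percentile ranks, fatigue subscale.

| Sum score | Male ≤ 36 | Male 37–55 | Male > 55 | Female ≤ 36 | Female 37–55 | Female > 55 |
|---|---|---|---|---|---|---|
| 4 | 7.2 | 11.4 | 12.4 | 7.2 | 10.9 | 8.5 |
| 5 | 13.8 | 17.9 | 16.8 | 11.2 | 15.3 | 13 |
| 6 | 19.6 | 23.2 | 23.2 | 16.1 | 22.3 | 16.9 |
| 7 | 27.9 | 32.8 | 30.9 | 22.4 | 30.1 | 24.2 |
| 8 | 35.6 | 40.7 | 38.1 | 28 | 36 | 30.2 |
| 9 | 40.3 | 47.5 | 45.4 | 33.1 | 40 | 36.5 |
| 10 | 47.5 | 52 | 51 | 41 | 48 | 44 |
| 11 | 52.8 | 57.6 | 55.7 | 48.3 | 54.8 | 50 |
| 12 | 57.7 | 64.6 | 62.9 | 53.8 | 59.6 | 56.3 |
| 13 | 63.3 | 71 | 68.3 | 59 | 63.5 | 58.9 |
| 14 | 70.4 | 76.5 | 73.5 | 63.6 | 69.9 | 63 |
| 15 | 75.1 | 80.1 | 77.6 | 67.6 | 74.5 | 68.6 |
| 16 | 80.1 | 85.6 | 82.2 | 74.6 | 79.9 | 74.9 |
| 17 | 84 | 88.1 | 86.3 | 78.1 | 83.4 | 78.3 |
| 18 | 86.7 | 91.2 | 89.9 | 81.8 | 88.4 | 82.6 |
| 19 | 89.5 | 92.7 | 93 | 84.4 | 91.3 | 86.2 |
| 20 | 92.5 | 94.7 | 94.1 | 88.3 | 93 | 88.6 |
| 21 | 94.2 | 95.7 | 95.9 | 89.5 | 94.8 | 90.8 |
| 22 | 94.5 | 96 | 97.7 | 91.4 | 96.1 | 92.3 |
| 23 | 95.3 | 97 | 98.7 | 92.8 | 96.7 | 94.4 |
| 24 | 96.4 | 98.5 | 99 | 94.4 | 98 | 96.1 |
| 25 | 97.8 | 99.2 | 99.2 | 97 | 98.5 | 97.3 |
| 26 | 98.9 | 100 | 99.5 | 97.9 | 98.9 | 98.3 |
| 27 | 99.2 | 100 | 99.7 | 98.6 | 99.3 | 98.8 |
| 28 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 6. Normative percentile ranks, vigor subscale.

| Sum score | Male ≤ 36 | Male 37–55 | Male > 55 | Female ≤ 36 | Female 37–55 | Female > 55 |
|---|---|---|---|---|---|---|
| 4 | 1.4 | 1.8 | 1.3 | 1.4 | 1.8 | 0.2 |
| 5 | 1.7 | 2.5 | 1.5 | 1.6 | 3.3 | 0.7 |
| 6 | 3 | 3.8 | 2.6 | 2.1 | 4.5 | 1.2 |
| 7 | 3.9 | 4.3 | 2.8 | 3.1 | 6.2 | 1.9 |
| 8 | 4.7 | 6.6 | 3.3 | 4.2 | 8.5 | 2.9 |
| 9 | 6.9 | 7.9 | 4.9 | 6.6 | 10.9 | 4.1 |
| 10 | 9.1 | 8.9 | 6.6 | 8.9 | 14 | 6.8 |
| 11 | 11.3 | 11.7 | 9.2 | 11.5 | 17.4 | 9.7 |
| 12 | 15.2 | 15.3 | 12.5 | 15 | 22.7 | 13.6 |
| 13 | 19.6 | 22.1 | 17.4 | 19.2 | 28.7 | 16.9 |
| 14 | 24.6 | 27.5 | 23 | 24.6 | 35.4 | 20.8 |
| 15 | 29.3 | 32.3 | 27.6 | 32.2 | 40.1 | 27.8 |
| 16 | 36.5 | 41.2 | 34.5 | 39.4 | 47.2 | 38.5 |
| 17 | 43.6 | 50.9 | 39.6 | 44.6 | 55.2 | 44.8 |
| 18 | 50 | 57.3 | 50.1 | 53.1 | 62.6 | 52.5 |
| 19 | 62.2 | 66.9 | 58.6 | 62 | 69.9 | 60 |
| 20 | 73.8 | 78.9 | 68 | 72.8 | 79.3 | 70.2 |
| 21 | 79 | 86.5 | 73.7 | 77.2 | 85.7 | 75.8 |
| 22 | 84.8 | 90.3 | 80.1 | 82.2 | 89.5 | 82.6 |
| 23 | 88.4 | 93.1 | 84.1 | 86.9 | 93.1 | 86.9 |
| 24 | 95.9 | 96.9 | 90.8 | 94.8 | 96.9 | 92 |
| 25 | 97.5 | 98.5 | 94.4 | 97.7 | 98.2 | 95.4 |
| 26 | 97.8 | 99.2 | 97.4 | 98.6 | 98.4 | 98.1 |
| 27 | 99.2 | 99.7 | 98.5 | 99.3 | 99.3 | 99 |
| 28 | 100 | 100 | 100 | 100 | 100 | 100 |

Table 7. Normative percentile ranks, dejection subscale.

| Sum score | Male ≤ 36 | Male 37–55 | Male > 55 | Female ≤ 36 | Female 37–55 | Female > 55 |
|---|---|---|---|---|---|---|
| 4 | 34.1 | 35 | 32.6 | 27.8 | 32.4 | 23.3 |
| 5 | 47.1 | 48.5 | 41.9 | 39.8 | 44.6 | 34.1 |
| 6 | 56.8 | 54.8 | 49.4 | 48.6 | 53.3 | 43.8 |
| 7 | 61.8 | 63.5 | 57.6 | 56.7 | 60 | 52.9 |
| 8 | 68.7 | 70.8 | 65.1 | 63.2 | 66.5 | 59.4 |
| 9 | 74.8 | 74.2 | 69.3 | 69 | 71.3 | 64.4 |
| 10 | 78.7 | 78.2 | 73.6 | 73.8 | 75.2 | 70.4 |
| 11 | 83.7 | 82.2 | 78.6 | 77.3 | 79.8 | 74.5 |
| 12 | 84.5 | 84.8 | 84.2 | 81 | 83 | 78.6 |
| 13 | 87.3 | 87.5 | 87.1 | 84.7 | 85.7 | 82.7 |
| 14 | 89.8 | 89.8 | 88.6 | 88.7 | 88 | 84.6 |
| 15 | 92 | 92.8 | 91.2 | 89.8 | 90.2 | 86.3 |
| 16 | 93.4 | 94.8 | 93.5 | 91.7 | 93.7 | 88.9 |
| 17 | 95.3 | 96.8 | 95.1 | 94.2 | 95 | 91.3 |
| 18 | 97 | 98.2 | 96.9 | 95.4 | 95.9 | 92.8 |
| 19 | 97.5 | 98.8 | 97.2 | 96.5 | 97 | 93.8 |
| 20 | 97.8 | 99.5 | 98.7 | 97 | 97.8 | 94.5 |
| 21 | 98.6 | 99.8 | 99.2 | 97.7 | 98.5 | 95.2 |
| 22 | 99.2 | 100 | 99.5 | 98.1 | 98.7 | 95.7 |
| 23 | 99.4 | 100 | 100 | 98.8 | 98.9 | 96.9 |
| 24 | 99.7 | 100 | 100 | 99.8 | 99.3 | 98.3 |
| 25 | 100 | 100 | 100 | 100 | 99.6 | 98.8 |
| 26 | 100 | 100 | 100 | 100 | 100 | 99.3 |
| 27 | 100 | 100 | 100 | 100 | 100 | 99.5 |
| 28 | 100 | 100 | 100 | 100 | 100 | 100 |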
Discussion
The "Profile of Mood States" (POMS^10^) is a widely used questionnaire in clinical research. For epidemiological studies, short instruments with strong psychometric properties are essential. Therefore, a short screening version was derived from the long form. To cross-validate those results in a sample featuring only the final 16 items, a representative sample of the general population in Germany was collected. The study aimed to evaluate the psychometric properties, factorial structure, and norm values of the 16-item short version of the POMS.
Based on the newly collected representative dataset, the model fit for the four-factor model was acceptable, with good reliability for all factors. We found evidence for (partial) strict invariance across gender and age groups. Small to moderate differences were observed for the Anger and Vigor subscales regarding age. Further, results from the unifactorial CFA discourage the use of the Total Mood Disturbance Score. The English version of the POMS by Cella et al.^34^ also consists of a small number of items (11 items) but only provides a Total Mood Disturbance score without subscales. Therefore, the 16-item version provided here is the shortest available version of the POMS that maintains subscale measurements. Furthermore, it previously showed a high correlation with the long 35-item version^20^.
While the large sample size and broad age range are strengths of this study, the representativeness for the general population may limit applicability to samples with altered moods, such as those in clinical settings. To further validate or refute the factorial structure, the POMS should be applied to diverse groups, including clinical samples. In addition, further consideration should be given to the process of external validation, specifically the use of concrete criteria (such as behaviors) which the POMS could predict.
In sum, this investigation presents evidence of the POMS-16’s structural validity and reliability. It can be recommended for social, personality, and clinical research interested in changes in affect and mood.
References
- 1. Baker, F., Denniston, M., Zabora, J., Polland, A. & Dudley, W. N. A POMS short form for cancer patients: psychometric and structural evaluation. Psycho-Oncology J. Psychol. Soc. Behav. Dimens. Cancer 11(4), 273–281 (2002). doi: 10.1002/pon.564, PMID: 12203741
- 2. Fletcher, M. A., Lopez, C., Antoni, M., Penedo, F., Weiss, D., Cruess, S. & Klimas, N. G. A pilot study of cognitive behavioral stress management effects on stress, quality of life, and symptoms in persons with chronic fatigue syndrome (2011). doi: 10.1016/j.jpsychores.2010.11.010, PMCID: PMC3073706, PMID: 21414452
- 3. Lochbaum, M., Zanatta, T., Kirschling, D. & May, E. The Profile of Moods States and athletic performance: A meta-analysis of published studies. European journal of investigation in health, psychology and education, 11(1) (2021). doi: 10.3390/ejihpe11010005, PMCID: PMC8314345, PMID: 34542449
- 4. McNair, D. M., Lorr, M. & Droppleman, L. F. Manual for the profile of mood states (POMS). San Diego: Educational and Industrial Testing Service (1971).
- 5. McNair, D., Lorr, M. & Droppleman, L. Profile of mood states manual (rev.). San Diego: Educational and Industrial Testing Service (1992).
- 6. Morfeld, M., Petersen, C., Krüger-Bödeker, A., Von Mackensen, S. & Bullinger, M. The assessment of mood at workplace-psychometric analyses of the revised Profile of Mood States (POMS) questionnaire. GMS Psycho-Soc. Med. 4 (2007). PMCID: PMC2736534, PMID: 19742299
- 7. Schultze, M. stuart: Subtests Using Algorithmic Rummaging Techniques. R package version 0.10.2 (2023). https://CRAN.R-project.org/package=stuart.
- 8. McNair, D., Lorr, M. & Droppleman, L. F. Profile of mood states: Bipolar form. Educational and Industrial Testing Service: San Diego, CA (1988).
