Assessing nonresponse bias in a 30-year study of Gulf War and Gulf Era veterans
Joseph Gasper, Wendy Van de Kerckhove, Talia Spark, James McCall, Carly Mihovich, Heather Hammer, Aaron Schneiderman, Michele Madden, Erin K. Dursa

TL;DR
This study examines how nonresponse bias affects health data from a long-term study of Gulf War veterans, finding that adjustments reduce bias in key variables.
Contribution
The study introduces a method to assess and adjust for nonresponse bias in longitudinal veteran health data.
Findings
Response rates remained relatively high in Wave 4, with older, White, deployed, and married veterans more likely to respond.
Weighting adjustments reduced bias in demographic and military characteristics, but alcohol and drug dependence may still be underestimated.
Nonresponse adjustments were effective for key variables, supporting continued insights into long-term health effects.
Abstract
Cohort studies of veterans are critical for understanding the long-term health effects of deployment and toxic exposures. However, longitudinal research is susceptible to attrition and potential nonresponse bias. The Gulf War Era Cohort Study (GWECS) is the largest and longest-running longitudinal cohort study of 1990–1991 Gulf War veterans. In this paper, we identify demographic and military service characteristics associated with patterns of response over time and examine the extent to which accounting for nonresponse bias in Wave 4, conducted more than 30 years after the Gulf War, might impact the estimates of health conditions. Multivariate multinomial logistic regression analysis was used to identify demographic and military service characteristics associated with response patterns over time (always responder, current responder, past responder, never responder). To adjust for…
Taxonomy
Topics: Fibromyalgia and Chronic Fatigue Syndrome Research · Posttraumatic Stress Disorder Research · Occupational Health and Performance
Background
Cohort studies of veterans are extremely important for understanding the long-term effects of deployment and military environmental exposures on physical and mental health. However, longitudinal research is susceptible to attrition and potential nonresponse bias. Nonresponse bias is a function of both the overall nonresponse rate and the degree to which respondents and nonrespondents differ on the statistic of interest [1]. Several studies have shown that a low response rate may not necessarily produce substantial nonresponse bias if the differences are minimal or statistically negligible [2–4]. Nevertheless, nonresponse bias should be examined and addressed whenever any concerns about the representativeness of respondents arise.
Several factors are associated with a higher likelihood of follow-up response in general population studies, including female sex, older age, higher educational attainment, and being married [5, 6]. Few studies of follow-up with military service members and veterans have published robust nonresponse analyses. The King’s Centre for Military Health Research Health and Wellbeing Cohort Study, a long-term study on the physical and psychological health of United Kingdom military personnel who deployed to Iraq or Afghanistan, found that responders at 20 years were more likely to be female, older, reserve personnel at baseline, and of officer rank, to have served or be currently serving in the Royal Air Force (versus the Army), and to have reported symptoms of depression or anxiety [7]. The Millennium Cohort Study (MCS), which primarily includes post-9/11 veterans, found that consistent follow-up response over 15 years after entering the cohort was associated with higher educational attainment, being married, female sex, older age, and deployment [8]. Evidence for health-related predictors of response in the MCS is mixed. One analysis reported that life stressors, mental health conditions, and physical health diagnoses were associated with higher response, whereas an earlier analysis identified smoking, alcohol consumption, and depression as predictors of nonresponse [8, 9]. Nevertheless, nonresponse weighting in the MCS suggests that nonresponse did not substantially bias estimates of health outcomes [8, 9].
Most longitudinal studies report response comparisons at a single point in time, but response can be intermittent, with individuals responding to some waves but not to others. Several studies have classified participants into typologies based on their response histories, showing that intermittent responders tend to fall demographically between consistent responders and never responders [10]. This pattern suggests that longitudinal nonresponse exists along a continuum from severe to less severe [11, 12]. Understanding individual patterns of participation over time is important for two reasons. First, identifying who participates is essential for effectively and efficiently allocating data collection procedures and resources. In adaptive designs, recruitment and retention materials and strategies can be tailored toward groups at highest risk for nonresponse, and resources can be directed to improving response rates among these groups [13]. Second, understanding nonresponse patterns can lead to insights into potential biases and help to test the validity of assumptions underlying study analyses.
The GWECS is a population-based, longitudinal cohort study of 15,000 veterans who deployed to the Persian Gulf between 1990 and 1991 (Gulf War veterans) and 15,000 veterans who served elsewhere during the same period (Gulf Era veterans). Sponsored by the U.S. Department of Veterans Affairs (VA), the cohort has been surveyed in three previous waves of the GWECS, conducted in 1995, 2005, and 2012. The fourth wave of the study was conducted in 2024–2025 and included follow-up with all surviving, non-incarcerated members of the original cohort. The primary goal of the GWECS is to compare changes in self-reported health outcomes of Gulf War veterans with those of Gulf Era veterans over time, with a focus on chronic medical conditions, mental health conditions, functional impairment, and healthcare utilization. Additional objectives include characterizing the natural history of Gulf War illness as the population ages, assessing the impact of deployment-related exposures on long-term health, and examining overall health trajectories in both the deployed and non-deployed groups. The purpose of this study is to (1) examine differences among groups with different response patterns to reveal potential sources of bias and inform recruitment and retention strategies, and (2) determine the extent to which adjustments for nonresponse bias in the fourth follow-up influence estimates of key health conditions. This study provides information that can inform study design, nonresponse adjustment, and analysis in other longitudinal studies of veterans.
Methods
Sampling
A permanent, population-based panel of 15,000 Gulf War veterans and 15,000 Gulf Era veterans was built using a stratified sampling design. Gulf War veterans were sampled from the 693,826 U.S. troops identified by the Defense Manpower Data Center (DMDC) as deployed to the 1990–1991 Gulf War. Gulf Era veterans were sampled from 800,690 persons (half of all personnel who were in military service between September 1990 and May 1991) identified by DMDC as having served during that time but who did not deploy [14]. Both groups had representation from all service branches (Air Force, Army, Marines, Navy) [14]. To ensure adequate sampling of key subgroups, the sampling strategy for both groups included intentional oversampling: women were oversampled to comprise 20% of the sample, and the National Guard and reserves were oversampled to comprise about 27% and 33%, respectively. The remaining 40% of the sample consisted of active duty personnel [14]. The veterans invited to participate in the 2024 follow-up study were all 26,580 living panel members of the original sample of 30,000 contacted in 1995.
Data collection
The survey used a multimodal design that included web, mail, and computer-assisted telephone interviewing (CATI). The adaptive design strategy utilized veterans’ historical response patterns to tailor data collection protocols [15]. Veterans were classified into two groups based on historical response patterns. The majority of veterans received a sequential multimodal protocol, designed for veterans who responded to at least one of the previous waves of the survey. An invitation letter was mailed to this group, inviting them to complete the web survey using a unique personal identification number. Non-responders received a reminder postcard about the web survey and up to three mailed questionnaire packets. These packets contained a 16-page scannable structured health questionnaire, a preaddressed and prepaid return envelope, an informed consent form, a study brochure that included telephone numbers for study information, the VA Crisis Line number, and an information sheet that highlighted results from previous waves of the study. The invitation letter was sent to panel members in September 2024, accompanied by a reminder postcard mailed one week later. Three mailings of paper questionnaires occurred at 5, 8, and 13 weeks after the invitation letter. In November 2024, CATI calls were made to the veterans who had not responded to the web or paper survey. A smaller group of veterans who never responded to previous waves of the survey received a web-only protocol. Veterans in this group were mailed an invitation letter inviting them to complete the web survey, followed by a reminder postcard, three reminder letters, and a final reminder postcard. The VA Central Institutional Review Board and the Westat Institutional Review Board approved the study protocol and all documents. All data collection was completed by January 2025.
Measures
Completion status
Completion status was determined based on the number of items completed on the survey. A complete questionnaire was defined as any questionnaire with ≥ 80% of the question items answered, and a partially complete questionnaire was defined as one with > 50% but < 80% of the question items answered. In total, 12,187 questionnaires met the criteria for complete and 190 for partially complete. Veterans who returned either a complete or a partially complete questionnaire were considered respondents.
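The completion rule above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the study's processing code; the function names and the "insufficient" label for questionnaires at or below 50% are hypothetical, since the text does not name that case.

```python
def completion_status(answered: int, total_items: int) -> str:
    """Classify a returned questionnaire by the share of items answered."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    share = answered / total_items
    if share >= 0.80:
        return "complete"
    if share > 0.50:
        return "partial"
    # Hypothetical label: the source does not name questionnaires at or below 50%.
    return "insufficient"


def is_respondent(answered: int, total_items: int) -> bool:
    """Complete and partially complete questionnaires both count as respondents."""
    return completion_status(answered, total_items) in {"complete", "partial"}
```

Under this rule, the 12,187 complete and 190 partially complete questionnaires together give the 12,377 Wave 4 respondents reported later in the paper.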
To investigate response patterns, Wave 4 eligible respondents were categorized into four mutually exclusive groups based on their response patterns across waves. Always responders were defined as veterans who responded to all four waves. Never responders responded to none of the waves. Current responders responded to Wave 4 but did not respond to one or more previous waves, and past responders responded to one or more previous waves but did not respond to Wave 4.
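As a minimal sketch, the four mutually exclusive groups can be derived from a veteran's response history as follows. The function name and the representation of the history (a list of three booleans for Waves 1 to 3 plus a Wave 4 flag) are assumptions made for illustration.

```python
def response_pattern(prior_waves: list[bool], wave4: bool) -> str:
    """Assign a response typology from Waves 1-3 history and Wave 4 status.

    prior_waves is assumed to hold one boolean per earlier wave (Waves 1-3).
    """
    if wave4:
        # Responded to Wave 4: "always" only if every earlier wave was answered.
        return "always" if all(prior_waves) else "current"
    # Did not respond to Wave 4: "past" if any earlier wave was answered.
    return "past" if any(prior_waves) else "never"
```

Note that a veteran whose first response is at Wave 4 falls in the current responder group under this definition, because at least one earlier wave was missed.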
Demographic and military service characteristics
Demographic and military service characteristics were obtained from military records. They included deployment status (deployed, nondeployed), age (17–25, 26–32, 33–39, 40 and older, unknown), race/ethnicity (White, Black, Hispanic, other), sex (male, female), marital status (married, single, other, unknown), service branch (Air Force, Army, Marine Corps, Navy), unit component (active duty, National Guard, reserves), and rank in 1991 (enlisted, officer, warrant officer). These variables were all measured in the sampling frame at the time of the 1990–1991 Gulf War.
Health outcomes
To evaluate the effectiveness of weighting in reducing potential nonresponse bias, several key outcomes were examined, including general health, posttraumatic stress disorder (PTSD) diagnosis, Gulf War illness (GWI) diagnosis, alcohol use, and cigarette use. Self-reported health was measured with a single question asking individuals to characterize their health as excellent, very good, good, fair, or poor [15]. Diagnoses of PTSD, GWI, alcohol or drug dependence, bipolar or manic depression, and Alzheimer’s disease or dementia were assessed in the survey by the following question: “Has a doctor or other health professional ever told you that you have any of the following conditions?” Respondents answering “Yes” were considered to have the condition, while “No” or missing responses were treated as “No.” Questions on frequency and quantity of alcohol use included “How often did you have a drink containing alcohol in the past year?” and “How many drinks did you have on a typical day when you were drinking in the past year?” Cigarette use was based on the number of days the respondent smoked and the number of cigarettes per day. Socioeconomic factors included educational attainment (high school or below, some college or associate’s degree, bachelor’s degree, graduate or professional degree) and household income.
Statistical analyses
Modeling nonresponse
Multinomial logistic regression was used to identify typologies of respondents with different response patterns and to examine demographic and military service characteristics associated with responses across all four waves. Both unadjusted and adjusted odds ratios (ORs) with 95% confidence intervals (CIs) were estimated, using always responders as the reference category. To understand differences between Wave 4 responders and nonresponders, response rates were compared by demographic and military service characteristics using chi-square tests of association.
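For readers less familiar with how ORs and CIs come out of a logistic model, the conversion from a fitted coefficient can be sketched as below. This assumes the conventional Wald interval on the log-odds scale; the paper does not state which interval method was used, and the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper).

    Assumes a Wald interval: exp(beta +/- z * se), with z = 1.96 for 95% coverage.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )
```

In a multinomial model with always responders as the reference, one such coefficient exists per predictor level for each of the other three response groups, so an OR below 1 means that level is less common in that group than among always responders, holding the other covariates fixed.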
To identify and adjust for nonresponse bias at Wave 4, classification and regression tree (CART) analysis was conducted using the SAS® HPSPLIT procedure. CART allows for a multivariate analysis of nonresponse, capturing interactions between the predictor variables and dividing the sample into groups based on response rate. It is easy to interpret and visualize, and the resulting groups can be used as cells in weighting adjustments to reduce bias. A classification tree was generated with response status as the dependent variable and the frame variables as predictors. The default method in HPSPLIT was applied, which employs a purity measure to divide the eligible sample into groups based on response rate, where the predictor variables form the groups. To prevent over-fitting, the tree was pruned by identifying the subtree that minimized cost-complexity, defined as a function of the misclassification rate (cost) and the size of the tree (complexity) [16].
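The text does not give the cost-complexity function explicitly. In the standard CART formulation, which is assumed here, it takes the form

```latex
R_{\alpha}(T) = R(T) + \alpha \, |\tilde{T}|, \qquad \alpha \ge 0,
```

where $R(T)$ is the misclassification rate of tree $T$, $|\tilde{T}|$ is its number of terminal nodes (leaves), and $\alpha$ is a tuning parameter that sets the penalty per leaf. Larger values of $\alpha$ favor smaller subtrees, so pruning trades a small loss in fit for response-rate groups that are large enough to serve as stable weighting cells.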
Development of weights
The analysis used base weights that allow inference from the stratified sample of 30,000 veterans to the broader population of Gulf War veterans and Gulf Era veterans. These base weights accounted for differential sampling rates by strata and were calculated as the inverse of the selection probability. Using groups identified from the classification tree, respondents were weighted to represent nonrespondents within the same group, thus aligning the weighted respondent distribution with that of the eligible sample (nonresponse adjustment weight). Post-stratification was then applied to adjust the survey weights of respondents so that the weighted sample distribution matched the known population distribution for branch, unit component, deployment status, and sex. The final weights were applied to the 12,377 veterans who completed Wave 4 (final weight).
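The three weighting steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's weighting code: the function names and record layout are hypothetical, the nonresponse adjustment is taken to be the usual weighting-class ratio (sum of eligible base weights over sum of respondent base weights within a cell), and post-stratification is shown for a single set of strata with known totals.

```python
from collections import defaultdict


def base_weight(selection_prob: float) -> float:
    """Base weight is the inverse of the probability of selection."""
    return 1.0 / selection_prob


def nonresponse_adjust(records):
    """Inflate respondents' base weights to represent nonrespondents in their cell.

    records: list of dicts with keys 'cell', 'base_weight', 'responded' (assumed layout).
    Returns one adjusted weight per record; nonrespondents get 0.0.
    """
    eligible = defaultdict(float)
    responding = defaultdict(float)
    for r in records:
        eligible[r["cell"]] += r["base_weight"]
        if r["responded"]:
            responding[r["cell"]] += r["base_weight"]
    return [
        r["base_weight"] * eligible[r["cell"]] / responding[r["cell"]]
        if r["responded"] else 0.0
        for r in records
    ]


def poststratify(weights, strata, population_totals):
    """Scale weights so each stratum's weighted total matches its known population total."""
    weighted = defaultdict(float)
    for w, s in zip(weights, strata):
        weighted[s] += w
    return [
        w * population_totals[s] / weighted[s] if w > 0 else 0.0
        for w, s in zip(weights, strata)
    ]
```

After the nonresponse step, the respondents' weights in each cell sum to the cell's total eligible base weight, which is what aligns the weighted respondent distribution with the eligible sample.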
The effectiveness of the weighting in reducing nonresponse bias was evaluated in four ways. First, estimates of frame variables for the eligible sample were compared to those for respondents weighted by the base weights, nonresponse adjustment weights, and final weights. The first comparison provided a way of assessing if the characteristics of respondents differed from the full eligible sample, that is, if there was nonresponse bias in the estimates of frame variables prior to weighting adjustments. The other comparisons indicated whether the weighting adjustments were effective in reducing the bias. T-tests with adjustments for multiple comparisons were used along with calculations of percentage reduction in bias to determine how much the bias was reduced between the estimates weighted by the base weights and the final weights. The percentage reduction in bias is calculated as the absolute value of the bias before any weighting adjustment minus the absolute value of the bias after the final adjustment, divided by the absolute value of the bias before any weighting adjustment and multiplied by 100. The absolute value of the bias before weighting is the difference in the estimate weighted by the base weights and the eligible sample. The absolute value of the bias after weighting is the difference in the estimate weighted by the final weights and the eligible sample.
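The percentage reduction in bias described above can be written as a short function. The function name is hypothetical; the arithmetic follows the definition in the text.

```python
def pct_bias_reduction(eligible: float, base_weighted: float, final_weighted: float) -> float:
    """Percentage reduction in absolute bias from base-weighted to final-weighted estimate."""
    bias_before = abs(base_weighted - eligible)
    bias_after = abs(final_weighted - eligible)
    if bias_before == 0:
        raise ValueError("no bias before adjustment; reduction is undefined")
    return (bias_before - bias_after) / bias_before * 100


# Rounded Table 4 values for enlisted rank: eligible 86.4, base 81.6, final 85.2.
enlisted = pct_bias_reduction(86.4, 81.6, 85.2)  # about 75
```

Applied to the rounded Table 4 estimates for enlisted rank, this gives roughly 75%, close to the published 75.5%; the small gap presumably reflects the use of unrounded estimates in the paper. A negative value, as for other marital status, means the absolute bias grew slightly after adjustment.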
Second, the strength of the relationship between frame variables and health outcomes at Wave 4 was measured. The effectiveness of nonresponse bias adjustment depends on how strongly the characteristics used in the adjustment are correlated with survey outcomes [17]. For each health outcome, a logistic regression was fit using nonresponse adjustment cells and post-stratification cells as predictors. A proxy correlation, defined as the square root of the pseudo-R-squared, was calculated; values closer to 1 indicate a stronger relationship and greater effectiveness in reducing potential bias in the health outcome.
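A minimal sketch of the proxy correlation is shown below. The paper does not say which pseudo-R-squared was used; McFadden's version is shown purely as an assumption, and both function names are hypothetical.

```python
import math


def mcfadden_pseudo_r2(loglik_model: float, loglik_null: float) -> float:
    """McFadden pseudo-R-squared (an assumed choice; the source does not specify one)."""
    return 1.0 - loglik_model / loglik_null


def proxy_correlation(pseudo_r2: float) -> float:
    """Proxy correlation: the square root of the pseudo-R-squared."""
    if not 0.0 <= pseudo_r2 <= 1.0:
        raise ValueError("pseudo_r2 must lie in [0, 1]")
    return math.sqrt(pseudo_r2)
```

For example, the pseudo-R-squared of 0.08 reported for general health in Table 5 corresponds to a proxy correlation of about 0.28, and 0.11 for PTSD corresponds to about 0.33.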
Third, estimates of the health outcomes were compared before and after weighting adjustments. Changes after weighting suggest that the adjustments reduced nonresponse bias, although the actual magnitude of the existing bias prior to and after nonresponse adjustment is unknown.
Finally, a level-of-effort analysis was conducted to evaluate the potential for nonresponse bias in health outcomes. This approach assumes that late responders are similar to nonrespondents, and so any differences between the late and early respondents would be an indication of nonresponse bias [18–20]. For this purpose, “early responders” were defined to include the first 20% of veterans who completed the survey (among those with ≥ 80% of the question items answered). The “late” responders are the 20% of veterans who responded at the end of the field period. Logistic regression models, using final weights, were used to predict health outcomes. Predictors in the regressions included the indicator for “late” responder and demographic and military service characteristics.
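The early/late split can be illustrated as follows, assuming respondents are already sorted by completion time. The function name and the "middle" label for the remaining 60% are hypothetical, and how ties or partial completes were handled in the study is not stated.

```python
def label_early_late(completion_order, share: float = 0.20):
    """Label the first and last `share` of respondents as early and late responders.

    completion_order: respondent IDs sorted from first to last completion (assumed input).
    """
    n = len(completion_order)
    k = int(n * share)  # group size, rounded down
    labels = {rid: "middle" for rid in completion_order}  # hypothetical label
    for rid in completion_order[:k]:
        labels[rid] = "early"
    for rid in completion_order[n - k:]:
        labels[rid] = "late"
    return labels
```

The resulting late-responder indicator would then enter the weighted logistic regressions alongside the demographic and military service covariates.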
Results
Patterns of response over time by group
Of the total Wave 4–eligible sample (n = 26,468), there were 4,671 (17.7%) always responders, 7,706 (29.1%) current responders, 9,516 (36.0%) past responders, and 4,575 (17.3%) never responders. After adjustment, the never responder and the past responder groups differed most from the always responder group (Table 1). Predictors of membership in the never responder group included not being deployed, being male, younger age, being Black, Hispanic, or other race/ethnicity, being enlisted, single/other marital status, being in the Navy, and being in the National Guard or reserve. The same variables, except for being in the National Guard/reserve and being male, predicted membership in the past responder group. In general, odds ratios were farther from 1 for those who never responded and past responders compared to current responders. This indicates that those who never responded and past responders were more different from those who always responded (the reference group), and that current responders were more similar to those who always responded. Similarly, many of the ORs for the past responders were farther from 1 than those for the current responders. For example, the adjusted OR for deployment for never responders is 0.319 (0.286–0.356), compared with 0.583 (0.532–0.638) for past responders and 0.687 (0.624–0.755) for current responders. This suggests an ordering in which never responders were least similar to always responders, followed by past responders, with current responders most similar. (Supplementary Table 1 provides the number of veterans in each response group by predictor and Supplementary Table 2 provides crude odds ratios.)
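Table 1. Adjusted odds ratios for never, past, and current responders relative to always responders

| Characteristic | Never responder AOR | Never responder CI | Past responder AOR | Past responder CI | Current responder AOR | Current responder CI |
|---|---|---|---|---|---|---|
| Deployment status | | | | | | |
| Not deployed | 1.00 | | 1.00 | | 1.00 | |
| Deployed | 0.32 | 0.29–0.36 | 0.58 | 0.53–0.64 | 0.69 | 0.62–0.76 |
| Sex | | | | | | |
| Male | 1.00 | | 1.00 | | 1.00 | |
| Female | 0.72 | 0.63–0.82 | 0.99 | 0.89–1.09 | 0.81 | 0.73–0.89 |
| Race/ethnicity | | | | | | |
| White | 1.00 | | 1.00 | | 1.00 | |
| Black | 4.50 | 3.86–5.25 | 2.12 | 1.83–2.47 | 2.26 | 1.95–2.63 |
| Hispanic | 1.57 | 1.22–2.01 | 1.21 | 0.97–1.50 | 1.16 | 0.91–1.48 |
| Other/unknown | 2.60 | 1.91–3.53 | 2.15 | 1.62–2.86 | 1.78 | 1.33–2.37 |
| Age in 1991 | | | | | | |
| 17–25 | 1.00 | | 1.00 | | 1.00 | |
| 26–32 | 0.52 | 0.45–0.60 | 0.65 | 0.57–0.75 | 0.72 | 0.63–0.83 |
| 33–39 | 0.23 | 0.18–0.28 | 0.37 | 0.31–0.43 | 0.52 | 0.44–0.61 |
| 40 and older/unknown | 0.13 | 0.10–0.17 | 0.26 | 0.22–0.32 | 0.34 | 0.28–0.41 |
| Marital status | | | | | | |
| Married | 1.00 | | 1.00 | | 1.00 | |
| Single | 1.40 | 1.18–1.67 | 1.24 | 1.09–1.41 | 1.03 | 0.92–1.17 |
| Other/unknown | 1.67 | 1.18–2.36 | 1.66 | 1.23–2.24 | 1.39 | 1.07–1.82 |
| Branch | | | | | | |
| Army | 1.00 | | 1.00 | | 1.00 | |
| Air Force | 1.01 | 0.83–1.24 | 1.12 | 1.01–1.24 | 1.14 | 1.00–1.29 |
| Marine Corps | 1.11 | 0.99–1.24 | 1.16 | 1.01–1.32 | 1.08 | 0.95–1.22 |
| Navy | 1.55 | 1.30–1.86 | 1.29 | 1.11–1.49 | 1.10 | 0.96–1.26 |
| Unit component | | | | | | |
| Active duty | 1.00 | | 1.00 | | 1.00 | |
| National Guard/reserves | 1.13 | 1.03–1.24 | 1.07 | 0.99–1.16 | 0.98 | 0.91–1.07 |
| Rank in 1991 | | | | | | |
| Enlisted | 1.00 | | 1.00 | | 1.00 | |
| Officer/warrant officer | 0.42 | 0.34–0.53 | 0.63 | 0.55–0.73 | 0.74 | 0.65–0.84 |

AOR: adjusted odds ratio; CI: 95% confidence interval. ORs are adjusted for all other covariates listed in the table.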
Wave 4 response by group
Response rates differed across all demographic and military service characteristics (Table 2). Higher response rates were observed among veterans who were deployed, older, male, White, in the Air Force, officers/warrant officers, and married in 1991. Results from the CART analysis (Table 3) revealed 20 groups with different response rates that were used to develop nonresponse adjustments to reduce the potential bias. The analysis indicates the potential for bias in the survey estimates due to differential representation of subgroups within the respondent sample.
Table 2. Wave 4 response rates by demographic and military service characteristics

| Characteristic | Total eligible, n (col %) | Responded, n (row %) | p-value |
|---|---|---|---|
| Total | 26,468 | 12,377 (46.8%) | |
| Deployment status | | | < 0.0001 |
| Deployed | 13,675 (51.7%) | 7,149 (52.3%) | |
| Nondeployed | 12,793 (48.3%) | 5,228 (40.9%) | |
| Sex | | | < 0.0001 |
| Female | 5,601 (21.2%) | 2,472 (44.1%) | |
| Male | 20,867 (78.8%) | 9,905 (47.5%) | |
| Race/ethnicity | | | < 0.0001 |
| Black | 6,007 (22.7%) | 2,339 (38.9%) | |
| Hispanic | 1,242 (4.7%) | 522 (42.0%) | |
| White | 18,268 (69.0%) | 9,139 (50.0%) | |
| Other | 907 (3.4%) | 355 (39.1%) | |
| Unknown | 44 (0.2%) | 22 (50.0%) | |
| Age in 1991 | | | < 0.0001 |
| 17–25 | 11,327 (42.8%) | 4,286 (37.8%) | |
| 26–32 | 7,601 (28.7%) | 3,552 (46.7%) | |
| 33–39 | 4,173 (15.8%) | 2,403 (57.6%) | |
| 40 and older | 3,342 (12.6%) | 2,124 (63.6%) | |
| Unknown | 25 (0.1%) | 12 (48.0%) | |
| Marital status in 1991 | | | < 0.0001 |
| Married | 12,871 (48.6%) | 6,856 (53.3%) | |
| Single | 12,375 (46.8%) | 4,888 (39.5%) | |
| Other | 1,191 (4.5%) | 617 (51.8%) | |
| Unknown | 31 (0.1%) | 16 (51.6%) | |
| Branch | | | < 0.0001 |
| Air Force | 3,035 (11.5%) | 1,623 (53.5%) | |
| Army | 16,801 (63.5%) | 7,822 (46.6%) | |
| Marine Corps | 3,111 (11.8%) | 1,363 (43.8%) | |
| Navy | 3,521 (13.3%) | 1,569 (44.6%) | |
| Unit component | | | 0.04 |
| Active duty | 10,866 (41.1%) | 5,002 (46.0%) | |
| National Guard | 6,753 (25.5%) | 3,241 (48.0%) | |
| Reserves | 8,849 (33.4%) | 4,134 (46.7%) | |
| Rank in 1991 | | | < 0.0001 |
| Enlisted | 22,873 (86.4%) | 10,142 (44.3%) | |
| Officer | 3,322 (12.6%) | 2,045 (61.6%) | |
| Warrant | 273 (1.0%) | 190 (69.6%) | |
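Table 3. Nonresponse adjustment cells and factors

| Group No. | Variable 1 | Variable 2 | Variable 3 | Variable 4 | Variable 5 | Eligible sample size | Response rate (%) |
|---|---|---|---|---|---|---|---|
| 1 | Age 17–25 (in 1991) | Not deployed | | | | 5,273 | 31.3 |
| 2 | Age 17–25 (in 1991) | Deployed | Married (in 1991) | Black, Hispanic, Other/Unknown | | 412 | 41.9 |
| 3 | Age 17–25 (in 1991) | Deployed | Married (in 1991) | White | Air Force | 103 | 63.1 |
| 4 | Age 17–25 (in 1991) | Deployed | Married (in 1991) | White | Navy | 170 | 46.2 |
| 5 | Age 17–25 (in 1991) | Deployed | Married (in 1991) | White | Army or Marine Corps | 814 | 52.3 |
| 6 | Age 17–25 (in 1991) | Deployed | Single, other, unknown | Officer | | 92 | 64.1 |
| 7 | Age 17–25 (in 1991) | Deployed | Single, other, unknown | Enlisted or Warrant | | 4,475 | 40.5 |
| 8 | Age 26–32 | Not deployed | Black or unknown | | | 949 | 33.0 |
| 9 | Age 26–32 | Not deployed | Hispanic, other, white | Officer, Warrant | Other, Single | 202 | 43.6 |
| 10 | Age 26–32 | Not deployed | Hispanic, other, white | Officer, Warrant | Married | 276 | 62.4 |
| 11 | Age 26–32 | Not deployed | Hispanic, other, white | Enlisted | | 2,234 | 41.9 |
| 12 | Age 26–32 | Deployed | Black or other | | | 1,133 | 44.6 |
| 13 | Age 26–32 | Deployed | Hispanic, unknown, White | | | 2,820 | 56.0 |
| 14 | Age 33+ | Black and other | Deployed | Marine Corps, Navy | | 124 | 45.4 |
| 15 | Age 33+ | Black and other | Deployed | Air Force, Army | | 725 | 59.2 |
| 16 | Age 33+ | Black and other | Not deployed | | | 908 | 44.4 |
| 17 | Age 33+ | Hispanic, unknown, white | Other, Single | Deployed | | 598 | 64.2 |
| 18 | Age 33+ | Hispanic, unknown, white | Other, Single | Not deployed | Air Force, Marine Corps | 145 | 54.1 |
| 19 | Age 33+ | Hispanic, unknown, white | Other, Single | Not deployed | Army, Navy | 437 | 45.0 |
| 20 | Age 33+ | Hispanic, unknown, white | Married, unknown | | | 4,578 | 66.2 |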
Comparisons between the eligible sample and responders
In the comparison of the eligible sample versus responders with base weights, there were differences in marital status, deployment status, rank, and age (Table 4). This suggests that prior to nonresponse adjustment, the characteristics of responders differed from those of the full eligible sample after accounting for the sample design. After applying final weights with nonresponse adjustment, all differences between the eligible sample estimates and final estimates were under 2 percentage points (or under 0.2 for mean age). Bias was reduced for all but two estimates: other and unknown marital status. For these two estimates, the bias in the final estimates was close to zero.
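Table 4. Comparison of estimates of demographic and military service characteristics with different weights

| Frame variable | Value | Eligible sample (1) % | SE | Base weighted (2) % | SE | p-value (2) vs. (1) | NR adjustment weighted (3) % | SE | p-value (3) vs. (1) | Final weighted (4) % | SE | p-value (4) vs. (1) | % reduction in bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Branch of service | Air Force | 11.8 | 0.09 | 13.4 | 0.32 | < 0.001 | 12.3 | 0.27 | 0.05 | 11.8 | 0.09 | 0.04 | 96.5 |
| | Army | 51.4 | 0.13 | 52.2 | 0.50 | 0.06 | 52.7 | 0.52 | 0.008 | 51.5 | 0.15 | 0.05 | 86.8 |
| | Marine | 15.2 | 0.08 | 14.4 | 0.31 | 0.004 | 15.2 | 0.31 | 1.0 | 15.2 | 0.08 | 0.7 | 98.7 |
| | Navy | 21.6 | 0.11 | 20.0 | 0.50 | < 0.001 | 19.8 | 0.55 | < 0.001 | 21.4 | 0.16 | < 0.001 | 88.8 |
| Type of service | Active | 78.2 | 0.08 | 78.5 | 0.28 | 0.2 | 78.7 | 0.29 | 0.06 | 78.2 | 0.10 | 0.5 | 87.5 |
| | Guard | 7.6 | 0.04 | 7.7 | 0.13 | 0.5 | 7.4 | 0.14 | 0.2 | 7.6 | 0.05 | 0.3 | 69.9 |
| | Reserve | 14.2 | 0.06 | 13.8 | 0.22 | 0.05 | 13.9 | 0.24 | 0.2 | 14.2 | 0.08 | 0.8 | 96.2 |
| Sex | Male | 89.3 | 0.05 | 90.3 | 0.19 | < 0.001 | 89.5 | 0.24 | 0.5 | 89.3 | 0.06 | 0.8 | 99.4 |
| | Female | 10.7 | 0.05 | 9.7 | 0.19 | < 0.001 | 10.5 | 0.24 | 0.5 | 10.7 | 0.06 | 0.8 | 99.4 |
| Deployment status | Deployed | 48.7 | 0.26 | 53.7 | 0.50 | < 0.001 | 49.1 | 0.30 | 0.001 | 48.6 | 0.34 | 0.8 | 98.9 |
| | Non-deployed | 51.3 | 0.26 | 46.3 | 0.50 | < 0.001 | 50.9 | 0.30 | 0.001 | 51.4 | 0.34 | 0.8 | 98.9 |
| Rank in 1991 | Enlisted | 86.4 | 0.29 | 81.6 | 0.51 | < 0.001 | 85.3 | 0.38 | < 0.001 | 85.2 | 0.38 | < 0.001 | 75.5 |
| | Officer | 12.4 | 0.27 | 16.5 | 0.47 | < 0.001 | 13.3 | 0.37 | < 0.001 | 13.4 | 0.36 | < 0.001 | 75.9 |
| | Warrant | 1.2 | 0.10 | 1.8 | 0.19 | < 0.001 | 1.4 | 0.14 | 0.009 | 1.4 | 0.14 | 0.02 | 72.7 |
| Age group in 1991 | 17–25 | 45.6 | 0.46 | 37.3 | 0.62 | < 0.001 | 45.6 | 0.46 | 0.8 | 45.6 | 0.45 | 1.0 | 99.9 |
| | 26–32 | 29.4 | 0.37 | 30.1 | 0.54 | 0.2 | 29.4 | 0.37 | 0.8 | 29.4 | 0.37 | 0.2 | 88.3 |
| | 33–39 | 16.0 | 0.33 | 20.3 | 0.53 | < 0.001 | 15.7 | 0.34 | 0.03 | 15.7 | 0.34 | 0.03 | 92.3 |
| | 40+ | 9.0 | 0.18 | 12.4 | 0.29 | < 0.001 | 9.3 | 0.23 | 0.03 | 9.4 | 0.24 | 0.008 | 88.0 |
| Age in 1991 | Mean (years) | 28.2 | 0.06 | 29.7 | 0.09 | < 0.001 | 28.3 | 0.07 | < 0.001 | 28.3 | 0.07 | < 0.001 | 93.3 |
| Race/ethnicity | Black | 22.1 | 0.30 | 18.1 | 0.46 | < 0.001 | 20.7 | 0.44 | < 0.001 | 20.5 | 0.44 | < 0.001 | 60.8 |
| | Hispanic | 4.9 | 0.19 | 4.6 | 0.24 | 0.09 | 4.7 | 0.26 | 0.4 | 4.7 | 0.26 | 0.3 | 35.5 |
| | Other | 4.0 | 0.15 | 3.4 | 0.25 | 0.002 | 3.8 | 0.24 | 0.2 | 3.8 | 0.25 | 0.2 | 60.2 |
| | Unknown | 0.1 | 0.02 | 0.1 | 0.03 | 0.9 | 0.1 | 0.03 | 0.8 | 0.1 | 0.03 | 0.9 | 47.2 |
| | White | 68.9 | 0.34 | 73.7 | 0.47 | < 0.001 | 70.8 | 0.42 | < 0.001 | 70.9 | 0.42 | < 0.001 | 59.2 |
| Marital status in 1991 | Married | 52.8 | 0.38 | 60.2 | 0.60 | < 0.001 | 53.8 | 0.60 | 0.02 | 53.6 | 0.59 | 0.07 | 89.8 |
| | Other | 3.3 | 0.14 | 3.3 | 0.19 | 0.7 | 3.2 | 0.17 | 0.7 | 3.2 | 0.17 | 0.5 | -44.4 |
| | Single | 43.9 | 0.37 | 36.5 | 0.62 | < 0.001 | 42.9 | 0.62 | 0.03 | 43.2 | 0.61 | 0.1 | 91.1 |
| | Unknown | 0.1 | 0.01 | 0.1 | 0.02 | 0.5 | 0.1 | 0.02 | 0.7 | 0.1 | 0.02 | 0.4 | -3.2 |

SE: standard error; NR: nonresponse.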
Analysis of health outcomes and sociodemographic characteristics measured in the survey
In general, the correlations between frame variables and health outcomes or sociodemographic characteristics are low to modest, with the strongest correlations for PTSD, GWI, and education (Table 5). The weighting adjustments are expected to be most effective in removing potential bias for variables with higher correlations. Low correlations indicate that the weighting adjustments have limited ability to reduce bias, should it be present.
Table 5. Proxy correlations between health outcomes and demographic and military service characteristics used in nonresponse adjustments

| Outcome or characteristic | Pseudo R-squared | Proxy correlation |
|---|---|---|
| General health | 0.08 | 0.28 |
| PTSD | 0.11 | 0.33 |
| GWI | 0.10 | 0.32 |
| # days smoke | 0.01 | 0.11 |
| # cigarettes smoked | 0.01 | 0.11 |
| Alcohol use frequency | 0.03 | 0.17 |
| Alcohol use quantity | 0.04 | 0.21 |
| Alcohol/drug dependence | 0.03 | 0.18 |
| Bipolar disorder or manic depression | 0.03 | 0.16 |
| Education | 0.11 | 0.33 |
| Household income | 0.06 | 0.25 |
Although differences were observed in some health outcomes and sociodemographic characteristics before and after weighting adjustments, all differences were under 2 percentage points (Table 6). The largest change occurred among respondents with a graduate or professional degree, with a base-weighted estimate of 23.0% compared to the final estimate of 21.3%. The change is in the expected direction, given that the weighting adjustments correct for the over-representation of officers among responders.
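Table 6. Estimates of health outcomes and sociodemographic characteristics (%) with different weights

| Outcome or characteristic | Base-weighted estimates (1) | NR adjustment weighted estimates (2) | p-value (2) vs. (1) | Final weighted estimates (3) | p-value (3) vs. (2) |
|---|---|---|---|---|---|
| General health | | | | | |
| Excellent | 3.9 | 3.7 | < 0.001 | 3.7 | 0.001 |
| Very good | 18.6 | 18.1 | < 0.001 | 18.2 | 0.06 |
| Good | 38.4 | 38.9 | 0.001 | 38.9 | 0.8 |
| Fair | 31.5 | 31.8 | 0.05 | 31.7 | 0.05 |
| Poor | 7.6 | 7.6 | 0.8 | 7.6 | 0.01 |
| PTSD: Yes | 29.6 | 30.7 | < 0.001 | 30.3 | < 0.001 |
| GWI: Yes | 12.1 | 11.7 | < 0.001 | 11.5 | < 0.001 |
| Alcohol or drug dependence: Yes | 10.1 | 10.6 | < 0.001 | 10.6 | 0.4 |
| Bipolar disorder or manic depression: Yes | 6.2 | 6.5 | < 0.001 | 6.5 | 0.3 |
| Alzheimer’s disease or dementia: Yes | 1.7 | 1.5 | < 0.001 | 1.5 | 0.1 |
| Smoking frequency | | | | | |
| 0 days smoked | 89.4 | 89.2 | 0.03 | 89.2 | 0.6 |
| 1 day smoked | # | 0.1 | 0.7 | # | 0.3 |
| 2 days smoked | 0.2 | 0.2 | 0.05 | 0.2 | 0.4 |
| 3 to 7 days smoked | 0.4 | 0.4 | 0.8 | 0.4 | 0.7 |
| 8 to 14 days smoked | 0.4 | 0.4 | 0.03 | 0.4 | 0.2 |
| 15 to 29 days smoked | 1.5 | 1.6 | 0.06 | 1.6 | 0.7 |
| 30 days smoked | 8.1 | 8.1 | 0.5 | 8.1 | 0.2 |
| Smoking quantity | | | | | |
| Did not smoke in the past 30 days | 89.6 | 89.4 | 0.03 | 89.4 | 0.5 |
| 1 or 2 cigarettes per day | 0.8 | 0.9 | 0.06 | 0.9 | 1.0 |
| 3 to 10 cigarettes per day | 4.0 | 4.2 | 0.06 | 4.2 | 0.7 |
| 11 to 20 cigarettes per day | 4.1 | 4.1 | 0.3 | 4.1 | 0.1 |
| 21 or more cigarettes per day | 1.4 | 1.4 | 1.0 | 1.4 | 0.5 |
| Alcohol use frequency | | | | | |
| Never | 25.6 | 24.9 | < 0.001 | 24.9 | 1.0 |
| Monthly or less | 25.2 | 25.6 | 0.03 | 25.5 | 0.7 |
| Two to four times a month | 17.2 | 17.4 | 0.02 | 17.5 | 0.2 |
| Two to three times a week | 14.9 | 15.1 | 0.07 | 15.1 | 0.8 |
| Four or more times a week | 17.1 | 17.0 | 0.7 | 17.0 | 0.8 |
| Alcohol use quantity | | | | | |
| None | 28.5 | 27.7 | < 0.001 | 27.7 | 1.0 |
| 1 or 2 drinks | 47.1 | 46.6 | 0.003 | 46.5 | 0.7 |
| 3 or 4 drinks | 15.7 | 16.4 | < 0.001 | 16.4 | 0.2 |
| 5 or 6 drinks | 5.4 | 5.7 | < 0.001 | 5.7 | 0.7 |
| 7 to 9 drinks | 1.8 | 2.0 | < 0.001 | 2.0 | 0.5 |
| 10 or more drinks | 1.5 | 1.7 | 0.001 | 1.6 | 0.07 |
| Educational attainment | | | | | |
| High school or below | 14.9 | 15.5 | < 0.001 | 15.5 | 0.5 |
| Some college or associates | 41.9 | 42.8 | < 0.001 | 42.7 | 0.2 |
| Bachelor’s degree | 20.3 | 20.5 | 0.1 | 20.5 | 0.9 |
| Graduate or professional degree | 23.0 | 21.3 | < 0.001 | 21.3 | 0.5 |
| Household income | | | | | |
| 0–34,999 | 9.6 | 9.8 | 0.01 | 9.9 | 0.2 |
| 35,000–49,999 | 10.1 | 10.0 | 0.04 | 10.0 | 0.4 |
| 50,000–74,999 | 20.1 | 20.0 | 0.5 | 20.0 | 0.8 |
| 75,000–99,999 | 18.4 | 18.2 | 0.05 | 18.1 | 0.2 |
| 100,000+ | 41.8 | 42.0 | 0.2 | 42.1 | 0.4 |

(#) Rounds to zero. NR: nonresponse.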
Analysis of early versus late responders
Several estimates of physical and mental health differed between the early and late responders (Table 7). This analysis assumes that late responders are more similar to nonrespondents than early responders; thus, differences may indicate potential nonresponse bias. If late responders and nonrespondents are different, using late responders as a proxy for nonrespondents may be unreliable. Late responders were more likely than early responders to report alcohol or drug dependence, daily smoking, smoking half a pack or more per day, and having seven or more drinks per occasion. Odds ratios for poor health, bipolar disorder/manic depression, Alzheimer’s disease/dementia, and drinking four or more times per week were also elevated but did not reach statistical significance. (Supplementary Table 3 provides the number of early and late responders by each health outcome and sociodemographic characteristic.)
Table 7. Adjusted odds ratios for late versus early responders on health outcomes and sociodemographic characteristics

| Health outcome or sociodemographic characteristic | AOR (late versus early) | CI | p-value |
|---|---|---|---|
| General health status | | | |
| Poor | 1.64 | 0.98–2.72 | 0.06 |
| Fair to excellent | 1.00 | | |
| PTSD | | | |
| Yes | 1.02 | 0.83–1.25 | 0.8 |
| No | 1.00 | | |
| GWI | | | |
| Yes | 0.82 | 0.61–1.11 | 0.2 |
| No | 1.00 | | |
| Past month smoking frequency | | | |
| Daily | 1.87 | 1.41–2.47 | < 0.0001 |
| Less than daily | 1.00 | | |
| Past month smoking quantity | | | |
| 10 or more cigarettes daily | 1.79 | 1.25–2.57 | 0.002 |
| Fewer than 10 cigarettes daily | 1.00 | | |
| Past year alcohol frequency | | | |
| 4 or more times per week | 1.25 | 0.99–1.58 | 0.06 |
| Fewer than 4 times per week | 1.00 | | |
| Drinks per occasion | | | |
| 7 or more drinks | 1.69 | 1.05–2.37 | 0.03 |
| Fewer than 7 drinks | 1.00 | | |
| Alcohol/drug dependence | | | |
| Yes | 1.54 | 1.13–2.10 | 0.007 |
| No | 1.00 | | |
| Bipolar or manic depression | | | |
| Yes | 1.44 | 0.99–2.11 | 0.06 |
| No | 1.00 | | |
| Alzheimer’s disease or dementia | | | |
| Yes | 1.58 | 0.83–3.01 | 0.2 |
| No | 1.00 | | |
| Education | | | |
| Graduate or professional degree | 0.59 | 0.42–0.82 | 0.002 |
| Bachelor’s degree or lower | 1.00 | | |
| Household income | | | |
| 100,000 | 1.00 | | |

ORs adjusted for age, sex, rank, unit component, deployment status, branch, race/ethnicity, and marital status in 1991. AOR: adjusted odds ratio; CI: 95% confidence interval.
Discussion
GWECS is the largest and longest-running longitudinal cohort study of Gulf War veterans. It has contributed much of what is known about the health effects of deployment to the Gulf and has resulted in more than 30 peer-reviewed publications. Wave 4 obtained a response rate of 47%, which is very close to the 50% response rate obtained in Wave 3 over a decade earlier. As such, Wave 4 was a success.
Wave 4 of the study encountered differential response similar to that observed in other longitudinal studies of the general population and of veterans. Three groups of veterans with a history of nonresponse differed from those who responded to all four waves of the study. Many of the factors associated with never responding were also associated with inconsistent response across survey waves, including non-deployed status, enlisted rank, younger age, Black/Hispanic/other race/ethnicity, and single/other marital status in 1991. However, nonresponse was more a matter of degree than kind: those who never responded differed most from those who always responded on the characteristics examined, followed by past responders and then current responders.
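The four response patterns can be made concrete with a small sketch. The definitions used here are an assumption for illustration, since this section does not spell them out: an always responder completed all four waves, a never responder completed none, a current responder completed Wave 4 but missed at least one earlier wave, and a past responder completed at least one earlier wave but not Wave 4. The function name `response_pattern` is hypothetical.

```python
def response_pattern(responded):
    """Classify a veteran's response history.

    responded: sequence of four booleans for Waves 1 through 4.
    The category definitions are assumed for illustration and may
    differ from the operational definitions used in the study.
    """
    if len(responded) != 4:
        raise ValueError("expected one response flag per wave (4 waves)")
    if all(responded):
        return "always responder"
    if not any(responded):
        return "never responder"
    if responded[-1]:
        # Responded in Wave 4 but missed at least one earlier wave.
        return "current responder"
    # Responded in some earlier wave but not in Wave 4.
    return "past responder"

histories = {
    "A": (True, True, True, True),
    "B": (False, True, False, True),
    "C": (True, True, False, False),
    "D": (False, False, False, False),
}
patterns = {vid: response_pattern(h) for vid, h in histories.items()}
print(patterns)
```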
Weighting adjustments effectively accounted for differences and reduced bias in demographic and military characteristics. Estimates of health outcomes were similar before and after weighting adjustment. Given the modest correlations between health outcomes and the demographic and military characteristics used in weighting, weighting is likely to have had limited impact on possible bias in health outcomes. Because there are no benchmarks for health outcomes for a unique survey like GWECS, we compared several health outcomes between early and late responders. Estimates of alcohol or drug dependence, as well as alcohol use and smoking, were higher among the late responders. Assuming that late responders are similar to nonresponders, this could indicate possible bias in these estimates.
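To illustrate the general idea behind a nonresponse weighting adjustment, here is a minimal sketch of a weighting-class adjustment, in which respondents' base weights are inflated by the inverse of the weighted response rate in their cell. This is a generic textbook technique and not necessarily the procedure used for GWECS; the cells, weights, and the function name `nonresponse_adjusted_weights` are hypothetical.

```python
from collections import defaultdict

def nonresponse_adjusted_weights(sample):
    """Weighting-class nonresponse adjustment.

    sample: list of dicts with keys 'cell', 'base_weight', 'responded'.
    Returns one adjusted weight per unit (0.0 for nonrespondents).
    """
    total = defaultdict(float)      # sum of base weights per cell
    responding = defaultdict(float) # sum of respondents' base weights per cell
    for unit in sample:
        total[unit["cell"]] += unit["base_weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["base_weight"]

    adjusted = []
    for unit in sample:
        if unit["responded"]:
            # Inverse of the weighted response rate within the cell.
            factor = total[unit["cell"]] / responding[unit["cell"]]
            adjusted.append(unit["base_weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted

# Hypothetical sample: 3 of 4 deployed and 2 of 4 non-deployed veterans respond.
sample = (
    [{"cell": "deployed", "base_weight": 10.0, "responded": r}
     for r in (True, True, True, False)]
    + [{"cell": "non_deployed", "base_weight": 10.0, "responded": r}
       for r in (True, True, False, False)]
)
weights = nonresponse_adjusted_weights(sample)
print(weights)
```

Respondents in the cell with the lower response rate receive larger weights, so the weighted respondents reproduce each cell's share of the full sample. The adjustment can only correct an outcome to the extent that the outcome varies across the cells, which is why weak correlations between health outcomes and the weighting variables limit its effect on those outcomes.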
Several limitations to the analysis should be noted. First, reasons for nonresponse were not distinguished, such as inability to contact (e.g., did not receive the mailed invitation because of a change of address) or refusal (e.g., received the mailed invitation but chose not to respond). Veterans who refused to participate may have different demographic and military service characteristics than those who could not be contacted. Second, individual response patterns related to survey responses on physical and mental health at Wave 1 could not be examined because not all veterans completed the Wave 1 survey. Future analyses should focus on those who participated in Wave 1 and model their future participation typologies based on baseline health indicators. In addition, because physical and mental health are likely to change over time, examining how these factors at each wave impact subsequent response could yield important insights. Finally, possible bias in self-reported health outcomes may make it difficult to disentangle the effects of nonresponse bias from measurement error.
The findings suggest the importance of continuing to invest effort in maintaining participation among veterans whose characteristics are associated with nonresponse. One of the factors most strongly associated with response over time was being deployed. This may be because deployed veterans have greater health concerns, which could increase their interest in participating in an epidemiological study. Deployment may also have contributed to stronger identification with the cohort, further encouraging participation. For those who were not deployed, it will be important to continue to emphasize why participation matters for the study as a whole. Future cohort studies may also wish to conduct targeted oversampling of groups that are likely to be underrepresented. While nonresponse adjustments can mitigate bias, they are not a replacement for collected data, especially when performing stratified analysis. GWECS and other epidemiological studies of veterans will need to continue to focus on efforts to increase participation among underrepresented groups.
It has now been over 30 years since the 1990–1991 Gulf War, yet many concerns and unanswered questions remain about the long-term health effects of service in the Gulf War. The aging of the Gulf War veteran population confounds many of these issues: almost all cancers and chronic diseases are associated with aging, which makes it difficult to untangle age-related morbidity from deployment-related exposures. There are many questions surrounding the health of women Gulf War veterans, an understudied population, as they age. As the only study following a population-based cohort that includes a 20% sample of women Gulf War and Gulf War Era veterans, GWECS has contributed greatly to the literature in this area [21]. Overall, GWECS is an invaluable resource for the study of the health effects of deployment-related toxic exposures, the health effects of military service, and gender-specific health effects.
Conclusion
Despite declining survey response rates [22], GWECS successfully maintained its response rate after a decade. Findings on the demographic and military service characteristics associated with response patterns over time provide useful information on groups that may require additional effort and resources to encourage survey completion. While the use of nonresponse weights reduced bias in demographic and military service characteristics, the analysis showed that late responders were more likely than early responders to report alcohol or drug dependence, daily smoking, smoking more than half a pack per day, and having seven or more drinks per occasion. These results are consistent with other longitudinal surveys in which nonresponders and late responders tend to engage in less healthy lifestyles [23] and riskier health behaviors than responders, including an increased risk of alcohol-, smoking-, and drug-related mortality and morbidity [24]. Assuming that late GWECS responders are more similar to nonresponders than early responders are, these findings suggest that, even with the nonresponse adjustment, the levels of alcohol and drug dependence may be underestimated. These findings may be useful to other studies of veterans as well as general population studies.
Supplementary Information
Supplementary Material 1.
References
- 1. Wang J, Bartlett M, Ryan L. On the impact of nonresponse in logistic regression: application to the 45 and Up study. BMC Med Res Methodol. 2017;17(1):80. doi:10.1186/s12874-017-0355-z. PMCID: PMC5422892. PMID: 28482809.
- 2. Sharp M, Franchini S, Jones R, Wessely S, Stevelink S, Fear N. Health and wellbeing study of serving and ex-serving UK armed forces personnel: phase 4. September 2024. https://kcmhr.org/pdf/Phase_4_Health_and_Wellbeing_Cohort_Study_Report.pdf.
- 3. Dursa EK, Cao G, Culpepper WJ, Schneiderman A. Comparison of health outcomes over time among women 1990–1991 Gulf War veterans, women 1990–1991 Gulf Era veterans, and women in the U.S. general population. Women’s Health Issues. 2023 Jul 24. doi:10.1016/j.whi.2023.06.006. PMID: 37495424.
- 4. Christensen AI, Ekholm O, Gray L, Glümer C, Juel K. What is wrong with non-respondents? Alcohol-, drug- and smoking-related mortality and morbidity in a 12-year follow-up study of respondents and non-respondents in the Danish Health and Morbidity Survey. Addiction. 2015;110(9):1505-12. doi:10.1111/add.12939. PMCID: PMC4538793. PMID: 25845815.
