Original Research
Missing data on the Center for Epidemiologic Studies Depression Scale: A comparison of 4 imputation techniques

https://doi.org/10.1016/j.sapharm.2006.04.001

Abstract

Background

Missing data are widespread in the medical sciences. Given their prevalence, researchers must be prepared to address problems that arise when data are missing.

Objectives

The objectives were to (1) provide an estimate of bias for each imputation technique with known values from data engineered to be missing completely at random; (2) determine whether different Center for Epidemiologic Studies Depression (CES-D) Scale scores were obtained from item-mean, person-mean, regression, and hot-deck imputation techniques and whether they differed from the CES-D score obtained from complete cases; and (3) determine whether the variables that predicted the CES-D scores were the same for the complete cases and each of the 4 imputation techniques.

Methods

Depressive symptoms were assessed in patients (N = 2,317) in an international clinical trial comparing high blood pressure treatments between April 1, 1999, and October 31, 1999. Patients were mailed surveys after randomization. Depressive symptoms were measured using the CES-D Scale. Respondents who completed all 20 items were compared with those who did not complete all 20 items, using independent t tests and chi-square tests. Z scores were used to determine CES-D mean differences, and multiple regression models were used to predict the CES-D scores for the 4 imputation techniques and the complete case data.

Results

Imputed CES-D mean scores ranged from 14.58 to 14.68. The 4 imputed CES-D mean scores were consistently, but not significantly, higher than the complete case CES-D mean of 14.06. Imputed mean scores were similar to each other and the complete case mean score. Four regression models predicting the imputed CES-D scores yielded similar predictions. With the exception of sex, the same variables predicted the complete case CES-D and the imputed CES-D scores.

Conclusions

All the imputed means were similar to the complete case mean, with the exception of the regression imputation. Imputing missing data did not significantly alter the conclusions regarding which factors were associated with variations in CES-D scores. Since imputation has the potential to increase statistical power, researchers dealing with missing CES-D scores should consider imputing missing data.

Introduction

One area of the social and medical sciences with significant patient-reported outcome implications is depression. However, as in the rest of the medical and social sciences, missing data are widespread in studies that include patients' self-reports of depression or depressive symptoms. In the worst case, authors of published studies may not even mention whether they encountered missing data. For example, the Center for Epidemiologic Studies Depression (CES-D) Scale1 is a widely used indicator of depression in population-based studies. Authors of 10 of 25 recent, randomly selected articles using the CES-D did not mention missing data (the list of articles is available from the author upon request). In other cases, the authors simply mentioned that data were missing and then conducted the analysis using only the complete cases. Complete case analysis, whereby only cases with complete data are included in the analysis, has its own concerns and limitations.2 In the case of the CES-D, authors of 11 of the 25 articles noted missing data, but used only the complete cases in their analyses with no assessment of potential biases. In other words, 21 of the 25 (84%) recent studies using the CES-D for population-based estimations and clinical outcome studies either did not mention whether data were missing or eliminated all persons with missing data, potentially introducing bias and influencing the results. Given the prevalence of missing data, researchers must be better prepared to address problems that arise when data are missing.

This investigation had 2 purposes. The primary purpose was to evaluate the results of using 4 single-value imputation strategies for the measurement of depressive symptoms with the CES-D Scale and to study the impact of imputing missing data on the conclusions. Information regarding the similarities and differences among these imputation techniques in the conclusions drawn will be useful to researchers who are struggling with missing CES-D data and will provide them with a starting point for conducting their own systematic comparison. It may provide them with some confidence regarding their own study's interpretations when data are missing or encourage them to implement a strategy to ameliorate the power and bias consequences of missing data.3 However, before embarking on the primary purpose, a secondary purpose was to provide a brief rationale and a review of the methods used to impute the missing data in this study. A number of articles and textbooks are available for a complete review of the issue of missing data.2, 4, 5, 6, 7, 8, 9 We focused on the rationale and a brief review of the issues we faced in deciding whether to impute missing data in our own work with the CES-D in this investigation.10, 11

A plethora of issues must be considered when making decisions regarding whether missing data should be imputed and the techniques that should be used. As mentioned earlier, this article cannot completely review these issues. Some of the more important issues include single versus multiple imputation, normal versus nonnormal data issues (eg, categorical data), models that include mixed levels of measurement (ie, ordinal, ratio), and nonparametric techniques (eg, hot deck) versus parametric techniques (eg, multiple imputation). The reader is directed to more complete discussions of these issues in published authoritative resources.2, 4, 5, 6, 7, 8, 9

However, in all studies the first consideration in determining whether imputation is appropriate is to examine the source and reason for the missing data.2 In some cases, data from a study subject are missing because he or she did not return the survey. This situation is sometimes referred to as missing “units.” This article does not examine strategies for imputing missing units or for assessing the reasons for or effects of unit nonresponse. In other instances, the data collection instrument may have been returned but may be missing specific items within the survey. Accurately ascertaining the reasons the data are missing is important to any decision regarding whether to impute data, regardless of whether the data involve missing units or items. In either case, the primary issue is whether the nonresponse is “ignorable.”

An item or questionnaire is said to be missing completely at random (MCAR) if the missing assessments are independent of all previous, current, and future assessments had they been observed.12 Thus, when an item or questionnaire is MCAR, cases with complete data are indistinguishable from cases with incomplete data.13 Missing data due to accidental death, the respondent moving out of town, or staff forgetting to administer the instrument due to a random incident may yield an item or questionnaire MCAR.14

Data are termed missing at random (MAR) if the missing values of the dependent variable depend on the independent variable(s) but do not depend on the dependent variable.15 Thus, when an item or questionnaire is MAR, cases with incomplete data are different from cases with complete data, but the pattern of missing data is traceable or predictable from other variables in the database rather than being due to the specific variable on which the data are missing.4 The cause of the missing data is some external influence. For MAR, the probability of having a missing questionnaire may depend on previous scores, but must be independent of both current and future scores.2, 16 For example, if patients with a poorer quality-of-life score at the previous assessment are more likely to have a missing questionnaire at the current assessment, then the missing questionnaire is not MCAR, but instead is MAR.2, 16 MAR is a less stringent assumption than MCAR.

An item or questionnaire is said to be missing not at random (MNAR) if the missing values are not predictable from other variables in the database but are predictable from the variable on which the data are missing.2, 16 For example, if a participant in a depression study does not return a questionnaire because of increased feelings of depression, these data are MNAR. Thus, data are MNAR if the data missingness is explainable and only explainable by the very variables on which the data are missing. Bias associated with the missing unit or items is “nonignorable” in this situation. Whenever the probability of dropout depends on at least one unobserved score, then the process is termed MNAR.2, 16 With MNAR the probability of having a missing questionnaire depends on scores in current and future unobserved assessments.2, 16 With MNAR, the missingness may be influenced by values on the missing variable and on its relationships with other variables. In this case, strategies are available to assess the degree of bias, including instrumental variables,17, 18 propensity score calibration,19, 20 and Heckman's correction.21 Examples of data that are MNAR in a typical clinical trial of a pharmaceutical agent are data associated with increased toxicity, progressive disease, or death.14
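The three mechanisms described above can be stated compactly in notation that is common in the missing-data literature. The symbols are introduced here for illustration and are not the authors' own: R indicates which values are missing, Y_obs and Y_mis are the observed and unobserved parts of the data, and φ denotes the parameters of the missingness process.

```latex
\begin{align*}
\text{MCAR:} \quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, \phi) = P(R \mid \phi) \\
\text{MAR:}  \quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, \phi) = P(R \mid Y_{\text{obs}}, \phi) \\
\text{MNAR:} \quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, \phi) \text{ depends on } Y_{\text{mis}}
\end{align*}
```

Read this way, MCAR is the special case of MAR in which missingness does not depend on the observed data either, which is why MAR is the less stringent assumption.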

Next, after determining the reason for the missing data and whether it is ignorable or nonignorable, the decision regarding whether or not to impute missing data and what strategy to use to impute the data should be examined (Fig. 1).6 Until recently, the complete case approach to analyzing the data was the standard, whether the reasons for missing data were ignorable or not. The complete case approach discards all cases with unit or item nonresponse, and the analysis is conducted using only the complete cases whether partial information is available or not. For example, in the case of the CES-D, all of an individual subject's responses would be discarded if the response to one item was missing, even though the subject had completed the other 19 items of the scale. However, as noted earlier, the complete case approach requires the missing data to be MCAR to yield unbiased results. The complete case approach is best suited to situations where the extent of missing data is small, the sample size is large enough to allow for the deletion of the cases with missing data without severely jeopardizing statistical power, and the relationships in the data are strong so as not to be biased by any missing data process.12, 15 The complete case approach is simple2, 16 and requires little effort from the researcher because it is the default in most statistical software packages. A smaller sample size is the most apparent disadvantage. With the complete case approach, the remaining sample size may be inadequate for conducting statistical analyses with sufficient power. Another disadvantage of the complete case approach is the loss of other information for an individual respondent on all other variables when incomplete cases are discarded due to item nonresponse on a single variable.2, 16 Finally, unless the data are MCAR (ie, the complete cases are a random subsample of all cases), discarding incomplete cases will result in a biased sample because of possible systematic differences between the complete cases and the incomplete cases.22 If the data are not MCAR, then other means of handling the missing data should be used, including data imputation.
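The complete case approach described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used in the study: the data are hypothetical, 4 items stand in for the 20 CES-D items, and the function name `complete_cases` is introduced here.

```python
from statistics import mean

# Hypothetical toy data: each row is one respondent's item scores (0-3);
# None marks an unanswered item. Four items are used instead of 20 only
# to keep the example short.
responses = [
    [1, 0, 2, 1],
    [3, None, 2, 2],
    [0, 0, 1, 0],
    [2, 3, None, None],
    [1, 1, 1, 2],
]


def complete_cases(rows):
    """Keep only respondents who answered every item (listwise deletion)."""
    return [row for row in rows if all(v is not None for v in row)]


kept = complete_cases(responses)
scale_scores = [sum(row) for row in kept]

print(f"{len(kept)} of {len(responses)} respondents retained")
print("complete case mean scale score:", mean(scale_scores))
```

Note that the second respondent is dropped entirely even though only one item is missing, which is the loss of information the text describes.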

Various single-value imputation approaches, mixed effects and pattern mixture models, and multiple imputation methods are available to the researcher (Fig. 1).6 Imputation is the process of estimating the missing value(s) based on valid values of other variables and/or cases in the sample.15 The objective of imputation is to use known relationships that can be identified in the valid values of the sample to assist in estimating the values that are missing.15 The complete case approach (listwise deletion) removes the problem of missing data by removing these observations, whereas the imputation approach removes the problem of missing data by filling in values for the unknown or missing data.16 Data missing at random, either MCAR or MAR, are more amenable to imputation techniques.23 Standard analytic methods can be used once missing values have been imputed.

A variety of imputation techniques have been proposed in the literature. Some techniques are mathematically and computationally difficult to apply because missing values are estimated based on valid values of other variables and/or cases in the sample.15 Methods of imputation include item-mean imputation, person-mean imputation, cold-deck imputation, hot-deck imputation, last value carried forward imputation, worst value imputation, best value imputation, random imputation, regression imputation, Monte Carlo and other stochastic methods, and multiple imputation plus their combinations.

Although a number of single-value and multiple imputation strategies are available,6 for this investigation we chose to focus on 4 single-value imputation techniques, including item-mean, person-mean, regression, and hot-deck imputation, because we determined the data to be MAR.24 These 4 imputation techniques vary in terms of complexity, but all are commonly used and were used in the present study for a number of reasons. Item-mean imputation was chosen because it is the most commonly used imputation technique for missing data on the CES-D and is included in most statistical software. Person-mean imputation was chosen because it is the only other imputation technique used with missing data on the CES-D to date. Regression imputation was chosen because this technique has not been used with missing CES-D data, and it purposefully uses existing relationships to predict the missing value, whereas the other 3 imputation techniques chosen do not. Finally, metric-matching hot-deck imputation was chosen because it is used with ignorable nonresponse (MAR) and when predictor variables are both categorical and metric.5

Item-mean imputation replaces an individual's missing values on an item with the mean for that item calculated by using the scores of all study respondents who completed that item.23 An advantage of item-mean imputation is its simplicity with respect to its application.15 Very little programming is required to use this imputation technique. Compared with other imputation methods, its major disadvantage is that the actual distribution of values is distorted by substituting the mean for all missing values, thereby causing the variance to be artificially reduced.3, 16 Another disadvantage is that it depresses the observed correlations because all missing values are replaced by a constant.15 Given these disadvantages, item-mean imputation may be prone to producing biased estimates of the missing values.7
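A minimal sketch of item-mean imputation follows, assuming responses are stored as rows of item scores with `None` for a missing response. The data and the function name `item_mean_impute` are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: rows are respondents, columns are items (0-3);
# None marks a missing response.
responses = [
    [1, 0, 2],
    [3, None, 2],
    [0, 2, None],
    [2, 1, 1],
]


def item_mean_impute(rows):
    """Replace each missing value with the mean of that item (column),
    computed from every respondent who answered the item."""
    n_items = len(rows[0])
    item_means = []
    for j in range(n_items):
        answered = [row[j] for row in rows if row[j] is not None]
        item_means.append(mean(answered))
    return [
        [item_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


imputed = item_mean_impute(responses)
for row in imputed:
    print(row)
```

Every respondent missing a given item receives the same value, which is the source of the variance reduction noted above.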

Person-mean imputation requires substitution of the mean of all of an individual's completed items for those items that were not completed on a given scale.23 This differs from item-mean where the mean response of the whole sample that responded to the item is substituted. Person-mean imputation could result in different substitutions for each person with missing items. On the plus side, because it does not substitute a constant value, it does not artificially reduce the measure's variability and is less likely to attenuate the correlation. A disadvantage is that it tends to inflate the reliability estimates as the number of missing items increases.23 However, when the numbers of either respondents with missing items or items missing within scales are 20% or less, both item-mean imputation and person-mean imputation provide good estimates of the reliability of measures.23
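Person-mean imputation can be sketched the same way, with the mean taken across a respondent's own answered items instead of across respondents. The data and the function name `person_mean_impute` are hypothetical, and leaving a fully unanswered scale untouched is a choice made for this sketch.

```python
from statistics import mean

# Hypothetical toy data: rows are respondents, columns are items (0-3);
# None marks a missing response.
responses = [
    [1, 0, 2, 1],
    [3, None, 2, 1],
    [0, 0, None, None],
]


def person_mean_impute(rows):
    """Replace each missing value with the mean of the items that the same
    respondent did answer on the scale."""
    result = []
    for row in rows:
        answered = [v for v in row if v is not None]
        if not answered:
            # Nothing to impute from; leave the row as it is.
            result.append(list(row))
            continue
        person_mean = mean(answered)
        result.append([person_mean if v is None else v for v in row])
    return result


imputed = person_mean_impute(responses)
for row in imputed:
    print(row)
```

Here the second and third respondents receive different substituted values, in contrast to the single constant used by item-mean imputation.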

Regression imputation substitutes a predicted value obtained by regressing the variable with missing values on other variables, using individuals with complete data. It assumes that other variables selected from the data are related to the missing variables.23 For example, when applying regression imputation to a missing value in a depression scale, the depression item is regressed on determinants of depression such as age and sex. Its major advantage is an unbiased point estimate of the missing value. The resulting rectangular data set is larger, and the bias in the depression score variable is reduced because the multivariate distribution of known variables is used in determining estimates for missing values.22 Although regression imputation has the appeal of using relationships already existing in the sample as the basis of prediction, it also has disadvantages. First, it reinforces already existing relationships in the data; thus, the resulting data become more characteristic of the sample and less generalizable to the population. Second, regression imputation assumes that the variable with missing data has substantial correlations with the other variables, an assumption that may or may not hold true.
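The following sketch illustrates regression imputation with a single predictor. It is a simplification made for illustration: the text describes regressing on several determinants such as age and sex, whereas this example uses age alone, fits ordinary least squares by hand, and uses hypothetical data and function names.

```python
from statistics import mean

# Hypothetical toy data: (age in years, score on one depression item or None).
records = [
    (50, 0),
    (60, 1),
    (70, 1),
    (80, 2),
    (75, None),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def regression_impute(rows):
    """Fit the line on complete cases, then predict the missing scores."""
    complete = [(x, y) for x, y in rows if y is not None]
    intercept, slope = fit_line(
        [x for x, _ in complete], [y for _, y in complete]
    )
    return [
        (x, intercept + slope * x if y is None else y) for x, y in rows
    ]


imputed = regression_impute(records)
print(imputed[-1])  # the 75-year-old now has a predicted score
```

The predicted value lies exactly on the fitted line, with no residual variation added, which is one way to see how this technique reinforces relationships already present in the sample.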

Finally, hot-deck imputation selects a value from similar respondents with completed items, usually within the same data (vs cold-deck, which uses data from another source using the same item), and substitutes the selected value for the respondent's missing value.16 The “deck” refers to responses from those with complete items from which the researcher may select a value.16, 25 The value may be selected simply at random or by using an elaborate scheme, such as selecting a value from only those respondents who have similar characteristics, such as sex, age, or treatment group.25 For example, if a 35-year-old woman was missing a data point on item 1, a matrix of values of all 35-year-old female respondents with completed data would be created. A random value would be selected from the matrix of values, and that value would be substituted for the missing item. One advantage of the hot-deck method is its conceptual simplicity. Another advantage is that it maintains the proper measurement level of variables. In contrast, item-mean imputation and person-mean imputation would substitute an “average” for missing nominal variables, such as sex or race. Among several disadvantages of the hot-deck method is the extensive programming required when values are not simply selected at random from the whole deck, because selecting values from similar respondents requires customized syntax. Another disadvantage is the ambiguity surrounding the definition of “similar,” which may vary from one researcher to another, creating uncertainty around the results. Carefully articulated definitions of “similar” substantially reduce the ambiguity. In our study, hot-deck values were obtained by randomly selecting a single value from a matrix of persons with similar health status, sex, education, race, and age and then substituting that value for the missing value of the respondent in question.
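A matching hot-deck draw of the kind described above might look like the following sketch. It is not the authors' syntax: it matches on sex and age only (the study also used health status, education, and race), requires exact matches, and uses hypothetical data, field names, and function names.

```python
import random

# Hypothetical respondents: matching variables plus one item score
# (None = missing).
respondents = [
    {"sex": "F", "age": 35, "item1": 2},
    {"sex": "F", "age": 35, "item1": 0},
    {"sex": "F", "age": 35, "item1": None},
    {"sex": "M", "age": 35, "item1": 3},
    {"sex": "M", "age": 62, "item1": None},
]
MATCH_KEYS = ("sex", "age")


def hot_deck_impute(rows, item, match_keys, rng):
    """Fill each missing value with a value drawn at random from donors
    (respondents with the item observed) who share the matching variables."""
    result = []
    for row in rows:
        filled = dict(row)
        if row[item] is None:
            donors = [
                d[item]
                for d in rows
                if d[item] is not None
                and all(d[k] == row[k] for k in match_keys)
            ]
            if donors:  # leave the value missing when no donor matches
                filled[item] = rng.choice(donors)
        result.append(filled)
    return result


rng = random.Random(0)  # fixed seed so the draw is reproducible
imputed = hot_deck_impute(respondents, "item1", MATCH_KEYS, rng)
for row in imputed:
    print(row)
```

The 35-year-old woman receives one of the two values observed among other 35-year-old women, while the 62-year-old man has no matching donor and stays missing. How strictly "similar" is defined therefore determines how many values can be filled.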

We acknowledge the availability of numerous methods of imputing data, including single-value and multiple imputation techniques. However, as our primary purpose was to compare (1) various imputation methods with the complete case analysis and (2) a single-value imputation method (person-mean) used in the CES-D literature with other alternatives, we chose to compare 4 single-value item imputation models with each other and with the complete case approach for data we considered to be MAR.24 We chose not to conduct multiple imputations or investigate potential biases because we considered the data to be MAR and the reasons for missing data to be ignorable. Moreover, we chose to heed the advice of Rubin,26 who stated that “my own assessment is that unless a user has the resources … the iterative versions of software for creating multiple imputations are not yet ready for reliable applications by the typical user.” The specific objectives of this investigation were to:

  • (1) provide an estimate of bias for each imputation technique with known values from data engineered to be missing completely at random;

  • (2) determine whether different CES-D scores were obtained from item-mean, person-mean, regression, and hot-deck imputation techniques and whether they differed from the CES-D score obtained from complete cases; and

  • (3) determine whether the variables that predicted the CES-D scores were the same when using complete cases and each of the 4 imputation techniques.
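The first objective can be illustrated with a small simulation: start from complete data, delete known values completely at random, impute them, and compare the imputed values with the known ones. The sketch below is an assumption about how such a check could be set up, not the authors' procedure; the data are randomly generated, the 10% deletion rate is arbitrary, item-mean imputation stands in for any of the 4 techniques, and bias is taken here to be the mean of imputed minus true values.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed for reproducibility

# Hypothetical complete data: 200 respondents x 5 items scored 0-3.
complete = [[rng.randint(0, 3) for _ in range(5)] for _ in range(200)]

# Engineer MCAR missingness: every cell has the same 10% chance of being
# deleted, regardless of any observed or unobserved value.
mask = [[rng.random() < 0.10 for _ in row] for row in complete]
observed = [
    [None if hidden else v for v, hidden in zip(row, hidden_row)]
    for row, hidden_row in zip(complete, mask)
]


def item_mean_impute(rows):
    """Replace each missing value with the mean of the answered values
    for that item."""
    n_items = len(rows[0])
    means = [
        mean(row[j] for row in rows if row[j] is not None)
        for j in range(n_items)
    ]
    return [
        [means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


imputed = item_mean_impute(observed)

# Compare imputed values with the known true values in deleted cells only.
errors = [
    imputed[i][j] - complete[i][j]
    for i, hidden_row in enumerate(mask)
    for j, hidden in enumerate(hidden_row)
    if hidden
]
bias = mean(errors)
print(f"{len(errors)} cells deleted; estimated bias = {bias:.3f}")
```

Because the deletion is unrelated to the values themselves, the estimated bias for item-mean imputation should be close to zero here; swapping in another imputation function allows the same comparison for each technique.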

Section snippets

Study design

The Study of Antihypertensive Drugs and Depressive Symptoms (SADD-Sx)10, 11 is a substudy of INVEST.27, 28 Briefly, INVEST was a randomized, open-label, blinded endpoint study of 22,576 patients with hypertension and coronary artery disease (CAD) aged ≥50 years conducted from September 1997 to February 2003. Patients were randomized to antihypertensive treatment with either a verapamil SR- or atenolol-based strategy to achieve blood pressure (BP) control according to the sixth report of the

Results

A total of 1,635 respondents returned the questionnaire, representing 71% of the SADD-Sx enrollees. Eighty-three percent of the respondents returning a questionnaire completed all 20 CES-D items; questionnaires returned by 14% of the respondents had between 1 and 4 CES-D items missing; and those returned by 3% had 5 or more CES-D items missing (see Table 1).

Discussion

Demographically, respondents completing at least 16 CES-D items were similar to those completing 15 or fewer CES-D items, with the exception of age. Respondents who completed 15 or fewer CES-D items were generally older. Because there was a significant demographic difference, it is likely that the missing CES-D data are not MCAR, but are more likely to be MAR. When the demographic characteristics of respondents completing all 20 CES-D items were compared with the demographic characteristics of

Conclusions

This investigation showed that the 4 imputation techniques yielded similar CES-D mean scores that did not differ significantly from the complete case CES-D mean. It also demonstrated that the predictors of CES-D score were similar regardless of imputation technique used and, with the exception of sex, were the same as the predictors of complete case CES-D score. Unfortunately, there is no one universal solution to the problem of missing data. The results from this investigation indicate that

References (58)

  • J.L. Schafer. Multiple imputation: a primer. Stat Methods Med Res (1999)
  • L.D. Ried et al. Antihypertensive drugs and depressive symptoms (SADD-Sx) in patients treated with a calcium antagonist-based or β-blocker–based hypertension strategy in the INternational Verapamil SR-Trandolapril STudy (INVEST). Psychosom Med (2005)
  • L.D. Ried et al. Comparing depressive symptoms of patients at high risk for depression after antihypertensive treatment with verapamil-led versus atenolol-led strategies. Ann Pharmacother (2006)
  • A.B. Troxel et al. Statistical analysis of quality of life with missing data in cancer clinical trials. Stat Med (1998)
  • D.F. Heitjan. Annotation: what can be done about missing data? approaches to imputation. Am J Public Health (1997)
  • D.L. Fairclough et al. Why are missing quality of life data a problem in clinical trials of cancer therapy? Stat Med (1998)
  • J.E. Hair et al. Multivariate Data Analysis (1998)
  • D. Curran et al. Incomplete quality of life data in randomized trials: missing forms. Stat Med (1998)
  • R.J. Bowden et al. Instrumental Variables (1984)
  • J.D. Angrist et al. Identification of causal effects using instrumental variables. J Am Stat Assoc (1996)
  • T. Sturmer et al. Correcting effect estimates for unmeasured confounding in cohort studies with validation studies using propensity score calibration. Am J Epidemiol (2005)
  • B. Rosner et al. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol (1990)
  • J. Heckman. Sample selection bias as a specification error. Econometrica (1979)
  • J.C. Whitehead. Item nonresponse in contingent valuation: should CV researchers impute values for missing independent variables? J Leisure Res (1994)
  • R.G. Downey et al. Missing data in Likert ratings: a comparison of replacement methods. J Gen Psychol (1998)
  • Bono CB. Missing Data on the Center for Epidemiologic Studies Depression (CES-D) Scale: A Comparison of Four Imputation...
  • B.L. Ford. An overview of hot-deck procedures
  • D.B. Rubin. Software for multiple imputation
  • C.J. Pepine et al. A calcium antagonist vs a non-calcium antagonist hypertension treatment strategy for patients with coronary artery disease. The International Verapamil-Trandolapril Study (INVEST): a randomized controlled trial. JAMA (2003)

    Supported with resources and the use of facilities at the Malcom Randall Veterans Affairs Medical Center in Gainesville, Fla. We also acknowledge partial funding by Knoll Pharmaceuticals (now Abbott Laboratories) and the University of Florida.