The Beck Depression Inventory-II (BDI-II) is a 21-item clinician-administered and clinician-scored scale designed to measure a person's mood and symptoms related to depression. The BDI-II was designed to conform to the DSM-IV diagnostic criteria for depression and represents a substantial improvement over its predecessor, the original Beck Depression Inventory. The BDI-II has been used both as a research measure (its primary intended use) and to assist with the clinical diagnosis of depression. Numerous empirical studies have examined its internal consistency, convergent and discriminant validity, criterion validity, and construct validity. The test demonstrates acceptable psychometric qualities, but there have been some concerns with its use. This paper reviews the development of the BDI-II, its psychometric properties, uses, strengths, and weaknesses. Advantages and disadvantages of using the BDI-II and recommendations for future research regarding its use are also discussed.

The psychiatric diagnosis of major depressive disorder characteristically begins with the identification of symptoms, whose presence and/or severity over a specific span of time is then evaluated. Standardized measures that document mood, somatic, vegetative, and other depressive symptoms can be useful both in diagnosing depression and in determining the severity of a person's symptoms. The Beck Depression Inventory (BDI) is a 21-item clinician-administered and clinician-scored scale designed to measure mood and symptoms related to depression. Although it was originally developed for research purposes in 1961, it also enjoyed widespread clinical use (Arbisi, 2001). Following nearly 35 years of clinical and research use, the BDI underwent a major revision, and the revised version, the BDI-II, was developed in 1996. This paper reviews the development of the BDI-II, its psychometric properties, uses, strengths, and weaknesses.

Basic Description and Test Development

Compared to the original BDI, the BDI-II added items covering such aspects of depression as agitation, worthlessness, concentration difficulties, and loss of energy. To retain the BDI's 21-item format, items covering weight loss, body image change, somatic preoccupation, and work difficulty were dropped or revised. The revision was substantial, as all but three of the original items were changed (Arbisi, 2001). Grothe et al. (2005) report that the revision of the BDI to the BDI-II was undertaken to make the test correspond more closely to the DSM-IV diagnostic criteria for mood disorders by designing it to correspond to the items of the SCID (Structured Clinical Interview for DSM Disorders) for the DSM-III-R. Furthermore, some items on the original BDI failed to differentiate clinically across the range of depression (e.g., mild, moderate, and severe presentations), and several other items were found to display a gender bias (Arbisi, 2001). In fact, a revision of the original BDI had been developed in 1987 (the BDI-IA) that reworded 15 of the original 21 items, yet this version still did not address some of the aforementioned issues, such as the limited range of depressive symptoms covered and the failure to be consistent with DSM diagnostic standards and criteria for mood disorders (Arbisi, 2001).

The BDI-II consists of 21 items read by the subject (or, alternatively, read to the subject by the administrator). Each item is followed by four statements, and the respondent endorses the statement that best describes their feelings over the prior two weeks, including the day of the assessment. The statements are scored from zero to three, with higher scores reflecting more severe levels of depressive symptomatology. The items reflect different dimensions of depression, ranging from sadness to loss of energy to loss of interest in activities such as sex. An example item is provided below (Beck, Steer, & Brown, 1996):

1. Sadness

0. I do not feel sad.

1. I feel sad much of the time.

2. I am sad all the time.

3. I am so sad or unhappy that I can't stand it.

The time to administer the test typically ranges from five to ten minutes (Arbisi, 2001). The test is designed for individuals 13-86 years of age and can be read aloud to subjects who are illiterate. However, it has been used with younger and older subjects as well (Arbisi, 1996). Following completion of the test, the administrator sums the individual item scores and compares the total to standardized cut scores to determine the severity of depression in the individual.
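The scoring step described above can be illustrated with a minimal sketch. The function name `score_bdi2` is hypothetical, and the severity bands used as defaults are the ranges commonly cited for the BDI-II; they are not given in this paper and are treated here as an assumption that should be checked against the test manual.

```python
from typing import Sequence, Tuple

# ASSUMPTION: commonly cited BDI-II severity bands (inclusive ranges).
# They are not stated in this paper; verify against the manual before use.
ASSUMED_BANDS = (
    (0, 13, "minimal"),
    (14, 19, "mild"),
    (20, 28, "moderate"),
    (29, 63, "severe"),
)


def score_bdi2(responses: Sequence[int], bands=ASSUMED_BANDS) -> Tuple[int, str]:
    """Sum 21 item responses (each 0-3) and look up a severity label."""
    if len(responses) != 21:
        raise ValueError("expected exactly 21 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    total = sum(responses)  # possible totals run from 0 to 63
    for low, high, label in bands:
        if low <= total <= high:
            return total, label
    raise ValueError("total score is not covered by the supplied bands")


# Illustrative data only: a respondent endorsing option 1 on every item.
print(score_bdi2([1] * 21))
```

Passing the bands as a parameter keeps the arithmetic separate from the cut scores, so different published cut scores can be substituted without changing the function.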

The cut scores for the BDI-II were originally developed by classifying 127 University of Pennsylvania outpatients into four groups, (1) mildly depressed, (2) moderately depressed, (3) severely depressed, and (4) nondepressed, based on the SCID for the DSM-III. Cut scores were derived through the use of receiver operating characteristic curves (Beck et al., 1996). The manual for the BDI-II does not provide demographic information for the standardization sample, a potential shortcoming. The BDI-II has been translated into numerous languages, and computerized and Internet versions are available (Arbisi, 2001).
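To illustrate how a receiver operating characteristic analysis can yield a cut score, the sketch below computes sensitivity and specificity at each candidate cutoff and selects the one that maximizes Youden's J (sensitivity + specificity - 1). This is an assumption made for illustration: the source does not state which selection criterion the test authors used, the original analysis involved four groups rather than the two used here, and the function names and data are hypothetical.

```python
def roc_points(scores, depressed):
    """Return (cutoff, sensitivity, specificity) for each observed score.

    A person is classified as depressed when score >= cutoff.
    `depressed` holds the criterion diagnosis (True/False) for each person.
    """
    positives = [s for s, d in zip(scores, depressed) if d]
    negatives = [s for s, d in zip(scores, depressed) if not d]
    if not positives or not negatives:
        raise ValueError("need at least one depressed and one nondepressed case")
    points = []
    for cutoff in sorted(set(scores)):
        sensitivity = sum(s >= cutoff for s in positives) / len(positives)
        specificity = sum(s < cutoff for s in negatives) / len(negatives)
        points.append((cutoff, sensitivity, specificity))
    return points


def best_cutoff(scores, depressed):
    """Pick the cutoff with the largest Youden's J (an assumed criterion)."""
    return max(roc_points(scores, depressed), key=lambda p: p[1] + p[2] - 1)[0]


# Illustrative data only, not drawn from any study cited in this paper.
totals = [5, 10, 12, 14, 16, 20, 25]
diagnosis = [False, False, True, False, True, True, True]
print(best_cutoff(totals, diagnosis))
```

Other criteria, such as weighting sensitivity more heavily for screening purposes, would be implemented by changing only the `key` function.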

Psychometric Properties


The BDI-II has demonstrated sound reliability across many different empirical studies. According to the BDI-II manual, item-total correlations ranged from .38 to .74 (Beck et al., 1996). According to Arbisi (2001), the internal consistency (Cronbach's alpha) was .92 for a clinical sample of 500 outpatient therapy patients (91% Caucasian) and .93 for a sample of 120 Canadian college students (described as "predominantly" Caucasian). Test-retest reliability over a one-week period was assessed in a very small subsample of 26 of the outpatients and was shown to be high (r = .93). Over short intervals, test-retest reliabilities have been adequate to high, ranging from the .70s to .93; however, comparable stability would not be expected over longer periods because of the waxing and waning nature of depression (Beck et al., 1996).
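For readers unfamiliar with the internal consistency statistic reported throughout this section, the following minimal sketch computes Cronbach's alpha from a respondents-by-items matrix using the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the data are hypothetical, and the sketch is not the procedure used in any of the cited studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("each respondent needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Illustrative data only: four respondents answering three items.
sample = [[0, 0, 1], [1, 1, 1], [2, 1, 2], [3, 3, 2]]
print(round(cronbach_alpha(sample), 2))
```

Alpha approaches 1 when the items rise and fall together across respondents and drops toward 0 when they vary independently.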

Hollandare, Andersson, and Engstrom (2010) recruited 87 patients from primary care and psychiatric care in a Swedish public health care system. The participants completed the BDI-II and the Montgomery-Asberg Depression Rating Scale-Self-rated (MADRS-S) in both paper and Internet versions. The order of administration was randomized to control for order effects. The depressive symptom severity in the sample ranged from mild to severe. Cronbach's alpha for the BDI-II ranged from good to excellent regardless of the order of administration (α = .91 for the paper version in the paper-first group; α = .87 for the Internet version in the Internet-first group; and α = .89 both for the paper-first group on the Internet version and for the Internet-first group on the paper version). Thus, it appears that the BDI-II has stable internal consistency whether it is administered on paper or over the Internet.

Segal, Coolidge, Cahill, and O'Riley (2008) studied the psychometric properties of the BDI-II using a sample of 376 community-dwelling adults ranging in age from 17 to 90 years. For the entire sample, the BDI-II had excellent internal reliability as measured by Cronbach's alpha (α = .90). Alpha was also calculated separately for young and older adult groups. For the young adult group (17-29 years old), the internal reliability was excellent (α = .92), and for the older adult group (55-90 years old), the internal reliability was good (α = .86).

The participants in the aforementioned reliability studies were predominantly Caucasian. Other studies have investigated the reliability of the BDI-II in other ethnic groups. For example, Grothe et al. (2005) investigated the psychometric properties of the BDI-II using 200 African-American participants with a mean age of 49.26 years (range of 20 to 90 years). Internal consistency as measured by Cronbach's alpha was consistent with previous findings in Caucasian samples (BDI-II total score, α = .90). These researchers also performed a confirmatory factor analysis to test the fit of a two-factor model of the BDI-II proposed by other researchers (e.g., Beck et al., 1996). The internal consistency for both factors was also good (Cognitive factor, α = .81; Somatic factor, α = .87).

VanVoorhis and Blumentritt (2008) investigated the psychometric properties of the BDI-II in a sample of 131 Mexican-American youths recruited from three facilities in the Southern Texas border area. Ages of the participants ranged from 13 to 19 years, with a mean age of 15.5 years. As measured by coefficient alpha, the internal consistency of the BDI-II in this sample was consistent with the levels found in the aforementioned studies (α = .90).


Arbisi (2001) indicated that the BDI-II has good convergent validity, reporting strong correlations between the BDI-II and the BDI-IA (.93), the Beck Hopelessness Scale (.68), the Revised Hamilton Psychiatric Rating Scale for Depression (.71), and the Symptom Checklist-90-Revised (SCL-90-R) Depression subscale (r = .89).
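The validity coefficients reported above are correlations between BDI-II total scores and scores on a second measure completed by the same people. As a minimal sketch, assuming these are Pearson product-moment correlations (the source reports the values without naming the coefficient), such a value could be computed as follows. The function name and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / (spread_x * spread_y)


# Illustrative data only: totals on two hypothetical measures for four people.
measure_a = [1, 2, 3, 4]
measure_b = [1, 3, 2, 4]
print(round(pearson_r(measure_a, measure_b), 2))
```

The same calculation applies to test-retest reliability, where the two lists are the same people's totals at two administrations.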

With respect to discriminant validity, Arbisi (2001) reported that the correlation between the BDI-II and the Revised Hamilton Anxiety Rating Scale was .47, which was significantly higher than the…

