BRFSS Data Analysis: Predictors of Poor Physical Health

Abstract

This paper presents a comprehensive statistical analysis of 2019 Behavioral Risk Factor Surveillance System (BRFSS) data to identify characteristics that predict self-reported poor physical health. Using SPSS, the analysis applies independent samples t-tests, chi-square tests of independence, Pearson correlations, binary logistic regression, and multiple linear regression. Key variables examined include adverse childhood experiences (ACES), veteran status, sex, income, employment, education, smoking, exercise, alcohol use, BMI, and age. Results indicate that ACES score, employment status, income, exercise, age, and BMI are among the strongest predictors of both the likelihood and frequency of poor physical health days, while veteran status shows no significant association with either outcome.

Key Takeaways

Graphical Description of Physical Health Days: Histogram and dot plot describe PHYSHLTH_DAYS distribution
Group Differences in ACES Score and Alcohol Use: T-tests compare ACES and alcohol by health status
Association Between Physical Health Status and Veteran Status: Chi-square tests association between health and veteran status
Correlation Between Physical Health Days, ACES Score, and Veteran Status: Pearson correlations between health days and ACES/veteran
Logistic Regression: Predictors of Reporting Poor Physical Health: Binary logistic regression identifies predictors of poor health
Linear Regression: Predictors of Number of Poor Physical Health Days: Linear regression predicts number of poor health days

What makes this paper effective

Systematically matches each research question to the appropriate statistical test, with explicit justification based on variable types and independence of groups.
Follows a consistent formal hypothesis-testing framework across all analyses — stating null and alternative hypotheses, reporting test statistics, and drawing conclusions tied to the significance level.
Interprets regression outputs at the individual predictor level, specifying direction, magnitude, and practical meaning of each coefficient or odds ratio.

Key academic technique demonstrated

The paper demonstrates multivariable regression interpretation — both logistic and linear — by translating raw SPSS output (B coefficients, Exp(B) odds ratios, Wald statistics, and p-values) into plain-language findings for each independent variable. This technique shows the ability to move from statistical output to substantive conclusion, a core competency in quantitative health research.
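As a minimal illustration of the Exp(B) column mentioned above, the sketch below converts a logistic regression B coefficient into an odds ratio and a percent change in odds. The coefficient values and function names are hypothetical placeholders chosen for illustration; they are not results from this paper.

```python
import math

def odds_ratio(b: float) -> float:
    """Exp(B): multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(b)

def percent_change_in_odds(b: float) -> float:
    """Express Exp(B) as a percent increase (positive) or decrease (negative) in the odds."""
    return (odds_ratio(b) - 1.0) * 100.0

# Hypothetical B coefficients, NOT taken from the paper's SPSS output.
hypothetical_b = {"predictor_a": 0.25, "predictor_b": -0.40, "predictor_c": 0.0}

for name, b in hypothetical_b.items():
    print(
        f"{name}: B = {b:+.2f}, Exp(B) = {odds_ratio(b):.3f}, "
        f"change in odds = {percent_change_in_odds(b):+.1f}%"
    )
```

A positive B gives Exp(B) above 1 (higher odds), a negative B gives Exp(B) below 1 (lower odds), and B = 0 gives Exp(B) = 1 (no change in odds).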

Structure breakdown

The paper is organized around six sequential analysis questions, each building in complexity: it opens with univariate visualization, moves to bivariate group comparisons (t-tests and chi-square), then to correlation analysis, and concludes with two multivariable regression models. This progression from descriptive to inferential to predictive analysis mirrors the structure of a real applied research workflow using BRFSS survey data.

Graphical Description of Physical Health Days

The two graphical display options selected to describe the variable PHYSHLTH_DAYS are the histogram with normality curve and the dot plot.

Figure 1.1 shows a histogram with a normal distribution curve. The histogram was selected because it provides a view of the central tendency, spread, and shape of the dataset, including the presence of outliers. By showing the shape of the dataset, the histogram provides an at-a-glance view of whether the dataset presents a normal distribution. The dataset presents a normal distribution, as evidenced by the single-peaked, bell-shaped normality curve, with observations spread out symmetrically around the mean. No outliers are evident from the distribution.

Figure 1.1

Figure 1.2 presents a dot plot. Like the histogram, the dot plot shows the frequency distribution of the data. However, the dot plot shows the frequency of each individual value, whereas the histogram shows the frequency of ranges of values. Because so many observations share each value, the dots merge into solid bars, with longer bars representing higher frequencies. Because it displays individual values, the dot plot is more effective than the histogram for assessing whether outliers exist. Outliers are data points that are extremely high or extremely low compared to the rest of the data. The dot plot shows no outliers in the dataset.

Figure 1.2
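The difference between the two displays described above can be sketched in a few lines: a dot plot counts each distinct value, while a histogram counts values falling into ranges. The data below are synthetic day counts invented for illustration, not BRFSS records, and the function names and bin width are assumptions.

```python
from collections import Counter

# Synthetic values on a 0-30 day scale; NOT BRFSS data.
days = [0, 0, 0, 1, 2, 2, 3, 5, 5, 7, 10, 14, 15, 30, 30]

def value_counts(values):
    """Dot-plot view: one frequency per distinct value."""
    return dict(sorted(Counter(values).items()))

def binned_counts(values, bin_width):
    """Histogram view: one frequency per range [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

dot_view = value_counts(days)
hist_view = binned_counts(days, bin_width=5)

print("Per-value frequencies (dot plot):")
for value, count in dot_view.items():
    print(f"{value:>3} | {'*' * count}")

print("Binned frequencies (histogram, width 5):")
for start, count in hist_view.items():
    print(f"{start:>3}-{start + 4:<3} | {'*' * count}")
```

In the per-value view, an isolated value far from the rest stands out on its own row, which is why the dot plot is the more direct tool for spotting outliers.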

Group Differences in ACES Score and Alcohol Use

To test whether there is a difference in ACES score between the two groups (YES and NO for poor physical health), the independent samples t-test was used. The independent samples t-test answers this question by comparing the means of the two independent groups with respect to ACES score, in order to determine whether the mean ACES score for the group that reports YES (poor physical health) differs significantly from the group that reports NO (good physical health). The independent samples t-test is appropriate because the data meet the following requirements: (i) the dependent variable ACES score is a continuous ratio variable; (ii) the independent variable PHYSHLTH_YES_NO is a categorical variable with only two categories (Yes and No); and (iii) the groups are independent, meaning a participant cannot belong to both groups simultaneously.

The null and alternative hypotheses for the independent samples t-test are:

H₀: ACES Score_YES – ACES Score_NO = 0 (the difference of the means is equal to 0)

H₁: ACES Score_YES – ACES Score_NO ≠ 0 (the difference of the means is not equal to 0)

Before running the t-test, a comparison box plot was produced to obtain a preliminary sense of the expected results. If the two groups had similar centers and similar variances with respect to ACES score, the boxes would sit at similar positions and be of similar length.

Figure 2.1

From the box plots in Figure 2.1, it is evident that the variances for the two categories differ considerably: the spread of observations for the YES category is greater than that of the NO category. This suggests that the two groups differ in ACES score. The independent samples t-test was then run to determine whether this difference is statistically significant.

Table 2.1 — Group Statistics

From the group statistics in Table 2.1, 58,968 participants reported good health, while 37,273 reported poor physical health. The mean ACES score for the YES (poor physical health) group is 2.10, while the mean for the NO group is 1.46.

Table 2.2 — Independent Samples Test

Table 2.2 presents the results of the t-test. Levene's test for equality of variances is significant (p < 0.001). We therefore reject the null hypothesis of equal variances and conclude that the variance in ACES score differs significantly between the group reporting poor physical health (YES) and the group reporting good physical health (NO). Accordingly, the t-test results are read from the Equal Variances Not Assumed row. The t-value of −43.7 is negative, which, together with the group means in Table 2.1, indicates that the difference was computed as NO minus YES: the mean ACES score for the NO group (good physical health) is lower than that of the YES group (poor physical health). The associated p-value (p < 0.001) is less than the significance level of 0.05, so the difference in ACES scores between the two groups is statistically significant. We therefore reject the null hypothesis and conclude that ACES scores differ significantly between people who report poor physical health and those who report good physical health.
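The Equal Variances Not Assumed row corresponds to what is commonly called Welch's t-test. The sketch below computes the Welch t statistic and its approximate degrees of freedom using only the Python standard library. The two small samples are synthetic values invented for illustration (the paper reports group means and sizes but not the raw scores), and the function name is an assumption. A p-value would additionally require the t distribution, which the standard library does not provide, so it is omitted here.

```python
import math
from statistics import mean, variance

def welch_t(group1, group2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    The difference is taken as mean(group1) - mean(group2), so the sign
    of t depends on the order in which the groups are passed.
    """
    n1, n2 = len(group1), len(group2)
    # Squared standard error contributed by each group (sample variance / n).
    se1_sq = variance(group1) / n1
    se2_sq = variance(group2) / n2
    t = (mean(group1) - mean(group2)) / math.sqrt(se1_sq + se2_sq)
    df = (se1_sq + se2_sq) ** 2 / (se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1))
    return t, df

# Synthetic scores, NOT BRFSS data: the NO group is passed first.
scores_no = [0, 1, 1, 2, 1, 2]
scores_yes = [1, 3, 2, 4, 5, 3]

t_stat, df = welch_t(scores_no, scores_yes)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Because the NO group is passed first and has the lower mean, t comes out negative, mirroring the sign convention of the result reported above.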

To test whether there is a difference in alcohol use over the last 30 days between the two groups, the independent samples t-test remains appropriate. It will determine whether alcohol consumption for the YES group (poor physical health) differs significantly from that of the NO group (good physical health). The same three conditions for appropriate use of the independent samples t-test are satisfied: (i) the dependent variable ALCOHOL is a continuous ratio variable; (ii) the independent variable PHYSHLTH_YES_NO is categorical with two categories; and (iii) the groups are independent.

The null and alternative hypotheses are:

H₀: ALCOHOL_YES – ALCOHOL_NO = 0 (the difference of the means is equal to 0)

H₁: ALCOHOL_YES – ALCOHOL_NO ≠ 0 (the difference of the means is not equal to 0)

Figure 2.2

From the box plots in Figure 2.2, the spread of observations for the two categories is nearly equal, as indicated by the approximately equal lengths of the box plots. This suggests that people in the two categories may not differ significantly in alcohol use. The independent samples t-test was run to verify this observation.

Table 2.3 — Group Statistics

From the group statistics in Table 2.3, 232,486 participants reported good health, while 145,011 reported poor physical health. The mean number of alcoholic drinks for the YES (poor physical health) group is 1.53, while the mean for the NO group is 1.75.

Table 2.4 — Independent Samples Test

Table 2.4 presents the results of the t-test. Levene's test for equality of variances is significant (p < 0.001); we therefore reject the null hypothesis of equal variances and conclude that the variance in alcohol consumption differs significantly between the two groups. We again consult the Equal Variances Not Assumed row. The positive t-value of 21.34 indicates that the mean number of alcoholic drinks for the NO group (good physical health) is higher than that of the YES group (poor physical health). The associated p-value (p < 0.001) is less than the significance level of 0.05. We therefore reject the null hypothesis and conclude that alcohol consumption differs significantly between people who report poor physical health and those who report good physical health.
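Levene's test, used in both t-test analyses above to choose between the equal-variance and unequal-variance rows, can be sketched as a one-way ANOVA on absolute deviations from each group's mean. The version below is a minimal standard-library sketch of the mean-centered form; the sample values are synthetic and the function name is an assumption. It returns only the test statistic, since the p-value requires the F distribution, which the standard library does not provide.

```python
from statistics import mean

def levene_statistic(*groups):
    """Mean-centered Levene statistic W for two or more groups.

    W is the one-way ANOVA F ratio computed on the absolute deviations
    of each observation from its own group mean.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviations from each group's mean.
    deviations = []
    for g in groups:
        centre = mean(g)
        deviations.append([abs(x - centre) for x in g])

    group_means = [mean(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total

    between = sum(len(z) * (m - grand_mean) ** 2 for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2 for z, m in zip(deviations, group_means) for v in z)
    return (n_total - k) / (k - 1) * between / within

# Synthetic values, NOT BRFSS data.
group_no = [0, 1, 1, 2, 1, 2]
group_yes = [1, 3, 2, 4, 5, 3]

w = levene_statistic(group_no, group_yes)
print(f"Levene W = {w:.3f}")
```

A larger W indicates a larger gap in spread between the groups. When the associated p-value falls below the significance level, the Equal Variances Not Assumed row is the one to read.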

Association Between Physical Health Status and Veteran Status

The appropriate statistical test for this question is the chi-square test of independence, which tests for an association between two categorical variables. It is appropriate here for two reasons. First, both variables, PHYSHLTH_YES_NO and VETERAN, are categorical. Second, each variable has at least two categories (here, Yes and No), so the observations can be arranged in a 2 × 2 contingency table. The null and alternative hypotheses are:

H₀: PHYSHLTH_YES_NO is not associated with VETERAN

H₁: PHYSHLTH_YES_NO is associated with VETERAN

Table 3.1 — Case Processing Summary

The case processing summary in Table 3.1 shows that 405,594 cases (97% of the sample) were used for the analysis, while 12,674 cases with missing values were excluded.

Table 3.2 — Crosstabulation

Table 3.3 — Chi-Square Tests

The key result is the Pearson chi-square statistic in Table 3.3: χ² = 0.128, p = 0.72. Because the p-value is greater than the selected significance level of 0.05, we fail to reject the null hypothesis and conclude that there is no statistically significant association between PHYSHLTH_YES_NO and VETERAN.

To determine whether the rate of reporting poor physical health differs between veterans and non-veterans, we use the cross-tabulation in Table 3.2 to compare proportions. Among veterans, 38.0% (19,710 / 51,814) report experiencing poor physical health in the last month; among non-veterans, the figure is 38.1% (134,867 / 353,780). The two rates are nearly identical, which is consistent with the chi-square result: in this sample, the rate of reporting poor physical health does not differ meaningfully by veteran status.
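The chi-square statistic and the two proportions can be recomputed from the counts quoted in this section. The sketch below rebuilds the 2 × 2 table from those counts and applies the Pearson chi-square formula without a continuity correction, using only the Python standard library. The function names are assumptions made for illustration, and the p-value shortcut is valid only for one degree of freedom, as in a 2 × 2 table.

```python
import math

# Counts quoted in the text for the Table 3.2 cross-tabulation.
veteran_poor, veteran_total = 19_710, 51_814
nonvet_poor, nonvet_total = 134_867, 353_780

# Rows: veteran, non-veteran. Columns: poor health (YES), good health (NO).
observed = [
    [veteran_poor, veteran_total - veteran_poor],
    [nonvet_poor, nonvet_total - nonvet_poor],
]

def pearson_chi_square(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2

def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

chi2 = pearson_chi_square(observed)
p = p_value_df1(chi2)
veteran_rate = veteran_poor / veteran_total
nonvet_rate = nonvet_poor / nonvet_total

print(f"chi-square = {chi2:.3f}, p = {p:.2f}")
print(f"veterans: {veteran_rate:.1%}, non-veterans: {nonvet_rate:.1%}")
```

The recomputed statistic and p-value come out close to the values reported from Table 3.3. The two row percentages differ by about a tenth of a percentage point.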

Correlation Between Physical Health Days, ACES Score, and Veteran Status

The Pearson correlation test was used to test for correlation between PHYSHLTH_DAYS and ACES_Score. Both PHYSHLTH_DAYS and ACES_Score are continuous variables, making the Pearson correlation…

Logistic Regression: Predictors of Reporting Poor Physical Health

Binary logistic regression is the most appropriate test for this analysis. Logistic regression is used when the dependent variable — in this…

Linear Regression: Predictors of Number of Poor Physical Health Days

Linear regression is the most appropriate test for this analysis. Linear regression is used when the dependent variable — in this…

Key Concepts in This Paper

BRFSS Survey, ACES Score, Veteran Status, Logistic Regression, Linear Regression, Physical Health Days, Chi-Square Test, Pearson Correlation, Independent T-Test, Odds Ratio