This paper presents a comprehensive statistical analysis of 2019 Behavioral Risk Factor Surveillance System (BRFSS) data to identify characteristics that predict self-reported poor physical health. Using SPSS, the analysis applies independent samples t-tests, chi-square tests of independence, Pearson correlations, binary logistic regression, and multiple linear regression. Key variables examined include adverse childhood experiences (ACES), veteran status, sex, income, employment, education, smoking, exercise, alcohol use, BMI, and age. Results indicate that ACES score, employment status, income, exercise, age, and BMI are among the strongest predictors of both the likelihood and frequency of poor physical health days, while veteran status shows no significant association with either outcome.
The two graphical display options selected to describe the variable PHYSHLTH_DAYS are the histogram with normality curve and the dot plot.
Figure 1.1 shows a histogram with a normal distribution curve. The histogram was selected because it provides a view of the central tendency, spread, and shape of the dataset, including the presence of outliers. By showing the shape of the dataset, the histogram provides an at-a-glance view of whether the dataset presents a normal distribution. The dataset presents a normal distribution, as evidenced by the single-peaked, bell-shaped normality curve, with observations spread out symmetrically around the mean. No outliers are evident from the distribution.
Figure 1.1
Figure 1.2 presents a dot plot. Like the histogram, the dot plot shows the frequency distribution of the data points in the dataset. However, the dot plot shows the frequency of each individual value rather than of ranges of values, as the histogram does. The dots appear as solid bars because of the large number of observations at each value, with longer bars representing higher frequencies. Because it focuses on individual data points, the dot plot provides a more effective way of assessing whether outliers exist than the histogram. Outliers are data points that are either extremely high or extremely low compared to the rest of the data. The dot plot shows that there are no outliers in the dataset.
Figure 1.2
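The difference between the two displays can be illustrated with a short sketch: a dot plot tallies each distinct value, whereas a histogram tallies ranges of values. The values below are invented for illustration and are not BRFSS data, and the bin width of 5 is an arbitrary assumption.

```python
from collections import Counter

# Hypothetical values for illustration only (not BRFSS data).
days = [0, 0, 0, 1, 2, 2, 5, 7, 7, 30]

# Dot plot view: one count per distinct value.
per_value = Counter(days)

# Histogram view: one count per bin. Each value is mapped to the
# lower edge of its bin (the bin width of 5 is an arbitrary choice).
BIN_WIDTH = 5
per_bin = Counter((d // BIN_WIDTH) * BIN_WIDTH for d in days)

for value in sorted(per_value):
    print(f"value {value:>2}: {'*' * per_value[value]}")
for edge in sorted(per_bin):
    print(f"bin {edge:>2}-{edge + BIN_WIDTH - 1:>2}: {'*' * per_bin[edge]}")
```

In the per-value tally the single observation at 30 stands apart from the rest, which is the kind of isolated point a dot plot makes easy to spot.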
To test whether there is a difference in ACES score between the two groups (YES and NO for poor physical health), the independent samples t-test was used. The independent samples t-test answers this question by comparing the means of the two independent groups with respect to ACES score, in order to determine whether the mean ACES score for the group that reports YES (poor physical health) differs significantly from the group that reports NO (good physical health). The independent samples t-test is appropriate because the data meet the following requirements: (i) the dependent variable ACES score is a continuous ratio variable; (ii) the independent variable PHYSHLTH_YES_NO is a categorical variable with only two categories (Yes and No); and (iii) the groups are independent, meaning a participant cannot belong to both groups simultaneously.
The null and alternative hypotheses for the independent samples t-test are:
H₀: ACES Score_YES − ACES Score_NO = 0 (the difference of the means is equal to 0)
H₁: ACES Score_YES − ACES Score_NO ≠ 0 (the difference of the means is not equal to 0)
Before running the t-test, a comparison box plot was produced to obtain a preliminary sense of the expected results. If the two groups had similar centers and spreads with respect to ACES score, the boxes would sit at similar heights and be of similar length.
Figure 2.1
From the box plots in Figure 2.1, it is evident that the variances for the two categories differ considerably: the spread of observations for the YES category is greater than that of the NO category. This suggests that the two groups differ in ACES score. The independent samples t-test was then run to determine whether this difference is statistically significant.
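The length of each box corresponds to the interquartile range (IQR), so the visual comparison can also be made numerically. The following is a minimal sketch using invented scores, not BRFSS data. Quartile conventions differ between software packages, so the values SPSS reports may differ slightly from those of Python's `statistics.quantiles`.

```python
import statistics

def iqr(values):
    """Interquartile range: the length of the box in a box plot."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical scores for illustration only (not BRFSS data).
scores_no = [0, 1, 1, 2, 1]
scores_yes = [1, 3, 2, 4, 0]

iqr_no = iqr(scores_no)
iqr_yes = iqr(scores_yes)
print(f"IQR (NO) = {iqr_no}, IQR (YES) = {iqr_yes}")
```

A larger IQR for one group corresponds to a longer box, which is the pattern described for the YES category.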
Table 2.1 – Group Statistics
From the group statistics in Table 2.1, 58,968 participants reported good health, while 37,273 reported poor physical health. The mean ACES score for the YES (poor physical health) group is 2.10, while the mean for the NO group is 1.46.
Table 2.2 – Independent Samples Test
Table 2.2 presents the results of the t-test. Levene's test for equality of variances is significant (p < 0.001). We therefore reject the null hypothesis of Levene's test and conclude that the variance in ACES score for the group reporting poor physical health (YES) is significantly different from that of the group reporting good physical health (NO). This result means we must consult the Equal Variances Not Assumed row when interpreting the t-test results. The negative t-value of −43.7 indicates that the mean ACES score for the NO group (good physical health) is lower than that of the YES group (poor physical health). The associated p-value (p < 0.001) is less than the significance level of 0.05, indicating that the difference in ACES scores between the two groups is statistically significant. We therefore reject the null hypothesis and conclude that ACES scores differ significantly between people who report poor physical health and those who report good physical health.
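The Equal Variances Not Assumed row corresponds to the unequal-variances (Welch) form of the t-test. The sketch below shows how the t statistic and its approximate degrees of freedom are computed. The function name `welch_t` and the sample values are illustrative assumptions, not SPSS output or BRFSS data. The p-value would additionally require the t distribution, which the Python standard library does not provide.

```python
import math
import statistics

def welch_t(group_1, group_2):
    """Return (t, df) for the unequal-variances (Welch) t-test.

    t is based on mean(group_1) - mean(group_2), so a negative t
    means group_1 has the lower mean.
    """
    n1, n2 = len(group_1), len(group_2)
    # statistics.variance is the sample variance (divides by n - 1).
    v1 = statistics.variance(group_1) / n1  # squared standard error, group 1
    v2 = statistics.variance(group_2) / n2  # squared standard error, group 2
    t = (statistics.mean(group_1) - statistics.mean(group_2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical scores for illustration only (not BRFSS data).
aces_no = [0, 1, 1, 2, 1]
aces_yes = [1, 3, 2, 4, 0]

t_stat, dof = welch_t(aces_no, aces_yes)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

With the NO group passed first, a negative t indicates that the NO group has the lower mean, matching the reading of the sign given above.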
To test whether there is a difference in alcohol use over the last 30 days between the two groups, the independent samples t-test remains appropriate. It will determine whether alcohol consumption for the YES group (poor physical health) differs significantly from that of the NO group (good physical health). The same three conditions for appropriate use of the independent samples t-test are satisfied: (i) the dependent variable ALCOHOL is a continuous ratio variable; (ii) the independent variable PHYSHLTH_YES_NO is categorical with two categories; and (iii) the groups are independent.
The null and alternative hypotheses are:
H₀: ALCOHOL_YES − ALCOHOL_NO = 0 (the difference of the means is equal to 0)
H₁: ALCOHOL_YES − ALCOHOL_NO ≠ 0 (the difference of the means is not equal to 0)
Figure 2.2
From the box plots in Figure 2.2, the spread of observations for the two categories is nearly equal, as indicated by the approximately equal lengths of the box plots. This suggests that people in the two categories may not differ significantly in alcohol use. The independent samples t-test was run to verify this observation.
Table 2.3 – Group Statistics
From the group statistics in Table 2.3, 232,486 participants reported good health, while 145,011 reported poor physical health. The mean number of alcoholic drinks for the YES (poor physical health) group is 1.53, while the mean for the NO group is 1.75.
Table 2.4 – Independent Samples Test
Table 2.4 presents the results of the t-test. Levene's test for equality of variances is significant (p < 0.001); we therefore reject the null hypothesis of Levene's test and conclude that the variance in alcohol consumption differs significantly between the two groups. We again consult the Equal Variances Not Assumed row. The positive t-value of 21.34 indicates that the mean number of alcoholic drinks for the NO group (good physical health) is higher than that of the YES group (poor physical health). The associated p-value (p < 0.001) is less than the significance level of 0.05. We therefore reject the null hypothesis and conclude that alcohol consumption differs significantly between people who report poor physical health and those who report good physical health.
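Levene's test, which determines which row of the SPSS output to read, can be sketched as follows. The sketch assumes the mean-centred variant of the statistic, in which each observation is replaced by its absolute deviation from its group mean. The function name `levene_w` and the sample values are illustrative and are not BRFSS data.

```python
import statistics

def levene_w(*groups):
    """Mean-centred Levene statistic W for two or more groups."""
    # Absolute deviation of every observation from its own group mean.
    deviations = []
    for g in groups:
        centre = statistics.mean(g)
        deviations.append([abs(x - centre) for x in g])

    k = len(groups)
    n_total = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    group_means = [statistics.mean(d) for d in deviations]

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((z - m) ** 2
                 for d, m in zip(deviations, group_means) for z in d)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical drink counts for illustration only (not BRFSS data).
drinks_no = [0, 1, 1, 2, 1]
drinks_yes = [1, 3, 2, 4, 0]

w = levene_w(drinks_no, drinks_yes)
print(f"Levene W = {w:.2f}")
```

W is compared against an F distribution with k − 1 and N − k degrees of freedom. A large W (small p-value) indicates unequal variances, which is why the Equal Variances Not Assumed row is used.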
To test whether reporting poor physical health is associated with veteran status, the appropriate statistical test is the chi-square test of independence, which tests for the presence of an association between two categorical variables. It is the most appropriate test here for two reasons. First, both variables, PHYSHLTH_YES_NO and VETERAN, are categorical. Second, each variable has two categories (Yes and No), which satisfies the test's requirement of at least two categories per variable. The null and alternative hypotheses are:
H₀: PHYSHLTH_YES_NO is not associated with VETERAN
H₁: PHYSHLTH_YES_NO is associated with VETERAN
Table 3.1 – Case Processing Summary
The case processing summary in Table 3.1 shows that 405,594 cases (97% of the sample) were used for the analysis, while 12,674 cases with missing values were excluded.
Table 3.2 – Crosstabulation
Table 3.3 – Chi-Square Tests
The most important result is the Pearson chi-square statistic in Table 3.3, which equals 0.128 with a corresponding p-value of 0.72 (χ² = 0.128, p = 0.72). Because the p-value is greater than the selected significance level of 0.05, we fail to reject the null hypothesis and conclude that there is no statistically significant association between PHYSHLTH_YES_NO and VETERAN.
To determine whether there is a difference in the rate of reporting poor physical health between veterans and non-veterans, we use the cross-tabulation in Table 3.2 to compare proportions. From the cross-tabulation, 38.0% of veterans (19,710 / 51,814) report experiencing poor physical health in the last month. Similarly, 38.1% of non-veterans (134,867 / 353,780) report poor physical health. The rate of reporting poor physical health among veterans is therefore essentially identical to that of non-veterans, which is consistent with the chi-square result: in this sample, reported physical health status does not differ by veteran status.
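The chi-square statistic and the two proportions can be recomputed from the counts quoted above. The following is a minimal sketch, assuming the Pearson statistic without continuity correction. The "not poor" cells are derived by subtracting the poor-health counts from the group totals, and the function name `pearson_chi_square` is illustrative.

```python
import math

# Counts quoted in the text from the cross-tabulation.
vet_poor, vet_total = 19_710, 51_814
non_poor, non_total = 134_867, 353_780

observed = [
    [vet_poor, vet_total - vet_poor],  # veterans: poor, not poor
    [non_poor, non_total - non_poor],  # non-veterans: poor, not poor
]

def pearson_chi_square(table):
    """Pearson chi-square (no continuity correction) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2

chi2 = pearson_chi_square(observed)
# With 1 degree of freedom, the upper-tail p-value has a closed form.
p_value = math.erfc(math.sqrt(chi2 / 2))

vet_rate = vet_poor / vet_total
non_rate = non_poor / non_total
print(f"chi2 = {chi2:.3f}, p = {p_value:.2f}")
print(f"veterans: {vet_rate:.1%}, non-veterans: {non_rate:.1%}")
```

Run on these counts, the sketch should give values close to the reported χ² = 0.128 and p = 0.72, with the two rates differing by about a tenth of a percentage point.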