This paper presents a series of worked bivariate analysis problems demonstrating the application of chi-square tests and independent-samples t-tests. Problems address selecting appropriate tests for different data types, computing chi-square statistics from observed and expected frequency tables, collapsing response categories to meet chi-square assumptions, and interpreting results relative to critical values. T-test calculations are demonstrated for comparing loan repayment rates across institution types and manager performance ratings across geographic regions. Throughout, the paper emphasizes null hypothesis logic, degrees of freedom, significance thresholds, and the conditions under which standard tests may be inappropriate.
The following test types are appropriate for each situation described:
a) Chi-square, with the null hypothesis that all political groups contribute equal amounts.
b) Chi-square, with a null hypothesis appropriate to the attitude question being asked.
c) T-test.
d) Chi-square, with the null hypothesis of equal average salaries between regions.
The choice between a chi-square test and a t-test depends primarily on the level of measurement involved: chi-square tests are used for categorical (nominal or ordinal) data, while t-tests are used when comparing means of continuous variables.
The observed frequencies for this question were as follows:
One plausible hypothesis is that managers and blue-collar workers do not hold very different opinions about workplace regulation. Some regulations may make the workplace safer for managers as well as blue-collar workers. Additionally, many blue-collar workers are politically conservative, which would incline them against over-regulation of the workplace. Given these considerations, the expected values for the chi-square test are:
The cell-level calculations are:
χ² = 2.46. This does not exceed the critical value of 3.84 for a chi-square test with α = 0.05 and df = 1. Therefore, we cannot reject the null hypothesis; these data are not significantly different from the expected values.
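The cell-level arithmetic behind a chi-square statistic can be sketched in a few lines of Python. The function name and the counts below are hypothetical and chosen only for illustration; they are not the data from this problem.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all cells of a table."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for illustration only (not the paper's data).
observed = [30, 20]
expected = [25, 25]

stat = chi_square_statistic(observed, expected)

# 3.84 is the critical value for df = 1 at alpha = 0.05, as used in the text.
reject_null = stat > 3.84
print(f"chi-square = {stat:.2f}, reject null: {reject_null}")
```

With these illustrative counts each cell contributes 1.0, so the statistic stays below 3.84 and the null hypothesis is retained, mirroring the logic of the result above.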
The observed frequencies for home ownership by gender were:
Here, the null hypothesis is that home ownership has equalized across genders. The expected value table is:
χ² = 0.92. We again fail to reject the null hypothesis.
The observed shopper age distributions across two stores were:
The null hypothesis is that both stores draw proportionally from the same age groups, although Store B draws more customers overall. The expected value table is:
χ² = 11.39. This value exceeds the critical χ² value of 5.99 for df = 2 at α = 0.05. Therefore, we reject the null hypothesis and conclude that the observed age distribution differs significantly from the expected one. Without post-hoc pairwise tests it is impossible to determine exactly which group drives the difference; however, we can reasonably hypothesize that the proportion of 55+ shoppers in Store A differs from what would be expected by chance.
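When the null hypothesis is that two groups draw proportionally from the same categories even though one group is larger overall, the expected counts follow from the row and column totals. The sketch below assumes that construction; the table values and function names are hypothetical and are not the store data from this problem.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a two-way contingency table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts: rows are two stores, columns are three age groups.
table = [[20, 30, 50],
         [60, 70, 70]]

expected = expected_counts(table)
df = degrees_of_freedom(table)

# Sum the cell-level contributions (O - E)^2 / E across the whole table.
stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
print(f"df = {df}, chi-square = {stat:.2f}")
```

Because the expected counts are scaled by each row total, the larger store is not penalized simply for having more shoppers; only differences in proportions contribute to the statistic.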
When a contingency table contains cells with very small expected frequencies (conventionally, below 5), the chi-square test's assumptions are violated. In such cases, adjacent response categories should be collapsed before the test is performed. After collapsing, the ownership-by-education table becomes:
χ² = 6.49. This does not exceed the critical χ² value of 7.81 for df = 3 at α = 0.05, so we cannot conclude that there is any significant difference between the observed counts of home ownership by educational level and those expected by chance.
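The collapsing step described for the ownership-by-education table can be sketched as follows. The six-category table, the grouping, and the helper names are hypothetical; the threshold of 5 for expected counts is the usual rule of thumb and is an assumption here, not a figure taken from the problem.

```python
def collapse_columns(table, groups):
    """Merge columns of a contingency table; groups lists the column indices to sum."""
    return [[sum(row[i] for i in group) for group in groups] for row in table]


def min_expected(table):
    """Smallest expected count under independence (row total * column total / grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return min(r * c / grand_total for r in row_totals for c in col_totals)


# Hypothetical counts: rows are owners / non-owners, columns are six education levels.
table = [[2, 10, 25, 30, 8, 3],
         [4, 15, 20, 12, 5, 1]]

# Merge the two lowest and the two highest levels, leaving four categories (df = 3).
collapsed = collapse_columns(table, [[0, 1], [2], [3], [4, 5]])

print("min expected before:", round(min_expected(table), 2))
print("min expected after: ", round(min_expected(collapsed), 2))
```

Only adjacent categories are merged so that the collapsed table keeps the ordering of the original response scale.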
For the sample composition question (Question 4), a goodness-of-fit chi-square test is most appropriate for determining whether the sample differs significantly from the expected population distribution. The data yield χ² = 2.51, which is below the critical value for α = 0.05. We therefore fail to reject the null hypothesis: the sample is not significantly different from the general population.
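The reject / fail-to-reject decisions reported for the earlier problems can be checked against a small lookup of critical values. This is a minimal sketch: the dictionary holds the standard α = 0.05 critical values for df = 1 to 3, and the labels and function name are introduced here only for illustration.

```python
# Critical chi-square values at alpha = 0.05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.81}


def rejects_null(chi_square, df):
    """True when the statistic exceeds the alpha = 0.05 critical value for df."""
    return chi_square > CRITICAL_05[df]


# Statistics and degrees of freedom as reported in the text.
decisions = {
    "workplace regulation": rejects_null(2.46, 1),
    "shopper age by store": rejects_null(11.39, 2),
    "ownership by education": rejects_null(6.49, 3),
}

for label, rejected in decisions.items():
    print(f"{label}: {'reject' if rejected else 'fail to reject'} the null hypothesis")
```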
For the commuting pattern data (Question 5), the analysis shows no statistically significant gender-based difference in the way people commute to work at α = 0.05. With χ² = 7.715 and df = 3, the result falls just short of the critical χ² value of 7.81 (0.05 < p < 0.10), so the finding is borderline rather than clearly null.
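Because the commuting statistic sits so close to the cutoff, it can help to compute the p-value directly. The sketch below uses the closed-form upper-tail probability of the chi-square distribution that holds for df = 3 specifically; the function name is hypothetical, and a statistics library would normally be used for other degrees of freedom.

```python
import math


def chi2_upper_tail_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with df = 3.

    Closed form valid only for df = 3:
    erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_upper_tail_df3(7.715)        # statistic reported in the text
p_at_critical = chi2_upper_tail_df3(7.81)   # critical value quoted in the text

print(f"p-value for 7.715 with df = 3: {p_value:.4f}")
print(f"tail probability at 7.81:      {p_at_critical:.4f}")
```

The computed p-value lies between 0.05 and 0.10, which is consistent with a statistic just below the α = 0.05 critical value.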
To test whether loan repayment rates differ between Savings & Loan institutions and other types of lending institutions, a t-test of means is appropriate. The critical t-value for α = 0.05 is (conservatively) 1.98.
The standard error of the difference between means is calculated as:
SE = √((var₁/n₁) + (var₂/n₂)) = √((0.5²/100) + (0.6²/64)) = 0.09
The t-statistic is then:
t = (M₁ − M₂) / SE = 1 / 0.09 ≈ 11.11
This far exceeds the critical t-value, so we reject the null hypothesis of equal means. According to this analysis, there is a significant difference between loan repayment rates at Savings & Loan institutions and those at other institutions, based on this sample.
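The standard error and t-statistic above can be reproduced with a short script. This is a sketch using the standard deviations (0.5 and 0.6), sample sizes (100 and 64), and mean difference (1) that appear in the calculation; the function names are introduced here for illustration.

```python
import math


def standard_error(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def t_statistic(mean_difference, se):
    """t = (M1 - M2) / SE."""
    return mean_difference / se


# Values taken from the worked calculation in the text.
se = standard_error(0.5, 100, 0.6, 64)
t = t_statistic(1.0, se)

CRITICAL_T = 1.98  # conservative two-tailed critical value at alpha = 0.05, from the text
significant = abs(t) > CRITICAL_T

print(f"SE = {se:.4f}, t = {t:.2f}, significant: {significant}")
```

Carrying the unrounded standard error through gives a t-statistic of roughly 11.09 rather than 11.11; the small gap comes from rounding SE to 0.09 before dividing and does not affect the conclusion.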
"Regional manager performance comparison by t-test"
"One-tailed tests, borderline significance, assumption violations"