
Using Numbers for Investigate Distributions


Investigate Distributions with Numbers

Part 1 (10 points)

1. Describe three different ways to measure the center of a data set. Give an example where one measure of the center is preferred over another.

The three different ways to measure the center of a data set are the mean, the median, and the mode. First, the mean is the sum of all the values in the data set divided by the number of values in the data set. Secondly, the median is the middle value of a data set that has been arranged in ascending order; when the number of values is even, it is the average of the two middle values. Lastly, the mode is the value that occurs most frequently in the data set.

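The median is preferred over the mean when the data are skewed or contain outliers, because it is less affected by extreme values. For example, in a set of household incomes that includes a few very high earners, the mean is pulled upward by those few values, whereas the median still reflects a typical household.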
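The effect of an outlier on each measure of center can be illustrated with a short sketch. The salary figures below are hypothetical and are not taken from the assignment; the last value is a deliberate outlier.

```python
import statistics

# Hypothetical salaries in thousands of dollars (an assumption for illustration).
# The last value is a deliberate outlier.
salaries = [30, 32, 32, 35, 38, 40, 250]

mean_salary = statistics.mean(salaries)      # pulled upward by the outlier
median_salary = statistics.median(salaries)  # middle value of the sorted data
mode_salary = statistics.mode(salaries)      # most frequent value

print(f"mean   = {mean_salary:.2f}")
print(f"median = {median_salary}")
print(f"mode   = {mode_salary}")
```

In this sketch the mean lands well above every value except the outlier, while the median stays among the typical values.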

2. Explain the quartiles of a distribution in terms of percentiles

The quartiles of a distribution are the first quartile, the second quartile (which is the median), and the third quartile. The first quartile is equivalent to the 25th percentile, the second quartile to the 50th percentile, and the third quartile to the 75th percentile.
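The correspondence between quartiles and percentiles can be checked with a minimal sketch. It uses Python's `statistics.quantiles` with the inclusive interpolation method, and the data values are hypothetical.

```python
import statistics

# Hypothetical data set (an assumption for illustration).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# n=100 returns the 99 percentile cut points; index 24 holds the
# 25th percentile, index 49 the 50th, and index 74 the 75th.
percentiles = statistics.quantiles(data, n=100, method="inclusive")
p25, p50, p75 = percentiles[24], percentiles[49], percentiles[74]

print(q1, q2, q3)
print(p25, p50, p75)
```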

3. Describe the different components of a box plot. Use the items included in the five-number summary.

The components of a box plot include the following:

1. Minimum – This is the smallest number in the data set

2. First Quartile – When the data set is arranged in ascending order and split into four equal groups, the first quartile is the value that marks the end of the lowest quarter of the data

3. Median – When the data set is arranged in ascending order, the median is the value in the middle of the data set

4. Third Quartile – When the data set is arranged in ascending order and split into four equal groups, the third quartile is the value that marks the start of the highest quarter of the data

5. Maximum – This is the largest number in the data set
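The five components listed above can be computed together. The following is a sketch only: `five_number_summary` is a hypothetical helper name, the data are made up, and the inclusive quantile method is one of several conventions for locating quartiles, so other tools may return slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical data set (an assumption for illustration).
sample = [2, 4, 4, 5, 7, 8, 9, 11, 15]

summary = five_number_summary(sample)
print(summary)
```

A box plot draws the box from Q1 to Q3, a line inside the box at the median, and whiskers out toward the minimum and maximum.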

4. Describe the IQR rule for identifying outliers. Then, create a mock data set with at least 12 data points and with at least two outliers. Justify the outliers by applying the IQR rule.

IQR is calculated by subtracting the 1st Quartile from the 3rd Quartile

The rule for identifying outliers is as follows:

Lower fence: multiply IQR by 1.5 and subtract the result from the 1st Quartile

Upper fence: multiply IQR by 1.5 and add the result to the 3rd Quartile

Any numbers that lie below the lower fence or above the upper fence are outliers

Consider the following data set

Minimum

1st Quartile

Median

3rd Quartile

Maximum

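IQR = 36.75 – 24.75 = 12

Lower fence = 24.75 – (12 × 1.5) = 6.75

Upper fence = 36.75 + (12 × 1.5) = 54.75

The outliers are 55 and 65, because both lie above the upper fence of 54.75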
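The IQR rule can be written as a small function: the lower fence is Q1 minus 1.5 times the IQR, and the upper fence is Q3 plus 1.5 times the IQR. The sketch below applies it to the quartiles used in this answer (24.75 and 36.75); the function names `iqr_fences` and `is_outlier` are hypothetical, and the value 30 is an arbitrary non-outlier chosen for comparison.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the k x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, q1, q3):
    """True if value lies below the lower fence or above the upper fence."""
    lower, upper = iqr_fences(q1, q3)
    return value < lower or value > upper

# Quartiles taken from the worked answer above.
lower, upper = iqr_fences(24.75, 36.75)
print(lower, upper)

# 55 and 65 are the claimed outliers; 30 is an arbitrary value for contrast.
for value in (30, 55, 65):
    print(value, is_outlier(value, 24.75, 36.75))
```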

5. Write a short paragraph that defines standard deviation and explains its importance. Explain the difference between population standard deviation and sample standard deviation

Standard deviation is a metric that indicates the dispersion of a data set around its mean. It is computed as the square root of the variance, which is the average squared distance of every data point from the mean. If the data points lie far from the mean, the data set has a high standard deviation, and vice versa. The population standard deviation is used when the data cover every member of the population and divides the sum of squared deviations by N, whereas the sample standard deviation is used when the data are only a sample from the population and divides by n - 1.

6. Find the sample standard deviation of the following data set {10, 12, 16, 20, 22}. Show all steps of the calculation.

Sample Standard Deviation = √[Σ(x - µ)² / (n - 1)]

Step 1: Find the mean µ

µ = (10 + 12 + 16 + 20 + 22) / 5 = 80 / 5 = 16

Step 2: Find the square of the distance (x - µ)²

X     (X - µ)²
10    36
12    16
16    0
20    16
22    36

Σ(x - µ)² = 104

Step 3: Find Standard Deviation

SD = √[104 / (5 - 1)] = √26 ≈ 5.10
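The three steps can be reproduced in a few lines as a check on the arithmetic. This sketch computes both the sample standard deviation (dividing by n - 1) and the population standard deviation (dividing by N) for the data set in the question, and cross-checks them against Python's `statistics` module.

```python
import math
import statistics

data = [10, 12, 16, 20, 22]

# Step 1: mean
mean = sum(data) / len(data)

# Step 2: squared distance of each value from the mean
squared_devs = [(x - mean) ** 2 for x in data]

# Step 3: square root of the averaged squared distances
sample_sd = math.sqrt(sum(squared_devs) / (len(data) - 1))   # divides by n - 1
population_sd = math.sqrt(sum(squared_devs) / len(data))     # divides by N

print(mean, squared_devs)
print(round(sample_sd, 3), round(population_sd, 3))

# Cross-check against the standard library implementations.
print(math.isclose(sample_sd, statistics.stdev(data)))
print(math.isclose(population_sd, statistics.pstdev(data)))
```

The sample value is the larger of the two because it divides the same sum of squares by a smaller number.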

7. The prices of a gallon of gasoline at 12 New York City gas stations in August 2016 were:

Based on this data set of 12 gas stations:

a) Find the mean price of gasoline.

Mean price of gas = $34.14 / 13 = $2.63

b) Find the median price of gasoline

When arranged from smallest to largest, the data set becomes as follows:

The median price is the one in the middle, which is $2.49

c) Find the range of gasoline prices

The range is the difference between the highest and the lowest value. The highest value is $3.99 whereas the lowest value is $2.15

Therefore, the range is $1.84

d) Find the five-number summary for gasoline prices.

Minimum = $2.15

First Quartile = $2.19

Median = $2.49

Third Quartile = $2.84

Maximum = $3.99

8. What is the 68-95-99.7 rule for a normal distribution?

The 68-95-99.7 rule for normal distribution states that in a normal distribution that has a mean µ and standard deviation σ,

i. Roughly 68 percent of the observations lie within σ of the mean µ

ii. Roughly 95 percent of the observations lie within 2σ of the mean µ

iii. Roughly 99.7 percent of the observations lie within 3σ of the mean µ (Moore, 2010).
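The three percentages in the rule can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

Because every normal distribution can be rescaled to the standard normal, the same proportions hold for any mean µ and standard deviation σ.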

9. Find three different items that are normally distributed. Give references used

Three things that take the normal distribution include body temperature, the diameter of a tree and the sizes of shoes (Weiers, 2010).

10. What is meant by the phrase standard normal distribution?

This is a normal distribution that has a mean of 0 and a standard deviation of 1

11. Explain what a z-score is and why it is important

A z-score is a numerical measurement of a value’s relationship to the mean of a set of values, expressed in standard deviations. If the z-score is 0, the score is equal to the mean. If the z-score is positive, the score is higher than the mean, whereas if the z-score is negative, the score is lower than the mean. The importance of the z-score is that it specifies the distance from the mean by counting the number of standard deviations between the value X and the mean µ (Gravetter and Wallnau, 2008).
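The z-score is the distance between a value and the mean, divided by the standard deviation. A minimal sketch follows; the mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical distribution (an assumption for illustration): mean 100, SD 15.
for value in (85, 100, 130):
    print(value, z_score(value, mean=100, sd=15))
```

A value equal to the mean gives a z-score of 0, a value below the mean gives a negative z-score, and a value above the mean gives a positive one.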

12. How can one determine from a histogram if a distribution is approximately normal? 

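A distribution is approximately normal if its histogram is roughly symmetric and bell-shaped, with a single peak near the center and tails that taper off evenly on both sides. If the histogram is skewed to the left or to the right, the distribution is not approximately normal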

Part 2

1. Use Excel to obtain the following. Place your results in your Word file.

a) Find the five-number summary for the following data below. Hint, use the Excel statistic function called QUARTILE

Minimum = 3

First Quartile = 26.5

Median = 33.5

Third Quartile = 37

Maximum = 57

b) Find the IQR and use it to determine if there are any outliers. 

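IQR = 3rd Quartile – 1st Quartile = 37 – 26.5 = 10.5

Determining outliers

Lower fence = 1st Quartile – (IQR × 1.5) = 26.5 – 15.75 = 10.75

Upper fence = 3rd Quartile + (IQR × 1.5) = 37 + 15.75 = 52.75

The minimum (3) lies below the lower fence and the maximum (57) lies above the upper fence, so both are outliers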

2. Use Excel to determine the mean and sample standard deviation for the data given in Problem 1 of Part 2. Hint, use the Excel functions AVERAGE and STDEV. 

Mean = 31

Standard Deviation = 10.66721

3. The supply manager of a university orders all supplies, including items for the athletics department. Before the football season, he must develop a separate inventory list for the football team. This list will include supplies for both the players and the department itself. Although the department budget is set in terms of inventory (based on historical data), the football team’s needs change based on the size of the team as well as its individual players. The historical data shows that the number of gallons of Gatorade consumed by a football team during a game follows a normal distribution with mean 20. The standard deviation is 3. To help with the decision of how much Gatorade to order for each game, the supply manager would like to know the following information. 

Cite This Paper
"Using Numbers For Investigate Distributions" (2018, February 11) Retrieved April 21, 2026, from
https://www.paperdue.com/essay/numbers-investigate-distributions-data-analysis-chapter-2177543