- Length: 5 pages
- Subject: Education - Mathematics
- Type: Term Paper
- Paper: #5225693

For this review of statistical methods, the following data table will be used. The data measure the tar, nicotine, and carbon monoxide (CO) produced when a given cigarette brand is smoked, along with the weight of each cigarette. The data presented below are taken from Mendenhall and Sincich (1992) and are a subset of the data produced by the Federal Trade Commission. The data set was submitted by Lauren McIntyre, Department of Statistics, North Carolina State University.

Columns: Brand, Tar (mg), Nicotine (mg), Weight (g), Carbon Monoxide (mg)

Brands: Alpine, Benson & Hedges, Bull Durham, Camel Lights, Carlton, Chesterfield, Golden Lights, Kent, Kool, L&M, Lark Lights, Marlboro, Merit, Multi-Filter, Newport Lights, Now, Old Gold, Pall Mall Light, Raleigh, Salem Ultra, Tareyton, True, Viceroy Rich Light, Virginia Slims, Winston Lights

Statistical data can be categorized into the following groups.

Statistical Data

Categorical Data
Can be divided into: Nominal Data, Ordinal Data
Also called: Non-metric Data, Qualitative Data, Nonparametric Data, Attribute Data

Continuous Data
Can be divided into: Interval Data, Ratio Data
Also called: Metric Data, Quantitative Data, Parametric Data, Variable Data

Nominal Data is data that can be categorized but cannot be ranked by intensity or magnitude. Examples of nominal data include political parties, religions, and favorite flavors of ice cream. Ordinal Data is data that can be categorized and ranked by class, but whose magnitude cannot be measured. For example, ordinal data can be rated on a scale such as 'Excellent-Good-Fair-Poor-Bad.' Interval Data is data that can be categorized and ranked, and whose magnitude can be measured in equal units, but for which zero is an arbitrary point rather than the absence of the trait. For example, student Grade Point Averages and SAT scores are commonly treated as interval data: they can be measured and then compared across groups defined by the age, gender, or nationality of the student. Ratio Data is data that can be categorized and ranked, whose magnitude can be measured, and for which a score of zero is a valid score that represents the total absence of the trait being measured. For example, a person's height, or a temperature on the Kelvin scale, can be used in ratio data calculations.
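The practical consequence of these scales is that each one supports a different set of summary statistics. The following is a minimal Python sketch of that idea; the survey responses, the variable names, and the choice of summaries are all assumptions made up for illustration and do not come from the cigarette data.

```python
from collections import Counter
from statistics import mean, median_low

# Hypothetical survey responses, invented for illustration only.
flavors = ["vanilla", "chocolate", "vanilla", "strawberry", "vanilla"]  # nominal
ratings = ["Good", "Fair", "Excellent", "Good", "Poor"]                 # ordinal
heights_cm = [160.0, 172.5, 181.0, 168.5]                               # ratio

# Nominal: only counting is meaningful, so the mode is the natural summary.
flavor_mode = Counter(flavors).most_common(1)[0][0]

# Ordinal: the categories have an order, so a median rank is meaningful,
# but the distance between two ranks is not.
scale = ["Bad", "Poor", "Fair", "Good", "Excellent"]
ranks = [scale.index(r) for r in ratings]
median_rating = scale[median_low(ranks)]

# Ratio: zero means "none of the trait", so means and ratios both make sense.
mean_height = mean(heights_cm)
tallest_to_shortest = max(heights_cm) / min(heights_cm)

print(flavor_mode, median_rating, mean_height, tallest_to_shortest)
```

Taking a mean of the flavor labels would be meaningless, and a mean of the rating ranks would assume equal spacing between 'Fair' and 'Good' that an ordinal scale does not guarantee.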

A frequency distribution measures how often particular values occur across a given sample. A chart or table showing how often each value or range of values of a variable appears in a data set is considered a frequency distribution. For example, tabulating the number of accidents experienced by each driver within the population of teenage drivers would create a frequency distribution. Central tendency is a measure of the location of the middle or center of a distribution. The mean, or average value, is the most commonly used measure of central tendency. Calculated from the cigarette data above, the mean tar content of a cigarette is 12.216 milligrams.
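As a rough illustration of both ideas, the sketch below builds a frequency distribution over fixed-width bins and computes two measures of central tendency. The tar readings are hypothetical values invented for this example, not the figures from the cigarette table, and the helper name `bin_label` and the bin width of 10 mg are likewise assumptions.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical tar readings in mg, invented for illustration;
# these are NOT the values from the cigarette table.
tar_mg = [4.1, 8.0, 12.4, 14.1, 15.0, 16.0, 16.6, 17.0, 23.5, 29.8]

def bin_label(value, width=10):
    """Return the label of the fixed-width bin that contains value."""
    low = int(value // width) * width
    return f"{low}-{low + width}"

# Frequency distribution: how many readings fall into each bin.
frequency = Counter(bin_label(v) for v in tar_mg)

# Central tendency: the mean and the median of the same readings.
tar_mean = mean(tar_mg)
tar_median = median(tar_mg)

for label in sorted(frequency, key=lambda s: int(s.split("-")[0])):
    print(label, "#" * frequency[label])
print(tar_mean, tar_median)
```

Printing one `#` per reading gives a crude text histogram, which is the same information a frequency chart would show.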

A weighted average is a mean in which each value is multiplied by a weight that reflects its relative importance, such as the share of the sampled population it represents. Returning to our teenage driver example, an average number of accidents per 1000 teen drivers produces only a general figure. A measurement that could be applied more accurately to all teens would use weighted averages that take into account factors such as drugs and alcohol, or the number of passengers in the vehicle, and how heavily each factor weighs on the occurrence of accidents among teen drivers. From a weighted average computation of this type, probability distributions could be plotted for the likelihood of a teen accident, based on the additional factors.
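The difference between a simple and a weighted average can be sketched in a few lines of Python. The three driver groups, their accident rates, and their sizes below are hypothetical numbers chosen only to make the arithmetic visible; the function name `weighted_average` is also an assumption of this sketch.

```python
# Hypothetical groups of teen drivers, invented for illustration:
# (label, accidents per 1000 drivers, number of drivers in the group)
groups = [
    ("driving alone", 40.0, 6000),
    ("with passengers", 70.0, 3000),
    ("impaired", 200.0, 1000),
]

def weighted_average(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

rates = [rate for _, rate, _ in groups]
sizes = [size for _, _, size in groups]

# The simple mean treats every group as equally common.
simple_mean = sum(rates) / len(rates)
# The weighted mean counts each group in proportion to its size.
weighted_mean = weighted_average(rates, sizes)

print(simple_mean, weighted_mean)
```

With these made-up figures the simple mean is about 103.3 accidents per 1000 drivers, while the weighted mean is 65.0, because the small high-risk group no longer counts as much as the large low-risk one.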

Data sets that follow a normal distribution will typically fall within a bell-shaped curve, often just called the bell curve, which measures the occurrence of most scores…

Nonlinear trends in statistical data can be the most challenging to work with. When non-linear relationships exist, there may be a mathematical relationship based on a logarithm or on the influence of several factors. However, some data, such as the height and weight of the individual people who shop in a given department store, may leave the statistician without any usable relationship whatsoever. Non-linear data can also be the result of data that is being acted on by an artificial, outside force. In this case, the statistician can verify that an outside force exists and then set about identifying it.
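One way to check for a logarithmic relationship is to transform the data and see whether it becomes linear. The sketch below assumes a hypothetical data set generated from y = 3 + 2 ln(x) and compares the Pearson correlation before and after taking the logarithm of x; the data, the formula, and the helper `pearson_r` are illustrative assumptions rather than anything taken from the cigarette table.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data generated from y = 3 + 2 * ln(x), for illustration only.
xs = [1, 2, 4, 8, 16, 32, 64, 128]
ys = [3 + 2 * math.log(x) for x in xs]

# Correlation on the raw scale understates the strength of the relationship.
r_raw = pearson_r(xs, ys)
# After transforming x with a logarithm, the relationship is exactly linear.
r_log = pearson_r([math.log(x) for x in xs], ys)

print(r_raw, r_log)
```

A correlation that rises sharply after the transformation suggests the underlying relationship is logarithmic rather than absent.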

An example of this situation is the expected relationship between supply and demand, and company profit based on the sales of a given product in the marketplace. In the early 1980s, the Coleco company produced a product called "Cabbage Patch dolls." The typical life cycle of a new toy product is one to two years, but Coleco was able to extend the life of its product for four to five Christmas seasons by artificially affecting the relationship between supply and demand. The company had the capacity to produce four to five times the number of dolls that it shipped to the market during the first three years of the doll's life cycle. Producing at capacity would have resulted in a typical bell-shaped curve, with rising demand and increasing profits giving way to declining demand and declining profits within a short period. However, the company did not produce product equal to its capacity, nor equal to the demand. As a result, the company was able to sustain a high level of demand, and an inflated retail price based on that demand, for an extended period. The result was that the doll stayed popular for almost a decade, and the company was able to reap ongoing higher levels of profit. The longer bell curve, identified by an irregular and nonlinear relation between time and supply and demand, was created by the company's unique marketing strategy.