There are a number of mathematical and statistical tools that businesses use to survive and thrive in their respective markets. Some of the math involved is quite basic. Examples of such basic operations include percentages, standard deviations and so forth. However, there are fields where much more intricate mathematics is involved, and advanced statistics is a common example. This brief report specifically covers the use of multivariate statistics. Three of the more common forms of multivariate statistics are factor analysis, cluster analysis and multidimensional scaling. While making things overly complex from a numbers standpoint is usually not wise, there are situations where more robust and complex analysis is necessary or advantageous.
The first multivariate method to be discussed is factor analysis. Factor analysis is a technique used to reduce a large number of variables down to a smaller list of factors. The goal of the technique is to extract the maximum common variance from the variables in question and combine it into a common score. When the factors in question are indexed and put together, they can be used for scoring and further analysis. Factor analysis is part of the larger general linear model, or GLM. There are also some common assumptions when it comes to factor analysis. These include that there is a linear relationship among the variables involved, that there is no multicollinearity, and that there is a true correlation between the variables and the factors. With all of that being said, there are several common types of factor analysis. The most common of these is known as principal component analysis, or PCA. It starts by extracting the maximum variance, which is assigned to the first factor. The variance explained by the first factor is then removed, and the maximum remaining variance is extracted for the second factor. This process is repeated until the last factor is reached. The second most common method is known as common factor analysis. It extracts the common variance and places that variance among the factors within the analysis. The unique variance of the individual variables is not included. This is the method used in SEM, which in this context is short for structural equation modeling (Statistics Solutions, 2017). It should not be confused with the standard error of the mean, which shares the abbreviation and describes the spread that the means of samples would show if one were to keep drawing samples of values and scores (Sports CI, 2017). The other common methods of factor analysis are image factoring, the maximum likelihood method, least squares and alpha factoring.
Finally, there is weighted least squares, which is a regression-based method used for factoring (Statistics Solutions, 2017). Factor analysis is commonly used as a marketing tool to help analyze and assess the market landscape. The research firm to which this report is being presented can do the same when it comes to its own marketing and the recommendations it gives to its clients (B2B International).
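The extract-and-remove procedure described for principal component analysis can be illustrated with a short sketch. The following is only a minimal illustration and not the method of any particular statistical package: it uses power iteration with deflation, which is one of several ways to find the components, and the survey ratings, function names and iteration count are all hypothetical choices made for this example.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance matrix for a list of equal-length observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_component(matrix, iterations=500):
    """Power iteration: the direction of maximum variance and the variance along it."""
    p = len(matrix)
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    vec = [rng.random() for _ in range(p)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # no variance left to extract
            return 0.0, vec
        vec = [x / norm for x in nxt]
    variance = sum(vec[i] * matrix[i][j] * vec[j]
                   for i in range(p) for j in range(p))
    return variance, vec


def principal_components(rows, k):
    """Extract k components in turn, removing explained variance after each one."""
    cov = covariance_matrix(rows)
    p = len(cov)
    components = []
    for _ in range(k):
        variance, vec = leading_component(cov)
        components.append((variance, vec))
        # Deflation: subtract the variance already explained by this component.
        cov = [[cov[i][j] - variance * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    return components


# Hypothetical survey ratings: (price satisfaction, quality satisfaction).
ratings = [(2, 3), (4, 5), (6, 5), (8, 9), (10, 8)]
components = principal_components(ratings, 2)
total = sum(variance for variance, _ in components)
for number, (variance, direction) in enumerate(components, start=1):
    print(f"Factor {number}: {variance / total:.1%} of total variance")
```

In this made-up data set the two ratings move together, so the first factor should carry most of the variance and the second only a small remainder, which is the kind of reduction the technique is meant to achieve.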
A method of multivariate statistics that complements and often accompanies factor analysis is cluster analysis. Further, the two commonly appear in a sequence that runs, in order, factor analysis, cluster analysis and discriminant analysis. Also like factor analysis, cluster analysis is typically about trying to group widely scattered data points that are not obviously related. It is a method that is used to identify structures within the data. It is also commonly referred to as segmentation analysis or taxonomy analysis. The idea is to identify homogeneous groups within wider arrays of data. The analysis is explorative in nature and does not distinguish between dependent and independent variables. This is a method of analysis that can commonly be done with statistical packages like IBM's SPSS and similar statistical suites. Cluster analysis can handle several types of data, including binary, nominal, ordinal and scale. The scale type can be further divided into interval and ratio sub-types. Like the aforementioned factor analysis, cluster analysis is also sensitive to multicollinearity among the variables. Industries and fields that commonly use cluster analysis include medicine, marketing, education and biology. A real-world example of this method being used is when the organizations behind standardized tests assemble the data after the fact. There will be clusters and groupings of scores, and those groupings can be parsed and analyzed. The research firm to which this report is being presented can use this method itself to help arrange and interpret data that falls across a wide and seemingly random plotting surface, such as performance scores or test scores relating to whether employees know what they need to know to do their jobs (Statistics Solutions, 2017).
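The idea of finding homogeneous groups of test scores can be sketched in a few lines. The example below uses k-means, which is only one of many clustering techniques and is chosen here for its simplicity rather than because the sources cited above prescribe it. The scores, the starting centers and the function name are all hypothetical.

```python
def k_means_1d(scores, centers, max_rounds=100):
    """Assign each score to its nearest center, then move centers to group means."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(max_rounds):
        groups = [[] for _ in centers]
        for score in scores:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(score - centers[i]))
            groups[nearest].append(score)
        # An empty group keeps its previous center.
        updated = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
        if updated == centers:  # assignments have stopped changing
            break
        centers = updated
    return centers, groups


# Hypothetical employee test scores and rough starting guesses for three groups.
scores = [52, 55, 58, 71, 74, 76, 90, 93, 95]
centers, groups = k_means_1d(scores, centers=[50, 75, 100])
for center, group in zip(centers, groups):
    print(f"center {center:.1f}: {group}")
```

No group labels are supplied in advance, which reflects the explorative nature of the method: the groupings come out of the data rather than from a dependent variable.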
The first two types of statistical analysis mentioned in this report were complementary and sequential in nature. However, this is not the case with multidimensional scaling, or MDS. Indeed, MDS is seen as a direct alternative to the factor analysis discussed earlier. The goal of MDS is to detect meaningful underlying dimensions that help explain the similarities or dissimilarities observed between the objects in the data set in question. In factor analysis, the similarities between objects, such as variables, are expressed within the correlation matrix. MDS, by contrast, can be used with any sort of similarity or dissimilarity matrix, including correlation matrices. A real-world example of this would be if an airline or other travel firm wanted to assess the distances from one city to another across a number of different cities. For example, an airline could include cities like New York, Houston, Miami, Kansas City, Denver and Seattle (just to give a short list) and plot the distances between those cities on a two-dimensional basis. In other words, the distances from New York to all of the other cities would be recorded. The process would then be repeated for all of the other combinations of cities. The research firm for which this report is being prepared could use this same overall tactic in a number of ways, including how long it will take to ship research results to a firm, the time it will take to carry out each method of research, the quality and performance of the results involved and so forth. The way in which the data is arranged and placed varies based on the application. There are also several tools for judging how well a solution reproduces the original data, including measures of goodness of fit and the Shepard diagram. One thing that is extremely important to get right is the number of dimensions involved. The city-to-city network mentioned before would typically be handled in two dimensions.
Those dimensions would be north/south and east/west. There are some situations where a third dimension (or more) might matter. If the routes of a ski slope are being looked at, the elevations involved (up/down, the third dimension) might matter a great deal. On a relatively flat surface, elevation may not matter at all. Locations spread across a flatter state like Kansas would be a good example. The goal is to settle on the right number of dimensions before getting into the thick of the analysis. There should be precisely as many dimensions as are needed, no more and no less (Stat Soft, 2017).
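The city-to-city example can be sketched with classical (metric) MDS, which recovers a set of coordinates from nothing more than a table of distances. This is a minimal illustration under stated assumptions: the four offices and their distances are hypothetical, the function names are invented for this example, and the power iteration used to find the dimensions is one simple option among several.

```python
import math
import random


def double_center(dist):
    """Convert distances to the centered inner-product matrix B = -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]  # equals the column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def top_eigenpair(matrix, iterations=1000):
    """Power iteration for the largest eigenvalue and its unit eigenvector."""
    n = len(matrix)
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    vec = [rng.random() for _ in range(n)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # no structure left to extract
            return 0.0, vec
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * matrix[i][j] * vec[j]
                for i in range(n) for j in range(n))
    return value, vec


def classical_mds(dist, dims=2):
    """Return one coordinate list per object, using the requested number of dimensions."""
    b = double_center(dist)
    n = len(b)
    axes = []
    for _ in range(dims):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))
        axes.append([scale * x for x in vec])
        # Remove the dimension just found before looking for the next one.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return [[axis[i] for axis in axes] for i in range(n)]


# Hypothetical distances between four offices, in arbitrary units.
names = ["A", "B", "C", "D"]
distances = [
    [0, 4, 5, 3],
    [4, 0, 3, 5],
    [5, 3, 0, 4],
    [3, 5, 4, 0],
]
coords = classical_mds(distances, dims=2)
for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

The recovered map may be rotated or mirrored relative to a real one, since only the distances between points are preserved. Because these hypothetical offices lie on a flat rectangle, two dimensions should be enough to reproduce every distance; data with a meaningful third dimension, such as elevation, would need `dims=3`.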