Reliability & Validity for the Lay Person

Reliability & Validity

For the lay person, the notion of personality is often derived from components of an individual's character or makeup that elicit positive or negative reactions from other individuals. The person who tends to elicit positive reactions from others is often thought to have a 'good' personality. Conversely, the person who tends to elicit less favorable reactions may be thought to have a 'bad' personality. However, when behavioral and social scientists seek to describe and define personality, the terminology used is far more rigorous than a description of simple social skills (Cohen, Montague, Nathanson & Swerdlik, 1988). As such, constructs such as personality traits, personality states and personality types have been studied as a means of providing clinically accurate ways to define personality.

Nevertheless, there is no one globally accepted definition of personality within the scholarly literature. McClelland (1951, p. 69) defined personality as "the most adequate conceptualization of a person's behavior in all its detail." Menninger (1953, p. 23), by contrast, defined personality as "the individual as a whole, his height and weight and love and hates and blood pressure and reflexes; his smiles and hopes and bowed legs and enlarged tonsils. It means all that anyone is and that he is trying to become." Even so, some components and constructs of personality have been more widely accepted than others.

Factor Analysis in Constructing Personality Tests

Because there are so many ways to describe an individual's personality, those interested in personality frequently use a statistical tool known as factor analysis to simplify the enormous amount of information available by placing similar information into clusters. The premise behind factor analysis is that if two or more characteristics correlate, they may reflect a shared underlying trait, so that patterns of correlations reveal the trait dimensions existing beneath the measured qualities (Tabachnick & Fidell, 2005). Factor analysis is a more complex version of correlation, however, in the sense that instead of examining the correlation between just a few variables, factor analysis uses a great number of correlations among a great number of variables (Kline, 1994).

In order for factor analysis to be completed, the researcher first collects data on many variables from a large number of individuals. The data can be collected in any number of ways, but what is important is that the same data are collected from everyone participating. Once the data are collected, the researcher calculates the correlations between every possible pair of variables. Variables that correlate highly with one another form clusters, and each cluster is taken to represent a factor. In personality research, a factor is commonly viewed as a reflection of a personality trait (Gorsuch, 1983). Researchers use factor analysis to construct and refine personality tests. Although the label of a factor is something inferred from a cluster of correlating variables, there is an assumption that personality test scores directly reflect the individual's personality traits with little to no error. Factor analysis is considered useful in personality testing for three reasons. It simplifies the various ways a person is understood by reducing the information to smaller, more manageable sets of personality traits. It provides a basis for judging that some traits may be more important than others when they are derived from large, highly correlating clusters. And it is very useful in creating personality measures. However, it is important to remember that the usefulness of factor analysis is contingent upon the information that the researcher inputs; as a result, the factors that emerge are largely dependent on the kind of data collected and the variables that were included in the analysis (Kline, 1994).
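The correlation step described above can be sketched in a few lines of Python. This is only an illustration of computing a correlation for every pair of variables, not a full factor analysis: the item names and scores are invented for the example, the helper names `pearson` and `pairwise_correlations` are hypothetical, and Pearson's coefficient is an assumed choice that the text does not specify.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def pairwise_correlations(data):
    """Correlate every pair of variables; data maps variable name -> scores."""
    return {(a, b): pearson(data[a], data[b]) for a, b in combinations(data, 2)}

# Hypothetical ratings (1-5) from six respondents on three made-up items.
responses = {
    "talkative": [5, 4, 2, 1, 4, 2],
    "outgoing":  [5, 5, 1, 2, 4, 1],
    "punctual":  [3, 1, 4, 2, 5, 3],
}

for pair, r in pairwise_correlations(responses).items():
    print(pair, round(r, 2))
```

In this made-up data the two sociability items correlate strongly with each other and only weakly with the punctuality item, which is the kind of cluster a factor analysis would group under a single factor. A real analysis would go on to extract factors from the full correlation matrix using a statistics package.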

Reliability of Personality Tests

Reliability in personality testing is a measure of consistency. If a measure (trait) on a personality test is considered reliable, then there is the expectation that almost identical scores would be achieved on a retest. The smaller the difference between the two sets of scores, the more reliable the measure is said to be. Reliability of a measure is assessed with the correlation coefficient, which ranges from +1.00 to -1.00. The correlation coefficient measures the strength of the relationship between two variables. For example, if a coefficient approaches plus or minus 1.00, a strong relationship is indicated, with +1.00 reflecting a perfect positive relationship and -1.00 a perfect negative relationship. If the result is 0.00, then no relationship is indicated (Joint Committee, 1999).
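For readers who want the formula, one common form of the correlation coefficient is Pearson's r. This is an assumption, since the text does not name a specific coefficient, and the symbols are introduced here only for illustration: x_i and y_i are the i-th person's scores on the first and second testing, n is the number of people tested, and the barred symbols are the mean scores.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

When every person's score moves in step across the two testings, the numerator equals the denominator and r is +1.00; when the scores are unrelated, the products in the numerator cancel out and r is near 0.00.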

The most frequently used method of establishing reliability in personality testing is the test-retest method. In this method, the same individuals are tested at two separate points in time, and a correlation coefficient is calculated to ascertain whether the scores on the initial test are related to the scores on the subsequent test. A high correlation coefficient indicates to the researcher that individuals' scores on the initial test are very similar to their scores on the subsequent test. The importance of reliability, then, is that if the measurements are not consistent, there is no value or benefit to be derived from the personality test (Joint Committee, 1999).
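A test-retest check of this kind might look like the following sketch. The scores are invented, the function name `retest_reliability` is hypothetical, and the use of Pearson's coefficient is an assumption rather than something the text specifies.

```python
from math import sqrt

def retest_reliability(first, second):
    """Correlation between scores from two administrations of the same test.

    first[i] and second[i] must be the same person's scores.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need paired scores from at least two people")
    n = len(first)
    mean_1, mean_2 = sum(first) / n, sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_1 * var_2)

# Hypothetical trait scores for five people, tested twice some weeks apart.
time_1 = [12, 18, 25, 30, 22]
time_2 = [13, 17, 26, 29, 23]

r = retest_reliability(time_1, time_2)
print(f"test-retest correlation: {r:.2f}")
```

Here each person's second score stays close to the first, so the coefficient comes out close to +1.00. Scores that shifted unpredictably between testings would pull it toward 0.00.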

Validity of Personality Tests

Validity in personality testing refers to research that offers evidence that a test actually measures what it is intended to measure. Test validity is very important because it gives the test taker a level of assurance that the information derived about him or her is accurate. The rules for establishing test validity consider four areas of evidence: (a) evidence from response processes, or response process validity; (b) evidence from test content, or content validity; (c) evidence from test structure, or structural validity; and (d) evidence based on relations to other variables, or criterion validity (Joint Committee, 1999).

Response process validity looks at the mental processes an individual goes through in arriving at an answer on the test. There is an assumption by the test creator that the person's cognitive processes reflect what the test is designed to measure. In content validity, the question asked is "are the items used in collecting the data accurately representative of the specific domain?" (Joint Committee, 1999). There are both informal and formal ways to determine content validity. One method frequently used is the item or logical sampling approach, which uses a multiple-step process involving a careful definition of the domains of behavior from which items are sampled. Structural validity seeks to answer the question of how many things the test measures. It looks at the degree to which all the items on the test rise and fall together, or whether one set of items rises and falls in one pattern while other groups of items rise and fall in a different pattern.

Evidence based on relations to other variables, or criterion validity, helps in determining whether the test provides accurate predictions as it is designed to do. Criterion validity involves the relation of test scores to other variables and can be examined through discriminant validity, convergent validity, and of course criterion validity itself (Joint Committee, 1999).

Discriminant validity indicates that the test should not correlate with concepts or constructs different from those the test is designed to measure. Convergent validity indicates that the test should correlate highly with other tests that examine the same concepts. Within criterion validity, the test scores should relate to a criterion determined to be significant. Subtypes under criterion validity include predictive relationships, in which the test scores predict future outcomes; concurrent relationships, in which the test score and criterion are evaluated simultaneously; and post-dictive relationships, in which the test score predicts backward to the individual's historical data.

Applicability of Personality Tests

With regard to whether personality tests can be used as a means of projecting potential or continued employment success, it is most important first to ensure that the personality test used has a fairly high level of both validity and reliability. Without these two qualities, the usefulness of the test results will be significantly diminished, because the results will have low consistency and the test will fail to measure what it is intended to measure. The data collected from such personality tests, then, would be of little use from a historical perspective for continued employment. If personality tests are used as a pre-employment measure for a position that requires certain qualities or behavioral traits, again the tests should be reliable and valid. If an instrument is unreliable, it would prove difficult to relate the behaviors to any kind of theoretical behavioral model, as the results will vary instead of being consistent.

Although personality tests cannot determine with 100% accuracy how an individual will perform or continue to perform tasks, they do provide some valuable information as to what may be expected regarding performance, particularly if there is historical test data with which to compare it. However, because of the level of predictability that can be derived using personality tests that are determined reliable and valid, a certain amount of conjecture can be used…

