1. How do you interpret the reliability results for the clerical test and work sample? Are they favorable enough for the company to consider using them for keeps in selecting job applicants?
A primary objective of evaluating two new methods of assessing candidates for positions as telephone customer service representatives at the Phonemin Company is to improve the caliber of the employment pool from which new hires are selected. The participation of customer service representatives in the company's telephone ordering system is critical to this endeavor. Moreover, the company will be adding roughly 40 employees to the call center to meet the anticipated growth in phone order sales. It is therefore apparent that an effective means of assessing candidates for these positions is needed. The reliability figures for the candidate assessment system, which comprises a clerical work test and two work samples, are as follows:
For the clerical test, the alpha coefficients of 0.85 and 0.86 indicate reasonably high internal consistency reliability. The test-retest reliability is even higher, at a very solid 0.92.
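To make the two clerical test statistics concrete, the following is a minimal sketch of how a coefficient alpha and a test-retest correlation could be computed. The function names and the small data set are hypothetical illustrations, not the Phonemin Company's data or procedure, and sample variances are assumed for the alpha calculation.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for rows of item scores (one row per test taker)."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*item_scores))                    # one tuple per item
    sum_item_vars = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)


def pearson_r(x, y):
    """Pearson correlation, used here as the test-retest coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: five test takers, four items each.
items = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]
# Hypothetical total scores from a first and a second administration.
first_sitting = [18, 9, 18, 13, 6]
second_sitting = [17, 10, 19, 12, 7]

print("alpha:", round(cronbach_alpha(items), 2))
print("test-retest r:", round(pearson_r(first_sitting, second_sitting), 2))
```

Alpha rises as the items vary together, while the test-retest coefficient reflects how stable the same candidates' scores are across two sittings, which is why the two figures can differ for the same test.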
The work sample (T) inter-rater agreement figures are 88% and 79%, both of which are adequate for inter-rater reliability, although higher figures would be preferable. The work sample (C) inter-rater agreement figures are 81% and 77%. Again, these percentages are adequate, but lower than is typically desirable for inter-rater reliability.
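As an illustration of what an inter-rater agreement percentage represents, the sketch below computes simple percent agreement between two raters. The function name and the ratings are hypothetical, and exact-match agreement is an assumption, since the case does not state how agreement was scored.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of candidates to whom two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical ratings of eight work samples on a 1-5 scale.
rater_1 = [3, 4, 2, 5, 4, 3, 1, 4]
rater_2 = [3, 4, 3, 5, 4, 3, 1, 5]

print(f"agreement: {percent_agreement(rater_1, rater_2):.0f}%")
```

Percent agreement does not correct for matches that would occur by chance, which is one reason figures in the high 70s are treated here as adequate rather than strong.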
Overall, the company can be comfortable using the work samples and the clerical work test as measures to assist with the selection of candidates for positions as telephone order sales customer service representatives. The company should, however, consider taking steps to improve the inter-rater reliability of the work samples. Typical ways of improving inter-rater agreement are to increase the specificity of the rating protocol and to provide additional training and periodic recalibration of raters. These steps matter because a known weakness of inter-rater reliability is drift: over time, raters move away from the standards set by the rating protocol, and the distance between the raters' scores increases (Shoukri, 2010).
2. How do you interpret the validity results for the clerical test and work sample? Are they favorable enough for the company to consider using them for keeps in selecting new job applicants?
Validity is a pivotal metric for any assessment. Psychometricians take precautions to ensure that the validity of the instruments they use is sufficiently robust to warrant confidence in the measures. Validity figures are obtained through procedures designed to assess the particular type of validity under review. Researchers regularly use the following types of validity: content validity (types are face validity, curricular validity), criterion-related validity (types are predictive validity, concurrent validity), construct validity (types are convergent validity, discriminant validity), and consequential validity ("Validity evidence," 2014). Content validity examines the match between the content (often a subject area) that is being assessed and the actual test questions ("Validity evidence," 2014). Curricular validity expresses the extent to which a test matches the specific objectives of a curriculum that is part of a training or education process ("Validity evidence," 2014). Criterion-related validity examines the relationship between test scores and some relevant outcome, which in the assessment of candidates for employment is actual job performance ("Validity evidence," 2014). Construct validity is the degree to which a testing instrument or particular measure assesses the underlying theoretical construct ("Validity evidence," 2014). Consequential validity, a concept familiar to human resources personnel and social scientists, refers to the social consequences of using a certain test for a particular purpose ("Validity evidence," 2014).
Criterion-related validity is the type most commonly of interest in performance and other psychometric tests used to assess the suitability of candidates for employment ("Validity evidence," 2014). The relevant figures include the following: the work sample (T) and work sample (C) correlations are low and not significant. The criterion-related validity measures of the clerical test show significant correlations with a negative error rate and a…