Reliability Studies

Study Sample: 1568 assessment interviews rated by ASLPI evaluators
  • 1286 test candidates between 2008 and 2011
  • 82 re-rated interviews
Inter- and Intra-rater Reliability: a measure of the consistency between ratings of the same test taker, either from different evaluators (inter-rater) or from the same evaluator on separate occasions (intra-rater). It is reported as an intraclass correlation (ICC), either for a single rater or for the average rating across 3 raters.
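
As an illustration of how a single-rater ICC relates to the ICC for an average of 3 raters, here is a minimal sketch. It assumes a one-way random-effects model, which the study summary does not specify, and the ratings in `toy` are hypothetical values, not ASLPI data. The function name `one_way_icc` is also made up for this example.

```python
from statistics import mean


def one_way_icc(ratings):
    """Return (single-rater ICC, average-of-k ICC) under a one-way model.

    `ratings` is a list of rows; each row holds the k ratings
    given to one interview.
    """
    n = len(ratings)          # number of interviews
    k = len(ratings[0])       # number of raters per interview
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]

    # Mean square between interviews and mean square within interviews.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_average = (ms_between - ms_within) / ms_between
    return icc_single, icc_average


# Hypothetical ratings: 5 interviews, 3 evaluators each (not ASLPI data).
toy = [[1, 1, 2], [2, 3, 2], [3, 3, 4], [4, 4, 4], [5, 4, 5]]
icc_single, icc_average = one_way_icc(toy)
print(round(icc_single, 3), round(icc_average, 3))
```

Under this model the average-rater ICC equals the Spearman-Brown projection of the single-rater ICC, so averaging across 3 raters yields a higher coefficient than any one rater alone whenever the single-rater value is positive.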

Correlation Across All Possible Evaluator Pairs
  • Findings support the reliability of ASLPI ratings
  • Evaluators consistently rank-ordered interviewees for total scores, for holistic ratings, and for dimension scores
  • Evaluators rated performances consistently in terms of relative position on the ASLPI scale
Measure                 Correlation
Total Score             .90
Holistic Rating         .90
Dimensions
  Vocabulary            .82
  Grammar               .82
  Comprehension         .86
  Accent/Production     .80
  Fluency               .81
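
One plausible way to compute a correlation across all possible evaluator pairs is sketched below. The summary does not say how the pairwise values were combined, so taking their mean is an assumption, and the evaluator names and scores in `toy_scores` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pairwise_correlations(scores_by_evaluator):
    """Correlate every unordered pair of evaluators on the same interviews."""
    return {
        (a, b): pearson(scores_by_evaluator[a], scores_by_evaluator[b])
        for a, b in combinations(sorted(scores_by_evaluator), 2)
    }


# Hypothetical total scores for the same 5 interviews (not ASLPI data).
toy_scores = {
    "evaluator_A": [10, 14, 18, 22, 25],
    "evaluator_B": [11, 13, 19, 21, 24],
    "evaluator_C": [9, 15, 17, 23, 25],
}
pairs = pairwise_correlations(toy_scores)
mean_r = sum(pairs.values()) / len(pairs)  # assumed summary: mean over pairs
for pair, r in pairs.items():
    print(pair, round(r, 3))
```

A high value here reflects consistent rank-ordering of interviewees, which is a separate question from whether evaluators assign the same absolute level.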

Repeatability of ASLPI Ratings
  • Findings supported the repeatability of ASLPI ratings under the adjacent agreement standard (ratings within +/- 1 of each other), which is the current standard
  • When ASLPI interviews were re-rated, the evaluators' re-ratings agreed with the initial ratings to within the adjacent standard
Measure             Agreement (+/- 1) [Ref 90-100%]
Production          98%
Grammar             98%
Vocabulary          98%
Comprehension       98%
Holistic Rating     87%
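
The adjacent agreement percentages above can be computed along the following lines. This is a minimal sketch that assumes ratings are coded as integer scale steps, which the summary does not state; `agreement_rates` and the toy ratings are hypothetical.

```python
def agreement_rates(initial, rerated, tolerance=1):
    """Return (exact agreement, adjacent agreement) as fractions.

    Adjacent agreement counts a pair as agreeing when the two ratings
    differ by at most `tolerance` scale steps.
    """
    pairs = list(zip(initial, rerated))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical initial ratings and re-ratings for 8 interviews (not ASLPI data).
toy_initial = [2, 3, 3, 4, 1, 5, 2, 4]
toy_rerated = [2, 4, 3, 4, 3, 5, 1, 4]
exact, adjacent = agreement_rates(toy_initial, toy_rerated)
print(f"exact: {exact:.1%}, adjacent: {adjacent:.1%}")
```

Reporting exact agreement alongside adjacent agreement shows how much of the agreement depends on the +/- 1 tolerance.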

ASLPI Re-Rating
  • Findings indicated that both a new panel of evaluators and the original panel of evaluators produced reliable ratings
Final Rating Reliability
Comparison                       n    Pearson r   Spearman rho
New panel (3 evaluators)         82   .89         .88
Original panel (3 evaluators)    81   .92         .90
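
To show how the two coefficients in this table differ, the sketch below computes Pearson r on raw values and Spearman rho as Pearson r on ranks, with tied values sharing their average rank. The data are hypothetical and the helper names are invented for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data with a monotone but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Spearman rho depends only on rank order, so it reaches 1 for any strictly increasing relationship, while Pearson r also requires the relationship to be linear.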