A COMPARISON OF RELIABILITY ESTIMATES FROM SINGLE AND DOUBLE ADMINISTRATIONS OF CRITERION-REFERENCED TESTS.

Item

Title: A COMPARISON OF RELIABILITY ESTIMATES FROM SINGLE AND DOUBLE ADMINISTRATIONS OF CRITERION-REFERENCED TESTS.
Identifier: AAI8319794
Identifier: 8319794
Creator: SCHAEFER, MARY MILLER.
Contributor: Alan Gross
Date: 1983
Language: English
Publisher: City University of New York.
Subject: Education, Educational Psychology
Abstract: The purpose of this study was to compare three models for determining the reliability of criterion-referenced tests. These models, coefficient kappa (κ), Huynh's estimate of κ (κ̂), and Subkoviak's coefficient of agreement (p_cs), were used to examine data from 325 students tested on two occasions with identical items. The effects of five student and test characteristics (test length, cut-off score, student ability, sample size, and heterogeneity of test content) on the resulting reliability coefficients were determined. All possible combinations of test items for each test length were examined across all analyses. Coefficient κ, considered the standard, required data from two test administrations. The other models (κ̂ and p_cs) were developed for use when only data from a single test administration are available. These criterion-referenced reliability coefficients were also compared to norm-referenced coefficients (Kuder-Richardson and test-retest). Representative mean values (± SEM) obtained from the test length analysis for κ, κ̂, and p_cs (compound binomial) were .402 ± .085, .588 ± .045, and .921 ± .033, respectively. Similar values were obtained for other analyses. The estimate κ̂ modestly overestimated κ under all conditions except where test items were heterogeneous. Values obtained for the coefficient of agreement p_cs were consistently much larger than κ, possibly because p_cs is not corrected for chance agreement. No consistent relationships between criterion-referenced and norm-referenced coefficients were observed. The data indicate that, when estimating the reliability of criterion-referenced tests, κ̂, in contrast to p_cs, serves as a reasonable estimate of reliability as determined by the standard, κ.
Type: dissertation
Source: PQT Legacy CUNY.xlsx
Degree: Ph.D.
Program: Education
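
Note on the abstract: the contrast it draws between a raw coefficient of agreement and the chance-corrected coefficient kappa can be illustrated with a minimal sketch for the two-administration case. This is not the dissertation's own computation, and it does not reproduce Huynh's or Subkoviak's single-administration estimates. The function name `mastery_agreement`, the cut-off of 8, and the ten pairs of scores are hypothetical and chosen only for illustration.

```python
def mastery_agreement(scores_1, scores_2, cutoff):
    """Return (p_o, p_chance, kappa) for mastery decisions on two administrations.

    p_o      : proportion of students classified the same way both times
    p_chance : agreement expected if the two classifications were independent
    kappa    : (p_o - p_chance) / (1 - p_chance), i.e. Cohen's kappa
    """
    if not scores_1 or len(scores_1) != len(scores_2):
        raise ValueError("need two non-empty score lists of equal length")

    n = len(scores_1)
    # A student counts as a master when the score meets the cut-off.
    masters_1 = [s >= cutoff for s in scores_1]
    masters_2 = [s >= cutoff for s in scores_2]

    # Observed agreement: same classification on both occasions.
    p_o = sum(a == b for a, b in zip(masters_1, masters_2)) / n

    # Chance agreement from the marginal mastery rates.
    p1 = sum(masters_1) / n
    p2 = sum(masters_2) / n
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)

    if p_chance == 1.0:
        # Everyone falls in the same category on both occasions;
        # kappa is undefined in that case.
        return p_o, p_chance, float("nan")

    kappa = (p_o - p_chance) / (1 - p_chance)
    return p_o, p_chance, kappa


# Hypothetical scores for ten students on a ten-item test (not study data).
first = [9, 8, 10, 7, 9, 8, 6, 9, 10, 8]
second = [9, 7, 10, 8, 9, 9, 7, 8, 10, 9]

p_o, p_chance, kappa = mastery_agreement(first, second, cutoff=8)
print(f"p_o={p_o:.3f}  p_chance={p_chance:.3f}  kappa={kappa:.3f}")
```

With these made-up scores the raw agreement is 0.80 while kappa is about 0.375. Most students are masters on both occasions, so much of the agreement would be expected by chance, which is consistent with the abstract's suggestion that an uncorrected agreement coefficient runs well above κ.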

Item sets: CUNY Legacy ETDs
