A technique for the treatment of missing data in a nonlinear regression model.
Item
-
Title
-
A technique for the treatment of missing data in a nonlinear regression model.
-
Identifier
-
AAI8821121
-
identifier
-
8821121
-
Creator
-
Shulman, Vivian Gross.
-
Contributor
-
Adviser: Alan L. Gross
-
Date
-
1988
-
Language
-
English
-
Publisher
-
City University of New York.
-
Subject
-
Psychology, Psychometrics | Education, Tests and Measurements
-
Abstract
-
A problem occurs in educational practice when an organization wishes to validate a test, x, as a predictor of a criterion variable, y, and the data sets are incomplete. Often, because subjects are selected on the basis of x, criterion (y) scores are available for some subjects but missing for others. A statistical problem of interest is to estimate the missing y scores in an attempt to infer the relationship between x and y in the total group. Regression techniques for handling this problem assume a linear regression model. The problem is exacerbated in the fairly frequent case when the regression of y on x is nonlinear in form.
The primary goal of this research was to investigate analytically the effectiveness of three regression techniques for estimating missing y scores when the underlying model was nonlinear, and specifically quadratic, in form. Method 1 simply utilized the x scores in the selected group to predict missing y scores. Method 2 utilized an auxiliary variable, z, in conjunction with x to predict missing y values. Method 3, a special case of method 2, utilized x and $x^2$ to predict missing y values. Expressions for the expected mean squared error of each of the three regression procedures were analytically derived. These expressions were compared in terms of sample size, the proportion of cases selected, the distribution of x scores, the relationship of y to x, and the relationship of z to x.
The findings of the present study indicate, first, that, as expected, where the underlying xy relationship is linear, the simplest regression method (i.e., utilizing the selected x cases alone) performs best in predicting the missing y cases. Second, where the relationship between x and y is assumed to be nonlinear, the utilization of an additional variable in conjunction with x is the method of choice in predicting missing y cases. Finally, where the range of x values is severely restricted, the performance of all three procedures is unreliable, and the performance of procedure 3 is especially poor. Recommendations for researchers and potential areas for future research are discussed.
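The three procedures compared in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the dissertation's analytic derivation: the quadratic model for y, the way z is generated, the sample size, the selection rule (y observed only for the highest x scores), and all function names are hypothetical choices made for illustration.

```python
import random


def solve_normal_equations(rows, targets):
    """Ordinary least squares via Gauss-Jordan elimination on X'X b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [u - f * v for u, v in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def missing_y_mse(design, data, selected):
    """Fit on selected cases; return the MSE of predictions for unselected cases."""
    fit_rows = [design(d) for d, s in zip(data, selected) if s]
    fit_y = [d["y"] for d, s in zip(data, selected) if s]
    beta = solve_normal_equations(fit_rows, fit_y)
    errors = [(d["y"] - sum(b * v for b, v in zip(beta, design(d)))) ** 2
              for d, s in zip(data, selected) if not s]
    return sum(errors) / len(errors)


def simulate(n=2000, proportion_selected=0.5, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Hypothetical auxiliary variable related to x (assumption).
        z = 0.5 * x + 0.5 * x * x + rng.gauss(0.0, 1.0)
        # Hypothetical quadratic regression of y on x (assumption).
        y = 1.0 + x + x * x + rng.gauss(0.0, 1.0)
        data.append({"x": x, "z": z, "y": y})
    # Selection on x: y is treated as observed only for the top-scoring cases.
    cutoff = sorted(d["x"] for d in data)[int(n * (1 - proportion_selected))]
    selected = [d["x"] >= cutoff for d in data]
    designs = {
        "method_1_x": lambda d: [1.0, d["x"]],
        "method_2_x_z": lambda d: [1.0, d["x"], d["z"]],
        "method_3_x_x2": lambda d: [1.0, d["x"], d["x"] ** 2],
    }
    return {name: missing_y_mse(design, data, selected)
            for name, design in designs.items()}


results = simulate()
for name, mse in results.items():
    print(f"{name}: {mse:.3f}")
```

Because the data are simulated, the "missing" y scores are known and the prediction error can be computed directly. Varying `proportion_selected` or the coefficients of the assumed model gives a rough feel for the factors the abstract lists, though a single simulated sample is no substitute for the expected mean squared error expressions derived in the study.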
-
Type
-
dissertation
-
Source
-
PQT Legacy CUNY.xlsx
-
degree
-
Ph.D.