THRESHOLDS, PRINCIPAL COMPONENTS AND REPARAMETERISATION

The points below summarise the main ideas of a publication examining the nature and justification of the estimation methods employed by RUMM2030 for conducting a Rasch analysis.  The paper gives a lucid presentation of the estimation concepts and of the relationship between the threshold and principal component parameterisations of the Rasch Model.

Andrich, D. & Luo, G. (2003).  Conditional Pairwise Estimation in the Rasch Model for Ordered Response Categories using Principal Components. Journal of Applied Measurement, 4(3), 205-221.

*************************

Question:

What parameterisation method is employed for item estimation in RUMM2030?

Explanation:

A reparameterised form of thresholds into their principal components is the method of estimation operationalised in RUMM2030.  This notion of principal components is used in the sense of Guttman (1950), who rearranged ordered categories into successive principal components, beginning with the usual linear one.  They are analogous to the use of orthogonal polynomials in regression where the independent variable is ordered.  The term does NOT refer to the common “principal components analysis” in which a matrix of correlation coefficients is decomposed by analogy to factor analysis.

The estimates of the principal components are obtained using items taken in pairs, and capitalise on the sufficiency property of the Rasch Model by eliminating the person parameters while estimating the item parameters.  From the estimates of all principal components, the required threshold estimates are calculated readily.  The method immediately accounts for missing data and readily generalises to the case of different numbers of categories for different items.
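The elimination of the person parameter can be illustrated for the simplest case, a pair of dichotomous items.  The sketch below is not the RUMM2030 implementation; the function names and the item locations are hypothetical and chosen only for illustration.  It shows that, given that a person scores exactly one on the pair, the probability that the score came from the first item is the same whatever the person location.

```python
import math

def rasch_prob(beta, delta):
    """Probability of a positive response in the dichotomous Rasch model."""
    return math.exp(beta - delta) / (1.0 + math.exp(beta - delta))

def pairwise_conditional(beta, delta_i, delta_j):
    """P(X_i = 1, X_j = 0 | X_i + X_j = 1) for a person located at beta."""
    p_i = rasch_prob(beta, delta_i)
    p_j = rasch_prob(beta, delta_j)
    only_i = p_i * (1.0 - p_j)      # positive on item i only
    only_j = (1.0 - p_i) * p_j      # positive on item j only
    return only_i / (only_i + only_j)

# Hypothetical item locations, for illustration only.
delta_i, delta_j = -0.5, 1.0

# The same conditional probability is obtained for very different persons.
values = [pairwise_conditional(beta, delta_i, delta_j) for beta in (-2.0, 0.0, 3.0)]

# Closed form in which the person parameter has cancelled.
closed_form = math.exp(-delta_i) / (math.exp(-delta_i) + math.exp(-delta_j))

print(values, closed_form)
```

Because the conditional probability depends only on the two item locations, responses to pairs of items carry information about the items that is free of the person parameters.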

Guttman, L. (1950). The principal components of scale analysis. In S.A. Stouffer, L. Guttman, E.A. Suchman, P.F. Lazarsfeld, S.A. Star and J.A. Clausen (Eds.), Measurement and Prediction, pp.312-361. New York: Wiley.

*************************

Question:

What is the advantage in using the Principal Components estimation procedure?

Explanation:

The key property of the Principal Components estimation algorithm is that the relevant statistic for estimating each threshold is a function of the frequencies of ALL response categories rather than only a function of the frequency of the corresponding category.  This property should enhance the stability and robustness of the estimates, especially when there might be relatively few cases in some categories for some items.  It is expected, as in the dichotomous case, that the pairwise conditional maximum likelihood estimates of the principal components of the item parameters are also consistent.

A key ingredient in the item estimation algorithm is the set of category coefficients which, along with the successive integer category counters [starting from zero and running up to one less than the number of categories, that is 0 and 1 in the dichotomous case], specify the nature of the coefficients of the principal components of the Rasch Model for ordered categories (Andersen, 1977; Andrich, 1978).  It is the category coefficients that provide the link between the Principal Component and Threshold reparameterisations of the Rasch Model.

Andersen, E. B. (1977).  Sufficient statistics and latent trait models. Psychometrika, 42, 69-81.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-574.

*************************

Question:

What is the relationship between the location and threshold estimates?

Explanation:

In the RUMM2030 algorithm for item estimation:

• the item location is distinguished from the thresholds.

• two separate constraints are used: one constrains the sum of the location estimates to be zero and the other constrains the sum of the threshold estimates to be zero.

• the threshold estimates produced by the RUMM2030 algorithm are referred to as centralised thresholds as they are mean deviated from the location estimate.

• the set of threshold estimates used in most displays within RUMM2030, especially when mapping the item estimates onto the variable or measurement line, are referred to as uncentralised thresholds as they incorporate the location estimate: each is derived by adding the location estimate to the corresponding centralised threshold.

• the mean of the set of uncentralised thresholds for an item is the location estimate for that item.
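The relationship between the two sets of thresholds described in the points above can be sketched as follows.  The function name and the numerical values are hypothetical and are not RUMM2030 output; the centralised thresholds are chosen to sum to zero within the item, as the points above imply.

```python
def uncentralise(location, centralised_thresholds):
    """Add the item location to each centralised threshold."""
    return [location + tau for tau in centralised_thresholds]

# Hypothetical estimates for one item with four ordered categories.
location = 0.40
centralised = [-1.20, 0.10, 1.10]   # mean deviated, so they sum to zero

uncentralised = uncentralise(location, centralised)

# The mean of the uncentralised thresholds recovers the item location.
mean_uncentralised = sum(uncentralised) / len(uncentralised)

print(uncentralised, mean_uncentralised)
```

Subtracting the mean of the uncentralised thresholds from each of them returns the centralised set, so either form can be recovered from the other.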

The location parameter is the first principal component of the thresholds as related to Guttman's (1950) work on principal components with ordered categories (Andrich, 1985).  This parameter is always present as a minimum of two ordered categories [the dichotomous case] must be present in any analysis.

The category coefficient of the location parameter is linear in the successive integer category counter values.

Andrich, D. (1985). An elaboration of Guttman scaling with Rasch models for measurement. In N. Brandon-Tuma (Ed.), Sociological Methodology, Chapter 2, pp. 33-80. San Francisco: Jossey-Bass.

*************************

Question:

What is the meaning of the terms spread, skewness and kurtosis as they appear in the RUMM2030 displays?

Explanation:

With at least three ordered categories it is possible to construct a second principal component (Andrich, 1985).  This component is identified in RUMM2030 as spread and:

• is half the distance between successive thresholds when the threshold distances are taken to be equal.

• has a category coefficient that is quadratic in the successive integer category counter values.

• characterises the unit of measurement for the scale under construction.
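The points above can be made concrete with a small sketch for an item whose thresholds are equally spaced.  The symbols theta (spread), m (number of thresholds) and kappa (category coefficient) and the function names are assumptions introduced here, as is the sign convention that the category coefficient for category x is minus the sum of the first x centralised thresholds.  Under those assumptions, thresholds spaced 2*theta apart give category coefficients x(m - x)*theta, which are quadratic in the category counter x.

```python
def thresholds_from_spread(theta, m):
    """Centralised thresholds for m equally spaced thresholds, 2*theta apart."""
    return [(2 * x - m - 1) * theta for x in range(1, m + 1)]

def spread_coefficients(theta, m):
    """Category coefficients x * (m - x) * theta for x = 0, ..., m."""
    return [x * (m - x) * theta for x in range(m + 1)]

# Hypothetical spread value for an item with four categories (m = 3 thresholds).
theta, m = 0.75, 3
taus = thresholds_from_spread(theta, m)
kappas = spread_coefficients(theta, m)

# Successive thresholds differ by twice the spread.
gaps = [b - a for a, b in zip(taus, taus[1:])]

# Check the assumed convention: kappa_x equals minus the sum of the first x thresholds.
cumulative = [-sum(taus[:x]) for x in range(m + 1)]

print(taus, gaps, kappas, cumulative)
```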

With at least four ordered categories it is possible to construct a third principal component (Andrich, 1985).  This component is identified in RUMM2030 as skewness and:

• identifies any deviation from equal spacing between successive thresholds.

• has a category coefficient that is cubic in the successive integer category counter values.

• characterises the skewness of the thresholds.

A fourth principal component can be derived (Pedler, 1987) if at least five ordered categories are present.  This component is identified in RUMM2030 as kurtosis and:

• has a category coefficient that is quartic in the successive integer category counter values.

• characterises the kurtosis of the thresholds.

Pedler, P. (1987).  Accounting for psychometric dependence with a class of latent trait models. Unpublished Ph.D. thesis, Department of Education, The University of Western Australia.

Once the principal component parameters have been estimated, the category coefficients are readily constructed according to the number of components and, equally readily, the thresholds can be calculated from them.
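The last step, from category coefficients to thresholds, can be sketched as below.  The sketch assumes the convention that the category coefficient for category x is minus the sum of the first x centralised thresholds, with the coefficients of the first and last categories equal to zero; the function name and the numbers are hypothetical.  Under that convention each threshold is the difference between two adjacent category coefficients.

```python
def thresholds_from_coefficients(kappas):
    """Centralised thresholds tau_x = kappa_(x-1) - kappa_x for x = 1, ..., m."""
    return [kappas[x - 1] - kappas[x] for x in range(1, len(kappas))]

# Hypothetical category coefficients for an item with four ordered categories.
kappas = [0.0, 1.3, 1.1, 0.0]
taus = thresholds_from_coefficients(kappas)

# Because the first and last coefficients are zero, the thresholds sum to zero.
threshold_sum = sum(taus)

print(taus, threshold_sum)
```

The same function applies whatever the number of principal components used to build the coefficients, since it works only on the resulting coefficient values.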

*************************

Question:

What are the sufficient statistics displayed in RUMM2030?

Explanation:

The set of sufficient statistics for each item, as displayed in RUMM2030, is derived from the respective category coefficients for each of the principal component parameters.
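As a rough illustration of how such statistics draw on every response category, the sketch below weights the category frequencies of a single item by the multiplier attached to each component.  It is an assumption-laden simplification and not a reproduction of the RUMM2030 display: it uses the marginal frequencies of one item rather than the pairwise conditional frequencies of the algorithm, it takes the location multiplier to be the category counter x and the spread multiplier to be x(m - x), and the function name and frequencies are hypothetical.

```python
def component_statistics(frequencies):
    """Frequency-weighted sums of the assumed location and spread multipliers."""
    m = len(frequencies) - 1   # maximum category counter
    location_stat = sum(x * f for x, f in enumerate(frequencies))
    spread_stat = sum(x * (m - x) * f for x, f in enumerate(frequencies))
    return location_stat, spread_stat

# Hypothetical response frequencies for categories 0, 1, 2 and 3 of one item.
freqs = [12, 30, 41, 17]
location_stat, spread_stat = component_statistics(freqs)

print(location_stat, spread_stat)
```

The spread statistic here pools the two middle categories, which shows why sparse frequencies in one category need not destabilise the estimate.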

************************* 