Binary Data Factor Analysis and Multidimensional Latent Trait/Item Response Theory (IRT) Models


Overview

Several advanced methods are available for factor analysis of binary data, including:

  1. Full-information maximum-likelihood estimation of a normal-ogive (Gaussian) multidimensional latent trait/IRT model (Bock, Gibbons & Muraki, 1988).

  2. Factor analysis of the tetrachoric correlations between all item pairs (Knol & Berger, 1991).

  3. The LISCOMP method (Muthen, 1978).

  4. Nonlinear factor analysis (McDonald, 1982).

(These do not include methods based on logistic-ogive or Rasch models, with which I am less familiar.)

Methods 1--3 are theoretically similar; all assume (a) the dichotomous manifest variables are discretized versions of latent continuous variables; and (b) the underlying continuous variables have a multivariate normal distribution. I don't know much about Method 4, but it appears related to the other three methods and, if so, might be expected to produce similar results.
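
Assumptions (a) and (b) can be illustrated with a small simulation. The following is only a sketch of a one-factor special case, not any of the published estimation methods; the function name, loadings, and thresholds are hypothetical values chosen for illustration.

```python
import math
import random
from statistics import NormalDist


def simulate_binary_items(n_cases, loadings, thresholds, seed=0):
    """Simulate binary items from a one-factor latent-normal threshold model.

    Hypothetical helper: each latent response has unit variance and is
    cut at a threshold to give the observed 0/1 item.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_cases):
        theta = rng.gauss(0.0, 1.0)  # common latent trait
        row = []
        for lam, tau in zip(loadings, thresholds):
            # Latent continuous variable: loading * trait + unique part.
            y = lam * theta + math.sqrt(1.0 - lam * lam) * rng.gauss(0.0, 1.0)
            # Discretize: the manifest item is 1 when y exceeds the threshold.
            row.append(1 if y > tau else 0)
        data.append(row)
    return data


loadings = [0.8, 0.7, 0.6]     # illustrative values, not from any study
thresholds = [-0.5, 0.0, 0.5]  # illustrative values, not from any study
data = simulate_binary_items(20000, loadings, thresholds, seed=1)

for j, tau in enumerate(thresholds):
    observed = sum(row[j] for row in data) / len(data)
    expected = 1.0 - NormalDist().cdf(tau)  # P(y > tau) for a standard normal y
    print(f"item {j}: observed P(x=1) = {observed:.3f}, expected = {expected:.3f}")
```

Under these assumptions each item's proportion of 1s should approach one minus the normal CDF at its threshold, which the printed comparison lets one check.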

Knol and Berger (1991; see also Parry & McArdle, 1991) compared these methods and found that factoring tetrachoric correlations worked about as well as the other methods. This is helpful because commonly available software can do the job: PRELIS (distributed with LISREL), for example, can calculate a matrix of tetrachoric correlations, and SAS PROC FACTOR can then be used to factor the matrix. A full explanation of this method, including examples, is given on a separate page.
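
To make the first step concrete, here is a minimal sketch of a tetrachoric correlation for a single 2x2 table. It is not the algorithm used by PRELIS or any other program named here; it simply sets thresholds from the margins and solves for the correlation that reproduces the observed (0, 0) cell, using a bivariate normal CDF obtained by numerical integration. All function names are hypothetical, and the table must have non-empty margins.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def bvn_density(h, k, r):
    """Standard bivariate normal density at (h, k) with correlation r."""
    d = 1.0 - r * r
    expo = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * d)
    return math.exp(expo) / (2.0 * math.pi * math.sqrt(d))


def bvn_cdf(h, k, rho, steps=200):
    """P(X <= h, Y <= k), using Phi(h)*Phi(k) plus the integral of the
    bivariate density over the correlation from 0 to rho (Simpson's rule)."""
    base = _STD.cdf(h) * _STD.cdf(k)
    if rho == 0.0:
        return base
    width = rho / steps  # steps must be even for Simpson's rule
    total = bvn_density(h, k, 0.0) + bvn_density(h, k, rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * bvn_density(h, k, i * width)
    return base + total * width / 3.0


def tetrachoric(table, tol=1e-8):
    """Tetrachoric correlation for a 2x2 table [[n00, n01], [n10, n11]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    h = _STD.inv_cdf((a + b) / n)  # threshold from P(row variable = 0)
    k = _STD.inv_cdf((a + c) / n)  # threshold from P(column variable = 0)
    target = a / n                 # observed P(both variables = 0)
    lo, hi = -0.999, 0.999
    # The CDF increases with the correlation, so bisection is sufficient.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(tetrachoric([[200, 100], [100, 200]]), 3))  # approximately 0.5
```

Applying such a function to every item pair gives the correlation matrix that is then passed to a factoring routine.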

Besides the methods above, Uebersax (1993) described another approach: first one performs a latent class analysis of the data; then one locates the latent classes in a multidimensional space. This is potentially useful when (a) the assumption of latent multivariate normality is inappropriate; or (b) one wishes to consider the group (latent class) structure of cases as well as data dimensionality.

Recommended Readings

The book by Bartholomew is very helpful; it devotes two chapters to the subject and is perhaps the best summary available. The Knol and Berger paper (and the similar paper by Parry and McArdle) gives a good empirical comparison of different methods. The Takane and de Leeuw paper, which is more technical and not for every reader, rigorously examines the relationships between different approaches.

    Bartholomew DJ. Latent variable models and factor analysis. New York: Oxford University Press, 1987.

    Bock RD, Gibbons R, Muraki E. Full-information item factor analysis. Applied Psychological Measurement, 1988, 12, 261-280.

    Knol DL, Berger MP. Empirical comparison between factor analysis and multidimensional item response models. Multivariate Behavioral Research, 1991, 26, 457-477.

    Muthen B. Contributions to factor analysis of dichotomized variables. Psychometrika, 1978, 43, 551-560.

    Parry CD, McArdle JJ. An applied comparison of methods for least-squares factor analysis of dichotomous variables. Applied Psychological Measurement, 1991, 15, 35-46.

    Takane Y, de Leeuw J. On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 1987, 52, 393-408.

Software

Following are programs I know of for factor analysis of binary data and/or multidimensional latent trait modeling.


      TESTFACT (D. T. Wilson, R. Wood, R. D. Gibbons)

      Available from:

      • Assessment Systems Corporation
      • Scientific Software International
      • ProGAMMA (Netherlands)

      (see end of this section for distributor contact information)

      With TESTFACT, the user can choose either factoring of tetrachoric correlations or full-information maximum-likelihood estimation. TESTFACT will also calculate factor scores.

      The ProGAMMA site lists the latest version (TESTFACT 3); the other distributors listed above probably carry it as well.

      For an online description, check the ProGAMMA web site or the Assessment Systems Corporation web site.


      MicroFACT (Niels G. Waller)

      Available from:

      • Assessment Systems Corporation
      • ProGAMMA (Netherlands)

      MicroFACT appears to work by factoring tetrachoric correlations. For an online description, check the ProGAMMA web site or the Assessment Systems Corporation web site.


      Mplus (Bengt and Linda Muthen)

      Available from:

      • Muthen & Muthen

      This possibly replaces the earlier program, LISCOMP, which estimates the dichotomous/polytomous data factor analysis models described by B. Muthen. (Mplus estimates a wide range of other latent variable models as well.)


      NOHARM (Colin Fraser)

      NOHARM (Fraser, 198?) estimates unidimensional and multidimensional latent trait (IRT) models. For more information, one might check with Jack McArdle at jjm@virginia.edu; he used to have the program available by ftp.


      PRELIS (Karl Joreskog and Dag Sorbom)

      Available from:

      • Scientific Software International
      • Assessment Systems Corporation
      • ProGAMMA (Netherlands)

      PRELIS will calculate tetrachoric and polychoric correlations. These can be output and factor-analyzed to estimate a unidimensional or multidimensional latent trait/IRT model. PRELIS is usually supplied along with LISREL and is widely available. (See below for distributor contact information.)
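
      The factoring step described above can be sketched as follows. This is not what PRELIS, LISREL, or SAS PROC FACTOR actually runs; it is a simplified one-factor iterated principal-axis routine in plain Python, assuming positively correlated items. The function name and the example matrix (built from illustrative loadings 0.8, 0.7, 0.6) are hypothetical.

```python
import math


def first_factor_loadings(corr, n_iter=200, tol=1e-10):
    """One-factor loadings by iterated principal-axis factoring.

    Hypothetical helper; assumes a symmetric correlation matrix whose
    items are positively correlated.
    """
    p = len(corr)
    # Start each communality at the item's largest absolute correlation.
    comm = [max(abs(corr[i][j]) for j in range(p) if j != i) for i in range(p)]
    loadings = [0.0] * p
    for _ in range(n_iter):
        # Reduced matrix: communalities replace the unit diagonal.
        reduced = [[comm[i] if i == j else corr[i][j] for j in range(p)]
                   for i in range(p)]
        # Power iteration for the dominant eigenvalue and eigenvector.
        vec = [1.0 / math.sqrt(p)] * p
        eigval = 0.0
        for _ in range(1000):
            prod = [sum(reduced[i][j] * vec[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in prod))
            new_vec = [x / norm for x in prod]
            delta = max(abs(x - y) for x, y in zip(new_vec, vec))
            vec, eigval = new_vec, norm
            if delta < 1e-12:
                break
        # Loadings are the eigenvector scaled by the root of the eigenvalue.
        new_loadings = [math.sqrt(eigval) * v for v in vec]
        change = max(abs(x - y) for x, y in zip(new_loadings, loadings))
        loadings = new_loadings
        comm = [l * l for l in loadings]  # updated communalities
        if change < tol:
            break
    return loadings


# Correlations implied by illustrative loadings 0.8, 0.7, 0.6.
corr = [
    [1.00, 0.56, 0.48],
    [0.56, 1.00, 0.42],
    [0.48, 0.42, 1.00],
]
# Expected to be close to the loadings used to build the matrix.
print([round(l, 3) for l in first_factor_loadings(corr)])
```

      In practice the input would be a matrix of tetrachoric correlations, and a multidimensional solution would need more factors plus a rotation step, which this sketch omits.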


Software distributor contact information:

      Assessment Systems Corporation
      2233 University Ave, Suite 200
      St. Paul, MN 55114
      United States
      Tel: (651) 647-9220
      Fax: (651) 647-0412
      Web: www.assess.com
      Email: info@assess.com

      Muthen & Muthen
      11965 Venice Blvd, Suite 407
      Los Angeles, CA 90066
      United States
      Tel: (310) 391-9971, Toll Free (888) 814-9144
      Fax: (310) 391-8971
      Web: www.statmodel.com
      Email: sales@statmodel.com

      ProGAMMA bv
      PO Box 841 (mailing address?)
      9700 AV Groningen
      Grote Rozenstraat 15 (street address?)
      9712 TG Groningen
      Tel: +31 50 3636900
      Fax: +31 50 3636687
      Web: www.gamma.rug.nl
      Email: gamma.post@gamma.rug.nl

      Scientific Software International
      7383 N Lincoln Ave, Suite 100
      Lincolnwood, IL 60712-1704
      United States
      Tel: (800) 247-6113 or (847) 675-0720
      Fax: (847) 675-2140
      Web: www.ssicentral.com
      Email: sales@ssicentral.com


Bibliography

    Bartholomew, D. J. Factor analysis for categorical data (with discussion). J Royal Statist Soc, B. 1980, 42, 293-321.

    Bartholomew, D. J. Latent variable models for ordered categorical data. Journal of Econometrics, 1983, 22, 229-243.

    Bartholomew, D. J. Latent variable models and factor analysis. New York: Oxford University Press, 1987.

    Bock, R. D., and Aitkin, M. Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 1981, 46, 443-459.

    Bock, R. D., Gibbons, R., and Muraki, E. Full-information item factor analysis. Applied Psychological Measurement, 1988, 12, 261-280.

    Christoffersson, A. Factor analysis of dichotomized variables. Psychometrika, 1975, 40, 5-32.

    Fraser, C. (198?). NOHARM II: A FORTRAN program for fitting unidimensional and multidimensional normal ogive models of latent trait theory. Center for Behavioral Studies, the University of New England, Armidale, NSW, Australia.

    Knol, D. L., and Berger, M. P. Empirical comparison between factor analysis and multidimensional item response models. Multivariate Behavioral Research, 1991, 26, 457-477.

    McDonald, R. P. Linear versus non-linear models in item response theory. Applied Psychological Measurement, 1982, 6, 379-396.

    McDonald, R. P. Unidimensional and multidimensional models for item response theory. In D. J. Weiss (Ed.), Proceedings of the 1982 Item Response Theory and Computerized Adaptive Testing Conference. Minneapolis: University of Minnesota, 1985.

    McDonald RP. (incomplete reference: author wrote a book circa 1980's on the subject of latent trait/item response models and binary data factor analysis).

    Mislevy, R. J. Recent developments in the factor analysis of categorical variables. Journal of Educational Statistics, 1986, 11, 3-31.

    Muthen, B. Contributions to factor analysis of dichotomized variables. Psychometrika, 1978, 43, 551-560.

    Muthen, B. A structural probit model with latent variables. Journal of the American Statistical Association, 1979, 74, 807-811.

    Muthen, B., and Christoffersson, A. Simultaneous factor analysis of dichotomous variables in several groups. Psychometrika, 1981, 46, 407-419.

    Muthen, B. A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 1984, 49, 115-132.

    Muthen, B. Dichotomous factor analysis of symptom data. Sociological Methods and Research, 1989, 18, 19-65.

    Parry, C. D., and McArdle, J. J. An applied comparison of methods for least-squares factor analysis of dichotomous variables. Applied Psychological Measurement, 1991, 15, 35-46.

    Takane, Y., and de Leeuw, J. On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 1987, 52, 393-408.

    Uebersax, J. S. On the dimensionality of a latent class analysis solution. Paper presented at the annual meeting of the Classification Society of North America, Pittsburgh, PA, 1993.



Revised: 8 July 2000

(c) 2000-2007 John Uebersax PhD