Abstract
We present a class of finite mixture multilevel multidimensional ordinal IRT models for large-scale cross-cultural research. The model is intended for confirmatory research settings. The prior for the item parameters is a mixture distribution, which accommodates situations where different groups of countries have different measurement operations while countries within each group are still allowed to be heterogeneous. A simulation study shows that all parameters can be recovered. We also apply the model to real data on the two components of affective subjective well-being, positive affect and negative affect, and study the psychometric behavior of these two scales in 28 countries across four continents.
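The idea of a mixture prior on item parameters can be illustrated with a minimal simulation sketch. It is not the authors' specification: the two latent classes, the class means, the log-normal within-class spread, the logistic link, and the function names are all assumptions made here for illustration. Each country is first assigned to a latent class, and its item discrimination is then drawn around that class's mean, so countries in the same class remain heterogeneous.

```python
import math
import random


def sample_country_item_params(n_countries, class_probs, class_means, within_sd, rng):
    """Draw one discrimination parameter per country from a finite mixture prior.

    Hypothetical helper: a country is assigned to a latent class, then its
    parameter is drawn around the class mean, so countries within a class differ.
    """
    params = []
    for _ in range(n_countries):
        g = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        # Log-normal draw keeps the discrimination positive (assumption of this sketch).
        a = math.exp(rng.gauss(math.log(class_means[g]), within_sd))
        params.append((g, a))
    return params


def graded_response_probs(theta, a, thresholds):
    """Category probabilities for one ordinal item under a logistic graded response model.

    `thresholds` must be strictly increasing; K-1 thresholds give K categories.
    """
    # Cumulative probabilities P(Y >= k) for k = 1..K-1.
    cum = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    upper = [1.0] + cum
    lower = cum + [0.0]
    # Adjacent differences of the cumulative curve give the category probabilities.
    return [u - l for u, l in zip(upper, lower)]


rng = random.Random(0)
# 28 countries, two equally likely latent classes with illustrative class means.
countries = sample_country_item_params(28, [0.5, 0.5], [0.8, 1.6], 0.1, rng)
# Response probabilities for an average respondent in the first country.
probs = graded_response_probs(theta=0.0, a=countries[0][1], thresholds=[-1.0, 0.0, 1.0])
```

In this sketch the class means control between-group differences in measurement, while `within_sd` controls how much countries in the same group are allowed to vary.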
Additional information
We thank AiMark for providing the data, and Roger Millsap, Bengt Muthén, and the anonymous reviewers for extremely valuable comments.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License ( https://creativecommons.org/licenses/by-nc/2.0 ), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
de Jong, M.G., Steenkamp, J.-B.E.M. Finite Mixture Multilevel Multidimensional Ordinal IRT Models for Large Scale Cross-Cultural Research. Psychometrika 75, 3–32 (2010). https://doi.org/10.1007/s11336-009-9134-z
DOI: https://doi.org/10.1007/s11336-009-9134-z