
Introductory Chapter: Psychometrics

Written By

Sandro Misciagna

Published: 23 August 2023

DOI: 10.5772/intechopen.111830

From the Edited Volume

Psychometrics - New Insights in the Diagnosis of Mental Disorders

Edited by Sandro Misciagna


1. Introduction

Psychometrics is a scientific discipline concerned with the theories and construction of models for the measurement of psychological data. These models try to establish how psychological latent constructs, such as human intelligence, psychological abilities, or mental disorders, can be measured through the use of psychological tests, genetic profiles, or neuroscientific information [1]. This problem is commonly approached by building mathematical measurement models in which latent variables act as a common determinant of a set of observable variables [2]. Latent variable models represent the construct of interest as a latent variable that is the common determinant of a set of scores. Psychometrics involves the formalization of psychological theories and the design of psychological assessment instruments, including surveys, scales, and open or closed questionnaires [1]. Psychometricians are usually psychologists with advanced training in psychometrics and measurement theory. However, psychometrics is a highly interdisciplinary field, with connections to statistical modeling, data theory, econometrics, biometrics, measurement theory, and mathematical psychology.


2. Brief history of psychometrics

The birth of psychometrics is generally situated at the end of the nineteenth century, when Sir Francis Galton created an anthropometric laboratory in 1884 to determine psychological attributes experimentally [3]. Among the first constructs of interest, he proposed to measure keenness of sight, color sense, and judgment of the eye. Galton, often referred to as the father of psychometrics, attempted to measure such attributes using a vast variety of tasks, recording performance accuracy as well as reaction times. In his book entitled "Hereditary Genius" he described different characteristics that people possess regarding sensory and motor functions, such as visual acuity, physical strength, or reaction times. Among his anthropometric measures he also included mental abilities that could be assessed through mental tests.

Francis Galton was probably inspired by Charles Darwin, who in 1859 published the book "On the Origin of Species by Means of Natural Selection" [4]. In this book, Darwin described the role of natural selection in the emergence, over time, of different populations of plants and animals. According to his theory, individuals with more adaptive characteristics were more likely to survive and reproduce in certain environments, while individuals with less adaptive characteristics were more likely to die out.

Another pioneer in the field of psychometrics was James McKeen Cattell, who coined the term "mental tests" and was responsible for the research that led to the development of modern tests [5].

At the same time that Darwin, Galton, and Cattell were making their discoveries, the German educator Johann Friedrich Herbart was interested in uncovering the mysteries of human consciousness, and he created the first mathematical models of the mind [5]. Inspired by his work, the German physiologist Ernst Heinrich Weber tried to demonstrate the existence of a psychological threshold, arguing that a minimum stimulus was necessary to activate a sensory system. After Weber, the German psychologist Gustav Theodor Fechner devised the law that the strength of a sensation grows as the logarithm of the stimulus intensity.

During the early twentieth century, interest in measuring human qualities intensified greatly when the US implemented programs to select soldiers using tests that measured a range of abilities relevant to military performance. Such tests produced a great deal of data, which led to questions that inspired the birth of psychometric theory as we currently know it, concerning in particular the analysis of psychological test data, the properties of psychological tests, and the selection of the tests best suited for a certain purpose. Almost immediately, two important properties of tests were identified [6].

The first property of a test concerns the notion of reliability, that is, the question of whether a test produces consistent scores when applied in the same circumstances. One of the first scientists to take an interest in this topic was the psychologist and statistician Charles Edward Spearman, who in 1904 wrote an article about the theory of measurement reliability [7]. A reliable measure is consistent across time, individuals, and situations; this is the question of generalization from test to test, from examiner to examiner, from situation to situation, or from testing time to testing time [8]. For example, it regards the question of whether an intelligence test produces the same intelligence quotient score when administered to people with the same level of intelligence. In this article, Spearman developed most of the basic statistics of reliability, including the correction for attenuation, the standard error of measurement, the split-half correction, the adjustment of the reliability coefficient for test length, and other statistics that are identified with test reliability [9]. In his classic book "Theory of Mental Tests," written in 1950, Harold Gulliksen extended the simple mathematical models for reliability developed by Spearman and provided an extensive mathematization of reliability based on the concept of parallel tests [10].

The second property of a test concerns the notion of validity, that is, the question of whether a test measures what it is intended to measure. For example, it regards the question of whether an intelligence test actually measures intelligence. There are three main types of validity on which the worth of psychological tests is judged: predictive validity, content validity, and construct validity. In 1954, well-known experts on test development stated that the predictive validity of a test is its correlation with a criterion [11]. Content validity is a demonstration that the items of a test do an adequate job of covering the domain being measured; for example, many tests of this kind were developed in the 1950s in the civil service, the military, industry, and schools at all levels of education [11]. Construct validity is related to measures of other constructs as required by theory: the validity of a test cannot be determined by its correlation with a single criterion, but requires numerous relationships with the variables to which it logically relates [12]. Other forms of validity are criterion-related validity, which refers to the extent to which a test predicts a sample of behavior, and concurrent validity, when the criterion measure is collected at the same time as the test being validated.

The concepts of reliability and validity are still today among the most essential elements in the evaluation of the quality of any psychological test. Reliability is necessary but not sufficient for validity. Furthermore, the definition of reliability is widely accepted, while the definition of validity is widely contested [13]. The development of reliability theory culminated in the second half of the twentieth century with the work of Lord and Novick, who in 1968 presented the currently accepted definition of reliability [14]. According to their definition, reliability is a signal-to-noise ratio, or more precisely the ratio of true score variance to observed score variance. This concept was conceptualized somewhat differently in different theoretical frameworks: according to the latent variable theory of Mellenbergh formulated in 1994, it is a measure of precision [15], while according to the generalizability theory of Cronbach formulated in 1972, it is a measure of generalizability [16]. However, Lord and Novick's definition typically follows as a special case, which indicates the consistency of the general psychometric framework [14].
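The ratio definition can be illustrated with a small simulation, sketched here in Python (the sample size and variances are illustrative, not taken from the chapter): observed scores are generated as true scores plus independent measurement error, and reliability is estimated as the ratio of the two variances.

```python
import random
from statistics import pvariance

random.seed(42)

# Simulate true scores with unit variance and add independent
# measurement error, also with unit variance; with equal variances,
# the reliability should come out close to 0.5.
true_scores = [random.gauss(0, 1) for _ in range(20000)]
observed = [t + random.gauss(0, 1) for t in true_scores]

# Lord and Novick's definition: true score variance / observed score variance
reliability = pvariance(true_scores) / pvariance(observed)
print(round(reliability, 2))
```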

Reliability can be estimated in various ways, such as from the correlation between two test halves, from the average correlation between test items, or from the correlation between two administrations of the same test at different times [17]. Some discussions concern how to optimally estimate reliability, or which coefficients should be preferred in a particular context. For example, consistency over repeated administrations of the same test can be assessed with the Pearson correlation coefficient, which is often called test-retest reliability. Internal consistency may be assessed by correlating performances on two halves of a test, which is called split-half reliability. One of the most commonly used indexes of reliability is Cronbach's α, which is equivalent to the mean of all possible split-half coefficients [18]. Coefficient alpha is a correlation between an existing test and a hypothetical test, under the assumptions that (1) the average correlation among items in each of the two tests is the same and (2) the average correlation between items on the two tests is the same as the average correlation among items within each of the tests.
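As a sketch of how coefficient alpha is computed in practice, the standard formula α = k/(k−1) · (1 − Σ item variances / total-score variance) can be written in a few lines of Python (the persons-by-items score matrix below is a hypothetical toy example):

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a persons x items score matrix."""
    k = len(scores[0])                                   # number of items
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])  # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# toy data: five examinees answering four items scored 0/1
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(data), 2))   # prints 0.8
```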

Other approaches include the intra-class correlation, which is the proportion of the total variance in the measurements that is attributable to differences between targets.
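A minimal sketch of one common form, the one-way random-effects ICC(1,1), assuming a complete targets-by-raters table (the toy data are hypothetical):

```python
def icc_oneway(ratings):
    """One-way random-effects intraclass correlation, ICC(1,1).
    ratings: list of targets, each a list of k ratings."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    means = [sum(r) / k for r in ratings]
    # mean square between targets and mean square within targets
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - means[i]) ** 2
              for i, r in enumerate(ratings) for x in r) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# two raters in perfect agreement across three targets -> ICC = 1
print(icc_oneway([[1, 1], [2, 2], [3, 3]]))   # prints 1.0
```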

As regards the validity of a test, there are no widely accepted methods to determine whether a test is valid or to estimate the degree of validity of a psychometric test [19]. The question of whether a test measures what it aims to measure arises from questions about the nature of psychological constructs themselves. One question on the validity of a test concerns whether all psychological constructs can be measured in the same way or whether different ways of measurement should be invoked [20]. Another question concerns whether psychological constructs admit a realist interpretation or are merely summaries of data [21]. Yet another question concerns whether we can talk about the validity of tests at all, or whether we should talk about the validity of interpretations of test scores [13]. These questions demonstrate that the validity of a test is one of the most problematic psychometric concepts.

Psychometric theory and practice have developed greatly even without a definitive answer to these questions. As psychometrics spread as a psychological science, it drew inspiration from mathematical and statistical concepts for the development of measurement models of psychological data, thereby becoming a largely technical discipline. Generally, such models contain a psychological construct to be measured, represented as a latent variable that determines the expectations of the observed scores. Statistical models have contributed largely to the development of psychometric theories, such as the modern test theory of Rasch developed in 1960 [22], the classical test theory of Lord and Novick developed in 1968 [14], the latent class analysis of Lazarsfeld and Henry developed in 1968 [23], the congeneric model of Jöreskog developed in 1971 [24], and the nonparametric item response model of Mokken developed in 1971 [25]. After the theorization of these models, one of the main topics of psychometric research became the development of software to fit and estimate them, including estimation algorithms [26], software for test analysis [27], and general latent variable modeling [28]. These developments took place in the last three decades of the twentieth century.

In 2014, the American Psychological Association (APA), the American Educational Research Association (AERA), and the National Council on Measurement in Education (NCME) published a revision of the book "Standards for Educational and Psychological Testing" for the development, evaluation, and use of psychological tests [29]. This book covers topics such as test validity, reliability, errors of measurement, test design, use of scales, score linking, how to establish cut-off scores, test administration, testing applications, and the interpretation of psychometric tests.


3. Main concepts of the psychometric theory

Psychometric models relate a latent structure to a set of observed variables by mapping positions in the latent structure to distributions of the observed variables. This is usually done by specifying a conditional distribution function of the observables given the latent structure. Thus, the general framework consists of a simultaneous regression of observed variables on a latent variable or a set of latent variables.

Three principal modeling choices derive from this idea: (1) the form of the latent structure, which may be a continuum or a set of latent classes [30]; (2) the form of the regression function, which may be, for instance, a step function or a logistic function; and (3) the distribution or density appropriate to the observations, such as a binomial distribution or a normal density.

According to the linear common factor model, the latent structure is a unidimensional continuum, the regression function is linear, and the observables follow a normal density [24]. In the two-parameter logistic model, the latent structure is a unidimensional continuum, the regression function is logistic, and the observables follow a binomial distribution [31]. Finally, in the latent class model, the latent structure is categorical and the observed variables are binary [23]. The latent structure is a representation of the construct to be measured, such as intelligence, while the observed scores are typically concrete behavioral responses, such as the item responses used to determine IQ in an intelligence test. Consequently, the psychometric model coordinates the correspondence between the observational and the theoretical terms, creating a measurement model [32]. This means that the psychometric model is a measurement model in the sense that it coordinates theory with observations, not in the sense that human behavior can be successfully analyzed in terms of quantitative laws.
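The logistic regression function of the two-parameter model can be sketched in a few lines of Python (the parameter names a and b, for discrimination and difficulty, follow the usual IRT convention; the values are illustrative):

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic model: probability of a correct response
    given latent trait theta, item discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# an examinee whose latent trait equals the item difficulty
# answers correctly with probability 0.5
print(icc_2pl(0.0, 1.2, 0.0))   # prints 0.5
```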

Reliability is related to the psychometric model through the concept of measurement precision, which is inversely related to the variance of the observed scores [15]. Therefore, the higher the variance of the conditional distribution of the observables, the lower the measurement precision. Measurement precision need not be identical at different positions of the latent structure.

In the linear common factor model, measurement precision is identical for all values of the latent variable. In the Rasch model, measurement precision is highest at the latent position where the logistic regression of the observable has its inflection point [22]. The reliability of test scores, as theorized in the classical test theory of Lord and Novick, is an unconditional index of measurement precision [14].

A subfield of psychometrics that plays an important role in the analysis of educational tests is item response theory [33], in which the observed variables are the responses to test items, such as the items in an IQ test. In item response theory, the latent function that specifies the regression of an observed variable is known as the item characteristic curve (ICC). Generally, item response theory assumes models with a unidimensional and continuous latent structure. In educational testing, items are typically scored dichotomously (0: incorrect and 1: correct), and the item characteristic curve is bounded from above and below and is often modeled using a nonlinear function. The slope of the item characteristic curve, at a given point on the latent scale, is proportional to the ability of the item to discriminate between positions above and below that point and determines the amount of item information at that point. The plot of item information against the latent variable is the item information function (IIF). The item information function can be used in psychometrics to guide the selection of items.
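For the two-parameter logistic item mentioned above, the item information function takes the closed form I(θ) = a² · P(θ) · (1 − P(θ)), which peaks where θ equals the item difficulty b. A minimal sketch (parameter values are illustrative):

```python
import math

def p_2pl(theta, a, b):
    """2PL item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: I(theta) = a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# information is maximal at theta = b and falls off on either side
print(item_information(0.0, 1.5, 0.0))   # peak: 1.5**2 * 0.25 = 0.5625
print(item_information(2.0, 1.5, 0.0))   # smaller, far from the difficulty
```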

This idea is the basis of adaptive testing, which is becoming increasingly important with the advent of computerized test administration [34]. In adaptive testing, items are administered sequentially and are selected for administration adaptively. During item administration, examinee ability is estimated on the basis of the item responses given so far. The next item to be administered is then chosen on the basis of the value of the item information function at the estimated examinee ability. In this way, tests can be shortened without compromising reliability.
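The item-selection step can be sketched as follows, assuming 2PL items and a hypothetical item bank of (discrimination, difficulty) pairs; real adaptive testing systems add ability estimation, exposure control, and stopping rules on top of this:

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def next_item(theta_hat, bank, answered):
    """Pick the not-yet-administered item that is most informative
    at the current ability estimate theta_hat."""
    candidates = [i for i in range(len(bank)) if i not in answered]
    return max(candidates, key=lambda i: item_information(theta_hat, *bank[i]))

# hypothetical bank of (discrimination, difficulty) pairs
bank = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
# item 1 was already administered; at theta_hat = 0.5 the item with
# difficulty 2.0 is the most informative of the remaining items
print(next_item(0.5, bank, answered={1}))   # prints 2
```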

One of the features often required of psychometric tests is measurement invariance [35]. This is especially important when psychometric tests are used to select individuals based on their aptitudes, such as in student placement or in job selection. Generally, these selection processes operate on populations that are heterogeneous with respect to background variables such as age, sex, education, and ethnicity. In these cases, the psychometric test should function in the same way across different subpopulations and should not produce bias in the test scores of a specific group. Such bias could arise when an intelligence test contains questions that are easier for subpopulations with a specific background regardless of their intelligence level; for example, a test containing general knowledge questions could be more difficult for ethnic minorities for reasons independent of their level of intelligence.

An alternative to the latent variable model in the psychometric literature is the multidimensional scaling model [36]. Multidimensional scaling is a method for finding a simple, low-dimensional representation of data, and it is used as a psychometric tool to infer the number of underlying dimensions in proximity data. An example is given by the degree to which different facial expressions are judged to be similar. Metric multidimensional scaling is applied when similarity measures are continuous [37], while nonmetric multidimensional scaling is applied when similarity measures are ordinal [38]. In the individual differences multidimensional scaling model, the underlying dimensions are weighted differently across subjects [39]; in this way, each subject receives a different weight for each dimension. An important instance of multidimensional scaling with individual differences is unfolding analysis, in which each subject is assumed to have an ideal point on the dimension underlying the preference data [40]. A stimulus is preferred when it is close to the subject's ideal point.
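The ideal-point idea behind unfolding can be sketched in one function: given a subject's ideal point and hypothetical stimulus positions on a single latent dimension, the predicted preference order simply ranks stimuli by their distance from the ideal point.

```python
def preference_order(ideal, stimuli):
    """Unfolding sketch: a subject is assumed to prefer stimuli whose
    positions on the latent dimension are closest to the ideal point."""
    return sorted(stimuli, key=lambda s: abs(s - ideal))

# hypothetical positions of four stimuli on one latent dimension;
# a subject with ideal point 2.0 prefers the nearby stimuli first
print(preference_order(2.0, [0.0, 1.0, 3.0, 5.0]))   # prints [1.0, 3.0, 0.0, 5.0]
```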

Other alternative psychometric models that make a connection between theory and data include: (1) taking the construct to be a universe of behaviors, described by Cronbach in 1972 [16]; (2) representing the construct as a common effect of the observed variables, described by Bollen and Lennox in 1991 [41]; and (3) interpreting the construct as a causal system in which the observables influence each other, described by Cramer in 2010 [42].

Recent advances in psychometrics have focused attention on models that deal with more complex situations. Some extensions include the incorporation and development of multilevel and random effects structures in item response theory models [43], factor models [44], and latent class models [45]. In these models, item parameters may become random variables, and psychometric analysis of such data is common in large-scale assessments.

Another recent innovation is the use of computerized tests that make response times available in addition to ordinary responses. This enables more refined estimation techniques and new models of assessment, especially multidimensional item response theory models [46], factor analysis of categorical and nonnormal data [47], cluster analysis, discriminant analysis, cognitive diagnosis models [48], and nonlinear factor models [49]. Factor analysis [50] is a method for determining the underlying dimensions of data, for which there is still no consensus on the appropriate procedures for determining the number of latent factors [51]. Cluster analysis is a psychometric approach for finding objects that are similar to each other. Multidimensional scaling, factor analysis, and cluster analysis are all multivariate descriptive methods used to distill simpler structures from large amounts of data. Like factor analysis, discriminant analysis is a multivariate descriptive method; a multiple discriminant function can be viewed as a special type of factoring in which the factors are obtained to optimally discriminate among two or more groups of people on the basis of the scores from a set of tests [52]. For example, in a classic paper written in 1952, Tiedeman and colleagues demonstrated discriminant analysis with the Airman Classification Battery applied to the problem of assigning air force personnel to eight occupational specialties [53]. However, the models underlying the discriminant function are more appropriate for noncognitive variables (such as personality) than for cognitive variables. From a conceptual point of view, the recent psychometric literature has focused on the status of psychometric measurement models, the relation between psychometrics and psychology, and the usefulness of Cronbach's alpha as a measure of reliability [54].

The use of open-source statistical software programs has also enabled psychometricians to develop their own models and share these with other researchers.


4. Psychometric research

4.1 From human sciences to artificial machines

Psychometric experiments studying individual differences are mainly concerned with correlations among the responses of the same group of subjects to different sources of response elicitation.

As argued by Jum Nunnally, there are three overlapping topics in psychometric methods [11]. The first is mainly deductive analysis, concerning multidimensional scaling, factor analysis, and item analysis. Most deductive models are expressed as mathematical models or have mathematical implications. Deductive models in psychometrics have always been closely wedded to basic research on empirical individual differences with respect to achievements, abilities, personality, and other types of human traits [11].

The second is the application of mathematical methods to basic research on individual differences; examples are psychometric studies that try to determine the structure of specific human abilities [55].

The third concerns the measurement of individual differences in applied settings such as schools, government, the military, industry, and other institutions. Application in these settings depends both on the use of deductive models and on basic research on human traits [11].

The first psychometric instruments were designed to measure human psychological traits, abilities, and characteristics of personality. The historical approach was developed by Alfred Binet and Theodore Simon, whose intelligence scale later evolved into the Stanford-Binet IQ test [56]. Subsequently, these tests were revised, and other important new tests were developed, such as the Wechsler Adult Intelligence Scale (WAIS) and the Wechsler Intelligence Scale for Children (WISC). Another focus of psychometrics is personality testing, although there is still no widely accepted way of measuring personality, since the theoretical construct of personality is a complex idea. Some of the best-known personality instruments include the Minnesota Multiphasic Personality Inventory (MMPI), the Big Five Inventory, the Rorschach inkblot test, the neurotic personality questionnaire (KON-2006) [57], Eysenck's personality questionnaire (EPQ-R), the Personality and Preference Inventory, and the Myers-Briggs Type Indicator. Numerous personality test batteries grew out of previous findings from factor analysis; in other cases, factor analysis served to construct the subtests of a battery with a small number of factors.

Psychometric approaches have also been used extensively to study human psychological aptitudes, human abilities, and educational evaluation. Around the 1950s, researchers developed collections of tests with heterogeneous criteria intended to predict success in a particular job or social activity based on mental aptitudes. In fact, they discovered that success could be better predicted by using a battery composed of tests, each of which was homogeneous with respect to a particular psychological trait. In this area, psychometric tests are applied in settings where important decisions about people are made, such as selecting people for pilot training or assigning individuals to different types of treatments. Another example would be comparing groups of children who have undertaken different types of preschool; in this case, psychometric measures would concern various aspects of achievement in relation to language development. A concise battery of aptitude tests does a good job of predicting school grades and other school performances: some of the correlations with school grades have ranged up into the .80s, and such tests are frequently good indicators of future performance in college. However, the philosophy of education has been influenced by the Skinnerian movement, which promotes techniques of behavioral modification. This movement argues that an individual's initial competencies can be changed by specific training, so that it would be superfluous to apply achievement tests. The answer of researchers who support a psychometric approach is that tests can determine the initial level of competence in order to know where to start in the training program, employ aptitude tests to predict how rapidly the person achieves the competencies, and determine the level of competence that is reached.

With the advent of high-speed computers, researchers developed computerized psychometric methods that could be useful in helping to solve social problems [58]. This approach could be ideal for distinguishing the natural groupings of people, animals, or material objects on a set of relevant measurements.

More recently, psychometrics has also begun to approach nonhuman abilities, such as the learning capabilities of machines, with particular regard to the area of artificial intelligence, and some researchers have proposed an integrated approach under the name of universal psychometrics [59].

References

  1. Tabachnick BG, Fidell LS. Using Multivariate Statistics. 4th ed. Boston: Allyn and Bacon; 2001
  2. Sijtsma K. Introduction to the measurement of psychological attributes. Measurement: Journal of the International Measurement Confederation. 2011;44(7):1209-1219
  3. Galton F. Measurement of character. Fortnightly Review. 1884;36:179-185
  4. Darwin C, Kebler L. On the Origin of Species by Means of Natural Selection, or, the Preservation of Favoured Races in the Struggle for Life. London: J. Murray; 1859
  5. Kaplan RM, Saccuzzo DP. Psychological Testing: Principles, Applications, and Issues. 8th ed. Belmont, CA: Wadsworth, Cengage Learning; 2010
  6. Kelley TL. Interpretation of Educational Measurements. New York: Macmillan; 1927
  7. Spearman C. The proof and measurement of association between two things. The American Journal of Psychology. 1904;15:72-101
  8. Nunnally JC, Bernstein IH. Psychometric Theory. 3rd ed. New York: McGraw-Hill; 1994
  9. Nunnally JC. Psychometric Theory. New York: McGraw-Hill; 1967
  10. Gulliksen H. Theory of Mental Tests. New York: Wiley; 1950
  11. Nunnally JC. Psychometric theory - 25 years ago and now. Educational Researcher. 1975;4(10):7-21
  12. Cronbach LJ, Meehl PE. Construct validity in psychological tests. Psychological Bulletin. 1955;52:281-303
  13. Newton PE. Clarifying the consensus definition of validity. Measurement: Interdisciplinary Research & Perspectives. 2012;10(1-2):1-29
  14. Lord FM, Novick MR. Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley; 1968
  15. Mellenbergh GJ. A unidimensional latent trait model for continuous item responses. Multivariate Behavioral Research. 1994;29(3):223-236
  16. Cronbach LJ, Gleser GC, Nanda H, Rajaratnam N. The Dependability of Behavioral Measurements: Theory of Generalizability for Scores and Profiles. New York: Wiley; 1972
  17. Kuder GF, Richardson MW. The theory of the estimation of test reliability. Psychometrika. 1937;2:151-160
  18. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297-334
  19. Lissitz RW. The Concept of Validity: Revisions, New Directions, and Applications. Charlotte, NC: Information Age; 2009
  20. Michell J. Measurement in Psychology: A Critical History of a Methodological Concept. Cambridge: Cambridge University Press; 1999
  21. Borsboom D. Measuring the Mind: Conceptual Issues in Contemporary Psychometrics. Cambridge: Cambridge University Press; 2005
  22. Rasch G. Probabilistic Models for Some Intelligence and Attainment Tests. Copenhagen: Danmarks Paedagogiske Institut; 1960
  23. Lazarsfeld PF, Henry NW. Latent Structure Analysis. Boston: Houghton Mifflin; 1968
  24. Jöreskog KG. Statistical analysis of sets of congeneric tests. Psychometrika. 1971;36(2):109-133
  25. Mokken RJ. A Theory and Procedure of Scale Analysis. The Hague: Mouton; 1971
  26. Bock RD, Aitkin M. Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika. 1981;46(4):443-459
  27. Zimowski MF, Muraki E, Mislevy RJ, Bock RD. BILOG-MG: Multiple-Group IRT Analysis and Test Maintenance for Binary Items [Computer Software]. Chicago: Scientific Software International; 1996
  28. Arbuckle JL. IBM SPSS Amos 19 User's Guide. Chicago, IL: SPSS; 2010
  29. American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: AERA; 2014
  30. Lazarsfeld PF. Latent structure analysis. In: Koch S, editor. Psychology. Vol. III. New York: McGraw-Hill; 1959
  31. Birnbaum A. Some latent trait models and their use in inferring an examinee's ability. In: Lord FM, Novick MR, editors. Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley; 1968. pp. 397-479
  32. Michell J. Measurement scales and statistics: A clash of paradigms. Psychological Bulletin. 1986;100(3):398-407
  33. Embretson SE, Reise SP. Item Response Theory for Psychologists. Mahwah, NJ: Erlbaum; 2000
  34. Van der Linden WJ, Glas CAW, editors. Computerized Adaptive Testing: Theory and Practice. Norwell, MA: Kluwer; 2000
  35. Meredith W. Measurement invariance, factor analysis, and factorial invariance. Psychometrika. 1993;58(4):525-543
  36. Davison ML, Sireci SG. Multidimensional scaling. In: Handbook of Applied Multivariate Statistics and Mathematical Modeling. San Diego: Academic Press; 2000. pp. 323-352
  37. Torgerson WS. Multidimensional scaling: Theory and method. Psychometrika. 1952;17:401-409
  38. Shepard RN. Analysis of proximities: Multidimensional scaling with an unknown distance function. Psychometrika. 1962;27:125-140
  39. Carroll JD, Chang JJ. Analysis of individual differences in multidimensional scaling via an N-way generalization of Eckart-Young decomposition. Psychometrika. 1970;35:282-319
  40. Coombs C. A Theory of Data. New York: Wiley; 1964
  41. Bollen KA, Lennox R. Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin. 1991;110(2):305-314
  42. Cramer AOJ, Waldorp LJ, van der Maas H, Borsboom D. Comorbidity: A network perspective. The Behavioral and Brain Sciences. 2010;33(2-3):137-193
  43. Fox JP, Glas CAW. Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika. 2001;66(2):271-288
  44. Rabe-Hesketh S, Skrondal A, Pickles A. Generalized multilevel structural equation modelling. Psychometrika. 2004;69(2):167-190
  45. Lenk P, DeSarbo W. Bayesian inference for finite mixtures of generalized linear models with random effects. Psychometrika. 2000;65(1):93-119
  46. Reckase MD. Multidimensional Item Response Theory. London: Springer; 2009
  47. Molenaar D, Dolan CV, de Boeck P. The heteroscedastic graded response model with a skewed latent trait: Testing statistical and substantive hypotheses related to skewed item category functions. Psychometrika. 2012;77:455-478
  48. De la Torre J, Douglas JA. Higher order latent trait models for cognitive diagnosis. Psychometrika. 2004;69(3):333-353
  49. Lee SY, Zhu HT. Maximum likelihood estimation of nonlinear structural equation models. Psychometrika. 2002;67(2):189-210
  50. Ang RP. Development and validation of the Teacher-Student Relationship Inventory using exploratory and confirmatory factor analysis. The Journal of Experimental Education. 2005;74(1):55-74. DOI: 10.3200/JEXE.74.1.55-7
  51. Zwick WR, Velicer WF. Comparison of five rules for determining the number of components to retain. Psychological Bulletin. 1986;99(3):432-442
  52. Fisher RA. The statistical utilisation of multiple measurements. Annals of Eugenics. 1928;8:376-386
  53. Tiedeman DV, Bryan JG, Rulon PJ. Application of the multiple discriminant function to data from the airman classification battery. In: Research Bulletin, 52-37. San Antonio, Texas: Air Training Command, Lackland AFB; 1952
  54. Sijtsma K. On the use, the misuse, and the very limited usefulness of Cronbach's alpha. Psychometrika. 2009;74(1):107-120
  55. Thurstone LL. Multiple-Factor Analysis. Chicago: University of Chicago Press; 1947
  56. Stern T, Fava M, Wilens T, Rosenbaum J. Massachusetts General Hospital Comprehensive Clinical Psychiatry. 2nd ed. London: Elsevier Health Sciences; 2015. ISBN: 9780323328999
  57. Aleksandrowicz JW, Klasa K, Sobański JA, Stolarska D. KON-2006 Neurotic Personality Questionnaire. Archives of Psychiatry and Psychotherapy. 2009;1:21-22
  58. Edwards AL. The relationship between the judged desirability of a trait and the probability that the trait will be endorsed. Journal of Applied Psychology. 1953;37:90-93
  59. Locurto C, Scanlon C. Individual differences and spatial learning factor in two strains of mice. The Behavioral and Brain Sciences. 1987;112:344-352
