
The Seductive Beauty of Latent Variable Models

William Revelle critiques the reliance on latent variable models in psychology, arguing that their mathematical appeal has led to a neglect of alternative models and a focus on internal consistency over validity. He suggests that this trend has misdirected psychological measurement and theory, emphasizing the need for a more critical examination of these models. The article calls for a reevaluation of the assumptions underpinning latent variable approaches in the field of psychology.


Personality and Individual Differences 221 (2024) 112552

Contents lists available at ScienceDirect

Personality and Individual Differences

journal homepage: www.elsevier.com/locate/paid

The seductive beauty of latent variable models: Or why I don't believe in the Easter Bunny☆
William Revelle
Department of Psychology, Northwestern University, Evanston, IL 60208, United States of America

ARTICLE INFO

Keywords: Latent variables; Reliability; Validity; Massively Missing Completely at Random (MMCAR); Scale construction; Factor analysis; Item analysis; Open source

ABSTRACT

Seduced by their mathematical beauty, psychologists have been using latent variable models for more than a century. Whether discussing a general factor of cognitive ability, personality, or psychopathology there has been an unfortunate tendency to reify hierarchical structures without examining the utility of alternative models. To some of us, the use of latent variables was an unfortunate mistake. By emphasizing internal consistency rather than validity, parsimony of fit rather than function, the use of latent variables has led psychological measurement and theory down a beautifully seductive garden path rather than focusing on the real problem of actually being useful. I will address some of these alternatives and suggest that it is time to think more critically of the use of latent variable models in our theorizing and applications.

To receive an award for a lifetime contribution to the study of individual differences is a great honor and an opportunity to review the history and prognosticate on the future of our field. To do so, I am not going to talk about my work so much as challenge a basic assumption that we as a field have been making for the past 80 years, and that is the belief in the power of construct validity and of latent variables. To challenge latent variable models at an ISSID meeting or in its journal is a daunting (foolish?) task and seems to fly in the face of the amazing contributions of the three prior winners of this award. All three of them, Hans Eysenck, Arthur Jensen, and Ian Deary, were leaders in promoting the power of latent variable models and the theoretical richness that this involved.

Hans Eysenck, as a student of Cyril Burt, searched for the latent variables of personality. One of his earliest studies was of the factor structure of behavioral measures among hospitalized soldiers (Eysenck, 1944); subsequent publications continued in this tradition as he married the power of factor analytic techniques to the study of structure and dynamics of personality (Eysenck, 1952, 1967; Eysenck & Eysenck, 1985; Eysenck & Himmelweit, 1947). Besides founding the International Society for the Study of Individual Differences he also founded its flagship journal, Personality and Individual Differences. Indeed it was reading his popular publications emphasizing factor analysis and other quantitative techniques (Eysenck, 1953, 1964, 1965) that led me to study personality as a way to combine my interests in mathematics and in psychology.

The second winner of this award was Arthur Jensen whose emphasis was upon the ‘g’ factor of cognitive ability as a higher level latent variable that could organize and explain the structure of cognitive ability (Jensen, 1998). Jensen emphasized the g factor of cognitive ability in terms of the effect of early childhood interventions (Jensen, 1969). From a psychometric point of view, his discussion of what makes a good g remains an essential example of a higher order factor structure (Jensen & Weng, 1994).

Ian Deary (2001) remains a leader in intelligence research, with his collaborators on the MidLothian study of cognition over the life span (Deary, 2009; Johnson et al., 2010). He is both a critic and a supporter of factorial models of cognition. He brought back (Bartholomew et al., 2009) the concept of sampling theory (Thomson, 1935) as a plausible alternative to the hierarchical factor structure so beloved by Spearman.

1. Latent variables

All three of these researchers worked in the grand tradition of psychometrics and made use of factor analytic techniques. These techniques go back to 1904 with the amazing insights of Charles Spearman. In his two influential papers written while a graduate student of Wundt in Leipzig, Spearman translated the correlation coefficient from the insights of Galton (1888) and the mathematics of Bravais (1844) and

☆ Based upon the Distinguished Contribution Lecture to the International Society for the Study of Individual Differences, July 2023.
E-mail address: [email protected].

https://doi.org/10.1016/j.paid.2024.112552

Available online 29 January 2024

0191-8869/© 2024 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

subsequently Pearson (1896) to be understandable to psychologists (Spearman, 1904b). In a second article in the same journal, he further developed the basic concepts of reliability, and laid the foundations for factor analysis (Cudeck & MacCallum, 2007; Spearman, 1904a).

Spearman emphasized the distinction between observed (manifest) and true (latent) correlations and showed how “correcting” for the attenuation due to unreliability (Spearman, 1904a) converted observed correlations ($r_{p'q'}$) into estimates of the “true” correlation ($r_{pq}$) between various measures of cognitive ability.

$$r_{pq} = \frac{r_{p'q'}}{\sqrt{r_{p'_1 p'_2}\, r_{q'_1 q'_2}}} \qquad (1)$$

This insight of correcting for attenuation and searching for a common factor was used by Webb (1915) in his amazing analysis of ability and character (finding factors of a 45 × 45 correlation matrix by hand was a monumental effort).

Although not referring to it, Spearman's use of manifest and latent correlations is reminiscent of Plato's Allegory of the Cave (Plato, n.d.). Manifest variables are equivalent to shadows cast on the wall of the cave by people moving in front of a fire. This metaphor is useful when we consider the effect attributed to Flynn (1984, 1987) by Herrnstein and Murray (2010) of manifest intelligence scores increasing by 0.3 sd per decade, which could be seen as analogous to a change in shadow length as people move closer to the fire. That is, manifest variables can change over time with no real change in latent scores.

Spearman's main use of latent variables was to show that the correlations between a number of cognitive abilities showed a remarkable consistency which suggested a latent common factor. This was the introduction of factor analysis as well as test theory. The basic idea was that each observed score reflects a common factor and a specific factor as well as some error. In modern notation this is

$$X = \lambda'_i \theta_i + s_i + \varepsilon \qquad (2)$$

where $X$ is an observed score, $\lambda_i$ is the correlation of the general factor with a specific item, $\theta_i$ is the latent value of an item, $s_i$ is the item specific factor, and $\varepsilon$ is a random disturbance. Subsequent work by Thurstone (1934, 1935) introduced matrix algebra to Spearman's tables, and generalized the single factor to multiple factors. Further extensions of Thurstone led to general factors (g), group factors (G), specific factors (S) and random error

$$X = \lambda'_g g + \lambda'_G G + \lambda'_S S + \varepsilon. \qquad (3)$$

If tests are measured on just one occasion, the specific factors and error are confounded, and as the number of group factors increases the relative importance of the general factor will increase. Thus evaluation of the saturation of the general factor was used as a measure of the test's adequacy and estimates were known as measures of internal consistency. With the assumption of just one general factor and no group factors, tests could be evaluated by the amount of general factor saturation as a percentage of total variance

$$\rho_{xx} = \frac{\mathbf{1}'\lambda_i \lambda'_i \mathbf{1}}{\mathbf{1}'C\mathbf{1}}. \qquad (4)$$

where $C$ is the covariance of the items and $\mathbf{1}$ is a vector of ones. With the further assumption that all $\lambda_i$ are equal (so called τ equivalence) this estimate is known as $\lambda_3$ (Guttman, 1945) or $\alpha$ (Cronbach, 1951). When calculations were done with desk calculators, and finding correlations was tedious and finding factors was even more tedious, the charm of these estimates was that they could be found from the variance of the total test ($\sigma_X^2 = \mathbf{1}'C\mathbf{1}$) and the variances of the k items ($\sum_{i=1}^{k} \sigma_i^2$) and did not require finding k * (k-1)/2 covariances. For with k items, and the assumption that $\lambda_i$ are identical for all items, Eq. (4) becomes

$$\lambda_3 = \alpha = \frac{k}{k-1}\,\frac{\sigma_X^2 - \sum \sigma_i^2}{\sigma_X^2} = \frac{k\bar{c}_i}{1 + (k-1)\bar{c}_i}. \qquad (5)$$

If the interitem covariances are found then $\lambda_3 = \alpha$ is a function of the average interitem covariance ($\bar{c}_i$) and the number of items (k).

Why are these various equations relevant? Eq. (2) suggests that items are made up of a latent true score and error and because errors are thought to be uncorrelated, aggregating items increases the internal consistency of the test (Eq. (5)).

With the assumption that items were very noisy, Eq. (2) led to the tendency to emphasize aggregating items and using a test's internal consistency as an index of factorial validity. Items were thought to be composed of one true factor and error. This belief was supported by the relatively low correlations of items with each other, suggesting that the common variance was low and the error was large. But this ignored the surprisingly high test-retest correlations of items even over several weeks. For instance, when examining the 9 items of the Impulsivity subscale from the EPI (Eysenck & Eysenck, 1964) in the epiR data set in the psychTools package for R, the average inter-item correlation is just 0.11 but the average test-retest correlation over several weeks is 0.52. (These items are dichotomous. If we find the average tetrachoric values they are 0.19 inter-item and 0.74 for test-retest.) This pattern of test-retest correlations that are higher than inter-item correlations is also true even for a presumably better set of items (the items measuring Neuroticism) with average inter-item correlations of 0.15, but test-retest correlations also averaging 0.52 (0.27 and 0.74 for tetrachorics). Similar findings have been reported for 100 items of the HEXACO with item test-retest correlations over 13 days having a mean value of 0.65 (Henry et al., 2022). In an unusual design Condon (2022) reports that the stability of items over 15 min with 143 intervening items is between 0.6 and 0.7 for most items. All of these findings suggest that the unique variance of an item is much more stable than previously thought and that aggregating items leads to more than just a pure factor measure, for it also includes some of the unique but stable item variance.

1.1. Common factor analysis

At the data level, the basic equation for the factor model is that

$$X = \lambda_i \theta_i + \varepsilon \qquad (6)$$

where $X$ is an observed score, $\lambda_i$ is the correlation of the general factor with a specific item, $\theta_i$ is the latent value of an item, and $\varepsilon$ is a random disturbance, which can be generalized to general factors (g), group factors (G), specific factors (S) and random error.

Eq. (6) may also be expressed in terms of the factors of a covariance matrix:

$$C \approx \lambda\lambda' + \Theta^2. \qquad (7)$$

Generalizing Eq. (6) to include general, group and specific variance, the observed score on a test item may be modeled in terms of the sum of the products of factor scores (g, f, s, e) and loadings (c, A, D) on these factors:

$$x = cg + Af + Ds + e \qquad (8)$$

Ignoring the contribution of specific variance (Ds), the reliable variance of the test is that which is not error; the reliability of a test with standardized items should be

$$\omega_t = \frac{\mathbf{1}'cc'\mathbf{1} + \mathbf{1}'AA'\mathbf{1}}{V_x} = 1 - \frac{\sum (1 - h_i^2)}{V_x} = 1 - \frac{\sum u_i^2}{V_x} \qquad (9)$$

where $h_i^2$ is the item communality and $u_i^2$ is the item uniqueness. The percentage of the total variance that is due to the general factor ($\omega_g$, McDonald, 1999) is


$$\omega_g = \frac{\mathbf{1}'cc'\mathbf{1}}{V_x} = \frac{\mathbf{1}'cc'\mathbf{1}}{\mathbf{1}'cc'\mathbf{1} + \mathbf{1}'AA'\mathbf{1} + \mathbf{1}'DD'\mathbf{1} + \mathbf{1}'ee'\mathbf{1}} = \frac{(\sum c_i)^2}{V_x}, \qquad (10)$$

where the total test variance ($V_x$) is the sum of the elements of all the item variances and covariances and $(\sum c_i)^2$ is the squared sum of the loadings on the general factor.

Writing such a set of equations reinforces the unfortunate separation between psychometrics and psychology. For, as a leading psychometrician suggests

Historically, psychological issues have been the driving force behind the development of psychometric methods, beginning most convincingly with the work of Spearman on intelligence, factor analysis, and test-score reliability, and continued by Thurstone, Cronbach, Guilford, and many others. As psychometrics developed into a more mature area, psychometricians began looking for new topics, and these were found in statistics and computer science perhaps more than in psychology. This not only weakened the connection between psychological impetus and psychometric method but also created a psychometrics that was mathematically more demanding for psychologists. The result of this loosened tie in combination with more demands caused many new psychometric tools to go unnoticed in psychology.
Sijtsma (2009b, p. 172)

To which I will add that psychometrics drifted away from the primary mission of helping psychologists develop useful measures and instead became seduced by the beauty of latent variables.

1.2. Scepticism about factors

Although a major contributor to studies of the factorial structure of ability and temperament (Guilford, 1954, 1956), late in his career J. P. Guilford (1975) suggested that factor analytic results should not be taken without caution.

In spite of all the negative appearances that factor analysis may give to the critical investigator, I am prepared to reiterate that the method can be a powerful tool to aid in deriving useful psychological constructs. But it cannot do so without theoretical psychological thinking to go with it. There has been entirely too much blind faith, on the part of many who factor analyze, in what factor analysis can do. I sometimes think that its chief value is to enable us to turn data around so we can look at them, from which new insights may arise. But more than that, it can be used to test those insights in a kind of hypothetico-deductive manner. Admittedly, this may not be in a way some investigators would demand. Fortunately, other ways of testing the validity of factorial constructs are available by more ordinary experimental methods.
Guilford (1975, p. 802)

As much as we would want our theories to represent factorially defined constructs and to claim a correspondence between factors and psychological systems (Royce, 1983), it is important to remember that factors are convenient fictions that are merely one way to organize the structure of covariance matrices (Revelle, 1983; Revelle & Ellman, 2016).

The trend of this discussion suggests a hiatus between the orientations of psychologists who factor analyze. The focus seems to be either in the direction of data or of psychological Constructs, for the empirical versus the theoretical analyst. The empiricist is likely to take the data structure to be the psychological structure. The theorist looks to the data to suggest the psychological structure, recognizing that the two may lack complete isomorphism. The theorist also requires replications with invariance of psychological factors, under somewhat varied conditions, with variations in samples of tests as well as in tested populations. He may also be concerned about relations among factors and possibly about superstructures. “Push-button” factor analysis has not yet achieved a fool-proof program for grinding out invariant, generalized constructs under varied conditions.
Guilford (1975, p. 803)

Indeed, to some, to believe in latent variables is to believe in the Easter Bunny (R. Hogan, personal communication).

2. Construct validity

In partial response to the plethora of scales developed to predict various criteria using e.g., the MMPI (Hathaway & McKinley, 1943) or the Strong Vocational Interest Test (Strong Jr., 1927) and to try to marry psychological theory with scale construction, the 1950's saw three monumental efforts considering the measurement of psychological constructs. Of these, perhaps the best known is that of Lee Cronbach and Paul Meehl (Cronbach & Meehl, 1955) who tried to define a new type of validity: construct validity. This was in striking contrast to the prevailing view at the time, in which validity was typically taken to be how well the test predicted some criterion.

Constructs, as embedded in nomological networks, were seen as theoretical concepts and could only be evaluated in terms of the pattern of correlations. Criterion-oriented validation procedures, on the other hand, harkened back to the operational definitions of behaviorism. Concurrent validity is the correlation with a current criterion. Predictive validity is the correlation with a future criterion. Content validity was established by showing that the test items were a sample of a universe in which the investigator is interested. Construct validation was seen as a never ending process:

A construct is defined implicitly by a network of associations or propositions in which it occurs. Constructs employed at different stages of research vary in definiteness.... Many types of evidence are relevant to construct validity, including content validity, interitem correlations, intertest correlations, test-“criterion” correlations, studies of stability over time, and stability under experimental intervention. High correlations and high stability may constitute either favorable or unfavorable evidence for the proposed interpretation, depending on the theory surrounding the construct.
Cronbach and Meehl (1955, p. 200)

An even stronger argument against predictive validity and in favor of constructs was made by Jane Loevinger (1957), who suggested that to study prediction was not science.

Favorably quoting the economist and statistician Jacob Marschak in his discussion of decision making, Loevinger said (p. 641):

“A theory provides us with solutions which are potentially useful for a large class of decisions. […] Hence, the more we know about its properties the better. If we merely want to know how long it takes to boil an egg, the best is to boil one or two without going into the chemistry of protein molecules. The need for chemistry is due to our want to do other and new things” (Marschak, 1954, p. 214). She went on to say “The argument against classical criterion-oriented psychometrics is thus two-fold: it contributes no more to the science of psychology than rules for boiling an egg contribute to the science of chemistry. And the number of genuine egg-boiling decisions which clinicians and psychotechnologists face is small compared with the number of situations where a deeper knowledge of psychological theory would be helpful.”


Table 1
Self report and peer report from the SAPA-project. Correlations reported by Zola et al. (2021). Reliabilities on the main diagonal. Raw correlations below the diagonal. Correlations corrected for reliability above the diagonal. Upper left quadrant reflects SAPA Personality Inventory scores (Condon, 2018) for 158,631 participants, mean n/item = 18,180. Other quadrants reflect 908 peer rated participants. Values > 0.4 are highlighted in bold. Data from the zola dataset in the psychTools package.

Variable                Self report                           Peer ratings
                          Agrbl  Cnscn  Nrtcs  Extrv  Opnnn     Agrbl  Cnscn  Stblt  Extrv  IntlO
Self report
  Agreeableness            0.87   0.32  −0.14   0.28   0.09      0.75   0.21   0.18   0.34   0.22
  Conscientiousness        0.28   0.87  −0.20   0.13   0.06      0.16   0.78   0.22   0.42   0.13
  Neuroticism             −0.12  −0.18   0.90  −0.28  −0.10     −0.01  −0.16  −0.78  −0.40  −0.25
  Extraversion             0.25   0.12  −0.25   0.90   0.14      0.01  −0.01   0.07   0.71   0.14
  Openness                 0.08   0.05  −0.09   0.13   0.86     −0.14  −0.06   0.10   0.17   0.49
Peer ratings
  Agreeableness            0.47   0.10  −0.01   0.00  −0.09      0.45   0.36   0.47   0.15   0.44
  Conscientiousness        0.15   0.55  −0.12  −0.01  −0.04      0.18   0.58   0.42   0.41   0.47
  Stability                0.13   0.16  −0.58   0.05   0.07      0.25   0.25   0.60   0.38   0.52
  Extraversion             0.23   0.28  −0.27   0.49   0.11      0.07   0.23   0.22   0.52   0.32
  IntellectOpenness        0.14   0.08  −0.15   0.09   0.30      0.19   0.24   0.27   0.15   0.44
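
As a rough check on how the corrected values in Table 1 relate to the raw ones, the sketch below applies the correction for attenuation of Eq. (1) to the five mono-trait-hetero-method correlations, using the α reliabilities on the diagonal as the reliability estimates. The function and variable names are illustrative only, and because the inputs are the rounded values printed in the table, the results can differ from the printed corrected values in the second decimal.

```python
from math import sqrt


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Spearman's correction: observed r divided by the root of the product of reliabilities."""
    return r_observed / sqrt(reliability_x * reliability_y)


# Rounded values read from Table 1:
# (trait, raw self-peer r, self-report alpha, peer alpha, corrected r printed in the table)
TABLE1_CONVERGENT = [
    ("Agreeableness", 0.47, 0.87, 0.45, 0.75),
    ("Conscientiousness", 0.55, 0.87, 0.58, 0.78),
    ("Neuroticism/Stability", -0.58, 0.90, 0.60, -0.78),
    ("Extraversion", 0.49, 0.90, 0.52, 0.71),
    ("Openness/Intellect", 0.30, 0.86, 0.44, 0.49),
]

for trait, raw, rel_self, rel_peer, printed in TABLE1_CONVERGENT:
    corrected = correct_for_attenuation(raw, rel_self, rel_peer)
    print(f"{trait:22s} raw={raw:+.2f} corrected={corrected:+.3f} table={printed:+.2f}")
```

The lower the reliability of either measure, the larger the correction; the peer ratings, with reliabilities between 0.44 and 0.60, account for most of the difference between the raw and corrected values.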

To which I will suggest that boiling an egg is sometimes more practically important than spending years studying chemistry.

2.1. The multi-trait-multi-method matrix

The third paper in this series emphasizing constructs was by Donald Campbell and Donald Fiske (Campbell & Fiske, 1959) who elaborated on the nomological network and introduced the concept of the Multi-Trait-Multi-Method Matrix (MTMM). They emphasized that what matters is the pattern of correlations: with measures of the same construct measured in the same way (reliability) and in different ways (convergent validity), as contrasted with measures of different constructs (divergent validity). They were specifically not interested in testing the utility of their measures so much as the convergence of multiple measures of the same construct as indications of validity.

An early example of a MTMM correlation matrix was the set of correlations between self ratings, self report test scores, and peer ratings on 5 dimensions taken from the Guilford (1940) inventory of factors reported by Carroll (1952). As would be hoped, higher convergence was found for traits across methods than for different traits within method. A similar approach to assess the validity of scales was proposed by McCrae et al. (2011) who reported the long term stability of NEO facets, as well as the agreement of self rated facet scores with peer and spouse ratings on those same facets. Although they do not report the discriminative validity, presumably they thought of these correlations as the diagonal values of a MTMM and thus as convergent mono-trait-hetero-method validities.

A more recent example of a Multi-Trait-Multi-Method Matrix considers the results of a validation study of traits measured by self report as well as by peer ratings (Zola et al., 2021). From an online sample using Massively Missing Completely at Random sampling of items (roughly 100–200 items per subject from a pool of almost 700 items) data were collected from 158,631 anonymous volunteer participants on items from the SAPA Personality Inventory (spi-135) (Condon, 2018). Correlations were found using the Noah's Ark procedure (pairwise complete). In addition, all participants were asked if they would nominate peers to supply ratings on their personality. Peer ratings were thus collected from 1554 individual participants who rated 921 of the original participants on a short form of 30 items measuring 8 constructs. Table 1 shows the correlations between five trait measures (α reliabilities on the diagonal). The upper left quadrant of the table shows the correlations of the self report scales, the lower right quadrant the peer ratings. Except for the diagonal elements, these are all multi-trait-mono-method correlations. The lower left quadrant shows the raw multi-trait-hetero-method correlations. The values above the diagonal reflect correlations corrected for attenuation. The two minor diagonals reflect the mono-trait-hetero-method validities.

2.2. Test theory

With the emphasis upon constructs, much of the work in test theory became how to design tests to maximize internal consistency measures of reliability. In contrast to the earlier work by Gulliksen (1950) and Nunnally (1978), which emphasized validity, much of the past 60 years has emphasized reliability and internal structure and equated validity with factorial validity. For a discussion of the move towards construct validity and away from simple prediction, see Slaney (2017).

Developments in test theory emphasized unidimensional constructs to be measured with “the New Psychometrics” of Item Response Theory (Embretson, 1996; Embretson & Hershberger, 1999; Reise, 1999) and considered validity in terms of Structural Equation Models (Bollen, 1989; Jöreskog, 1978; Wiley, 1973). IRT is based upon the concept of a latent variable causing the manifest responses to items; SEM is regression with latent variables (observed variables corrected for measurement error). These new approaches have enshrined latent variables without considering the consequences.

Although originally requiring knowing how to code and having familiarity with matrix algebra, IRT and SEM procedures have become easier to use without necessarily understanding when and why to use or not use various methods. “One side of the problem is that psychologists have a tendency to endow obsolete techniques with obscure interpretations. The other side is that psychometricians insufficiently communicate their advances to psychologists, and when they do they meet with limited success” (Borsboom, 2006, p. 428). The critiques are written in matrix notation in journals such as Multivariate Behavioral Research and Psychometrika and seem to most non-experts like debating the number of angels who can dance on the head of a pin.

Our users are taught to push buttons on menu driven programs and to report the statistics that are seen as necessary. They are not taught to think about what these various measures mean in their endless search for construct validity. For “construct validity functions as a black hole from which nothing can escape: Once a question gets labeled as a problem of construct validity, its difficulty is considered superhuman and its solution beyond a mortal's ken.” (Borsboom, 2006, p. 431).

3. Prediction versus theory

Although classic texts on measurement (e.g., Gulliksen, 1950; Nunnally, 1978) devote entire chapters to issues of validity, more recently there has been less emphasis upon the practical problem of prediction and more on the beauty of equations specifying latent variables. As Hogan (2009) put it “Mainstream psychometrics concerns measuring entities (i.e., determining ‘true scores’). But applied assessment has a job to do, and that is to predict outcomes.”

Although criticizing construct validity, Borsboom and Mellenbergh (2004) add an even stronger criticism of criterion validity:


Fig. 1. α and validity as a function of the number of items and the average correlation showing the tradeoff between internal consistency and predictive validity;
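
The pattern summarized in Fig. 1 can be approximated numerically from Eqs. (5) and (11). The sketch below assumes standardized items, so that the average inter-item covariance in Eq. (5) is the average inter-item correlation; the function names and the particular values of k, average item validity and average inter-item correlation are hypothetical and are not taken from the figure.

```python
from math import sqrt


def alpha_standardized(k, r_bar):
    """Eq. (5) for standardized items: alpha from k items with average inter-item r."""
    return k * r_bar / (1 + (k - 1) * r_bar)


def composite_validity(k, r_y, r_bar):
    """Eq. (11): validity of a k-item sum given average item validity and average inter-item r."""
    return k * r_y / sqrt(k + k * (k - 1) * r_bar)


# Hypothetical scenario: every item correlates 0.2 with the criterion,
# while the average inter-item correlation varies.
for k in (5, 10, 20):
    for r_bar in (0.1, 0.2, 0.4):
        a = alpha_standardized(k, r_bar)
        v = composite_validity(k, r_y=0.2, r_bar=r_bar)
        print(f"k={k:2d} r_bar={r_bar:.1f} alpha={a:.2f} validity={v:.2f}")
```

For a fixed average inter-item correlation both quantities rise with k, but for a fixed k a larger average inter-item correlation raises α while lowering the validity of the composite.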

“the idea of construct validity was introduced to get rid of the atheoretical, empiricist idea of criterion validity, which is a respectable undertaking because criterion validity was truly one of the most serious mistakes ever made in the theory of psychological measurement. The idea that validity consists in the correlation between a test and a criterion has obstructed a great deal of understanding and continues to do so.” (p. 1065)

They go on to say

“Therefore, not just criterion validity but any correlational conception of validity is hopeless. The double-headed arrows of correlation should be replaced by the single-headed arrows of causation, and these arrows must run from the attribute to the measurements”.

“Validity is a property of tests: A valid test can convey the effect of variation in the attribute one intends to measure. This means that the relation between test scores and attributes is not correlational but causal.” (p. 1067)

3.1. In defense of predictive validity

In striking contrast to these critiques of predictive validity is the success of several groups of researchers concerned with vocational interests (Dawis, 1992; Donnay, 1997; Holland, 1959; Strong Jr., 1927), psychopathology (Hathaway & McKinley, 1943), or the analysis of “folk concepts” of social interaction (Gough, 1965). Strong Jr. (1927) championed the predictive power of scales formed from items that distinguished members of a particular occupation from “People In General”. This completely empirical procedure was adapted by the developers of the MMPI (Hathaway & McKinley, 1943) and the CPI (Gough, 1965). Harrison Gough was interested in predicting criteria of socialization ranging from those seen as “best citizens” to incarcerated felons (Gough, 1965). Whether using the California Psychological Inventory (Gough, 1957) or an Adjective Check List (Gough, 1960) the goal was not a clean factor structure so much as scales that worked.

Perhaps more well known to readers of this journal or members of ISSID is the success of the Hogan Personality Inventory (Hogan & Hogan, 1995). These tests are validated by their success in predicting real world outcomes.

4. Aggregation should be purposeful

We have known since Spearman that test reliability goes up with test length (Fig. 1 left hand panel), as does validity (Fig. 1 right hand panel). This leads us to form progressively longer scales in a hope that irrelevant variance will diminish as a source of test variance.

The classic example of the effects of aggregation is seen with the most used statistic in psychology “coefficient α” (Cronbach, 1951) (Eq. (5)). This measure is also known as KR-20 (Kuder & Richardson, 1937) or $\lambda_3$ (Guttman, 1945). Part of the appeal of $\alpha/\lambda_3$ is that it can be found from the item variances and total test variance and is available in commercial software (Sijtsma, 2009a). Although this was convenient in the period of the desk calculator, this is no longer important and so-called model based estimates can be found from the covariances (Eqs. (9), (10)). For fixed average correlation, $\alpha/\lambda_3$ increases with the number of items.

Aggregation can also increase validity by combining k items with average validity $\bar{r}_y$

$$r_{yk} = \frac{k\bar{r}_y}{\sigma_x} = \frac{k\bar{r}_y}{\sqrt{k + k(k-1)\bar{r}}}, \qquad (11)$$

where $\bar{r}$ is the average inter-item correlation.

But there is an interesting contrast between Eqs. (5) and (11): “What one selects when optimizing predictive utility are items that are mutually uncorrelated but highly correlated with the criterion. This is not what one expects or desires in measurement. Note that this does not preclude that tests constructed in this manner may be highly useful for prediction. It does imply that optimizing measurement properties and optimizing predictive properties are not convergent lines of test construction.” (Borsboom & Mellenbergh, 2004, p. 1067). That is, there is a tradeoff between internal consistency and validity. This tradeoff may be seen when comparing the left hand panel of Fig. 1 with the right hand panel. For while both internal consistency and validity increase with the number of items, the highest validity is found for those items that lead to the lowest internal consistency.

The power of aggregation is that composite scales can include important variance and reduce the contribution of extraneous error. However, aggregation to maximize internal consistency (Eq. (5)) will tend to minimize variance that is not random and not common with other items. My colleagues and I refer to such aggregation as spear-fishing: developing sharp, pointed instruments with high internal consistency (Garner, in press; Revelle & Garner, 2023). The alternative

W. Revelle Personality and Individual Differences 221 (2024) 112552

Fig. 2. 10 items from Athenstaedt (2003) show a clear two factor structure, with five items reflecting feminine activities and five reflecting masculine activities. Although the first and second sets of five items are clearly independent, both sets correlate with gender.

Table 2
Correlations of item composites corrected for item overlap. α reliabilities on the diagonal (in italics). The F and M scales show high correlations within and low correlations between the two sets of scales, e.g., the five item F scale correlates 0.06 with the five item M scale. The data are from Athenstaedt (2003) and are available in the Athenstaedt dataset in the psychTools package. The bottom two lines report the correlations with gender and the ωh measure of general factor saturation. See Fig. 3 for the tradeoff between validity and internal consistency.
Variable F2 F3 F4 F5 M2 M3 M4 M5 MF2 MF4 MF6 MF8 MF10 gender

F2 0.72
F3 0.75 0.79
F4 0.77 0.80 0.82
F5 0.77 0.81 0.84 0.85
M2 0.12 0.15 0.16 0.14 0.79
M3 0.09 0.12 0.13 0.10 0.75 0.76
M4 0.09 0.12 0.13 0.10 0.77 0.78 0.81
M5 0.06 0.09 0.10 0.06 0.79 0.80 0.81 0.82
MF2 0.36 0.46 0.48 0.48 0.38 0.41 0.45 0.46 0.11
MF4 0.48 0.55 0.58 0.57 0.52 0.51 0.53 0.53 0.46 0.59
MF6 0.52 0.56 0.58 0.58 0.55 0.54 0.56 0.56 0.56 0.66 0.69
MF8 0.54 0.58 0.60 0.59 0.58 0.57 0.58 0.57 0.61 0.71 0.73 0.75
MF10 0.54 0.59 0.61 0.60 0.59 0.57 0.58 0.57 0.63 0.73 0.75 0.77 0.77
gender 0.52 0.57 0.58 0.56 0.54 0.55 0.54 0.52 0.67 0.71 0.75 0.74 0.74 1.00
ωh 0.72 0.79 0.69 0.71 0.79 0.77 0.7 0.69 0.11 0.13 0.23 0.24 0.15
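
The pattern in Table 2 can be illustrated with a small population-level sketch. This is not the analysis reported here (which was done with the psych package in R). The correlations below (0.5 within a cluster, 0 between clusters, 0.4 between each item and the criterion) are assumed purely for illustration, and the function names are hypothetical. For standardized items, both α and the validity of a unit-weighted composite follow directly from the correlation matrix.

```python
from math import sqrt

def composite_stats(item_r, criterion_r):
    """Alpha and validity of a unit-weighted composite of standardized items.

    item_r      -- square list of lists of item intercorrelations (1.0 on the diagonal)
    criterion_r -- list of item correlations with the criterion
    """
    k = len(item_r)
    total_var = sum(sum(row) for row in item_r)    # variance of the sum score
    alpha = (k / (k - 1)) * (1 - k / total_var)     # standardized coefficient alpha
    validity = sum(criterion_r) / sqrt(total_var)   # correlation of sum score with criterion
    return alpha, validity

def block_matrix(sizes, within, between):
    """Correlation matrix for item clusters: `within` inside a cluster, `between` across."""
    labels = [c for c, size in enumerate(sizes) for _ in range(size)]
    return [[1.0 if i == j else (within if a == b else between)
             for j, b in enumerate(labels)]
            for i, a in enumerate(labels)]

# Assumed illustrative values, not estimates from the Athenstaedt data.
WITHIN, BETWEEN, ITEM_VALIDITY = 0.5, 0.0, 0.4

pure_alpha, pure_validity = composite_stats(
    block_matrix([5], WITHIN, BETWEEN), [ITEM_VALIDITY] * 5)
mixed_alpha, mixed_validity = composite_stats(
    block_matrix([5, 5], WITHIN, BETWEEN), [ITEM_VALIDITY] * 10)

print(f"5 homogeneous items: alpha = {pure_alpha:.2f}, validity = {pure_validity:.2f}")
print(f"5 + 5 mixed items:   alpha = {mixed_alpha:.2f}, validity = {mixed_validity:.2f}")
```

Under these assumptions the homogeneous five item scale has α of about 0.83 and a validity of about 0.52, while the ten item composite of two uncorrelated clusters has a lower α (about 0.74) but a higher validity (about 0.73). This is the same direction as the F5, M5 and MF10 entries of Table 2.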

approach is to use a net – diffuse scales that include multiple items with criterion validity, even if not highly associated with each other. As we suggest, you can catch more fish with a net than a spear.
Consider the correlations of 10 items from Athenstaedt (2003) that are discussed by Eagly and Revelle (2022) (Fig. 2). These items are included in the Athenstaedt data set in the psychTools package (Revelle, 2023b) for the R statistical system (R Core Team, 2023). The analyses and graphics were done using the psych package (Revelle, 2023a) in R. Using the inter-ocular trauma test for the number of factors, these 10 items clearly represent 2 independent factors. Although the sets of items are basically orthogonal, they all correlate with gender. We can find composite scales of these items by combining the first 2, 3, 4 or 5 from each factor (F2 … F5, M2 … M5) or composite scales of 1, 2, 3, 4, 5 from each set (MF2, MF4, MF6, MF8, MF10) (Table 2). Just M or just F scales are very internally consistent (ωh = 0.72…0.85) and reasonably valid (rgender = 0.52…0.58). But the composite (MF) scales are much less internally consistent (ωh = 0.11…0.23, α = 0.11…0.77) and more valid (rgender = 0.67…0.75).
It is interesting to compare the two indicators of internal consistency. The conventional measure for the 10 item MF scale, α, is by conventional criteria (Nunnally, 1978) “acceptable” with a value of 0.77. That is to say, we would expect such a 10 item scale to correlate 0.77 with a parallel measure. But from the point of view of whether these scales measure one thing, they clearly do not. The ωh value of 0.15 suggests that just 15 % of the variance is due to one latent factor.
That is, from a traditional measurement point of view, the MF scales are clearly inadequate for they do not represent one construct. Just 11 to 15 % of their variance is common to the scale. But their predictive validity is far superior to that of the “better” scales that are purer measures of a single construct. As Eagly and Revelle (2022) said “the patterning of psychological gender/sex differences can be difficult to discern in narrowly defined attributes but emerges more strongly in general trends. It follows that neither similarity nor difference prevails but instead a more complex intertwining of these two types of findings”. This tradeoff between validity and internal consistency is seen in Fig. 3 which plots the validity correlations against the ωh measures of general factor saturation.
We have previously reported similar findings (Eagly & Revelle, 2022) using a data set from Gruber et al. (2020) which also show the power of aggregation and the benefit of aggregating independent


dimensions. Whether considering scales of personality, cognitive or behavioral activity, combining uncorrelated measures with high internal consistency produced scales that were much more valid but were clearly not measures of a single latent factor.

Fig. 3. Showing the tradeoff between prediction and internal consistency as indexed by ωh. The values are taken from Table 2 and are the correlations of 8 unidimensional scales and 5 multidimensional scales with gender as a function of the general factor saturation ωh of each scale. The composite scales, although not reflecting a single latent variable, are clearly more valid but less internally consistent than are the unidimensional scales.

5. Structure of ability and temperament

5.1. Ability

One of Spearman's great contributions was the recognition of the positive manifold of cognitive ability. That is, measures of cognitive ability are all positively correlated and could be identified by having positive loadings on a general factor (Borg, 2018). This observation should not, however, be taken to imply that there is a general causal factor of ability, for factors are merely one way of representing correlational structure. There are interesting alternative explanations for the positive manifold other than Spearman's g. For as Thomson (1916) pointed out with his independent “bonds” model, rather than one overarching g, tests can correlate because they represent a number of overlapping features. This important idea has been discussed by Bartholomew et al. (2009) and can be simulated by the sim.bonds function in psych. The Thomson bonds model has also been applied to discussions of the factor structure of temperament items (McCrae, 2014).
Yet another way to achieve a positive manifold has been proposed by Kovacs and Conway (2016, 2019) as multiple processes that grow together. A different developmental perspective on the meaning of the positive manifold is the observation that scores on various cognitive measures change at different rates over time (Flynn, 1987). This set of findings calls into question the simple g as primary cause model. The discussion in the last part of that article should be required reading to all who study ability.
Although any positive manifold can be factored to produce lower level (group) factors and a higher level (g) factor, this says nothing about causality. Higher order factors no more imply causality than the positive manifold of size variables implies a common factor of “bigness” (Fig. 4 panel B). As an example of a higher level factor structure in cognitive ability consider the 16 items from the “ICAR sample items” found in the psychTools package. These items are part of a larger project (the ICAR project) to develop open source ability items. Originally developed by Condon and Revelle (2014) and then working with colleagues in the UK

Fig. 4. Hierarchical analysis of 16 ability items from the ICAR (panel A) and 19 size measures from the United States Air Force (panel B). The data sets in the psychTools package are ability and USAF respectively. Measures of internal consistency: ωh = 0.66, 0.53, α = 0.83, 0.90, ωt = 0.86, 0.95 for ability and size respectively.


and Germany, the ICAR project now has 17 item types and a database of
several thousand items (Dworak et al., 2021; Revelle et al., 2020). These
items show the traditional hierarchical structure of ability items (Fig. 4
panel A).
This hierarchical structure is remarkably similar to that of 19 measures of physical size taken from the United States Air Force which also show a higher level factor structure (Fig. 4 panel B). This factor, best summarized as physical size, cannot be said to be a cause of arm length or chest diameter, for size is a formative sum of the component measurements.
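
The logic of Thomson's bonds model discussed above can be sketched in a few lines. This is a population-level illustration under assumed conditions, not the sim.bonds function of the psych package: the six bonds, the four tests and the function name are all hypothetical. Each test is taken to be the sum of a few independent, unit-variance bonds, so the correlation between two tests depends only on how many bonds they share.

```python
from itertools import combinations
from math import sqrt

# Hypothetical design: six independent, unit-variance "bonds" (a-f) and four tests.
# Each test is the sum of three bonds; every pair of tests shares exactly one bond,
# and no bond is sampled by more than two tests, so there is no general bond.
TESTS = {
    "T1": {"a", "b", "c"},
    "T2": {"a", "d", "e"},
    "T3": {"b", "d", "f"},
    "T4": {"c", "e", "f"},
}

def bonds_correlation(bonds_x, bonds_y):
    """Population correlation of two sum scores built from independent unit-variance bonds."""
    shared = len(bonds_x & bonds_y)   # covariance equals the number of shared bonds
    return shared / sqrt(len(bonds_x) * len(bonds_y))

correlations = {
    (x, y): bonds_correlation(TESTS[x], TESTS[y])
    for x, y in combinations(TESTS, 2)
}

for (x, y), r in correlations.items():
    print(f"r({x}, {y}) = {r:.2f}")

# The largest number of tests that any single bond contributes to.
max_tests_per_bond = max(
    sum(bond in bonds for bonds in TESTS.values()) for bond in "abcdef"
)
print("tests per bond (max):", max_tests_per_bond)
```

In this toy design all six correlations equal 1/3, so a one factor model with equal loadings of about 0.58 would reproduce the matrix exactly, even though no single bond contributes to more than two of the four tests. A positive manifold and a well fitting general factor therefore do not, by themselves, identify a general cause.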

5.2. Temperament

Although in the late 1960s some Americans thought personality did not exist, this was not true in Europe where researchers continued to discuss the genetic and physiological basis of personality (Eysenck, 1967; Revelle, 1989).¹ Finally, recognizing that perhaps personality traits did indeed show consistency across situations and over time, debates between alternative structural models focused on three (Eysenck, 1990; Peabody, 1967), five (Costa & McCrae, 1992; Digman, 1990; Goldberg, 1990), seven (Comrey, 2008), and even sixteen (Cattell & Stice, 1957) basic dimensions. After a consensus upon a five factor model became somewhat accepted, the debate continued as to whether one general factor (Musek, 2007; Revelle & Wilt, 2013), or two higher order factors (Digman, 1997) better capture the personality space. The debate continues to this day with some suggesting that the consensual Big Few structure is a useful organizing framework (Bainbridge et al., 2022) while others discuss how this structure is not replicable across cultures, or even within the natural language (Condon, 2023; Cutler & Condon, 2023).
Analogous to the questions of structure in personality is the debate about the structure of psychopathology. Influential work suggesting common factors to personality disorders was based on converting “comorbidities” of diagnostic categories into tetrachoric correlations and then factoring the resultant matrices (Krueger & Markon, 2006a, 2006b; Markon et al., 2005). These findings led to the “HiTOP” model (Forbes et al., 2021) as an attempt to organize all of psychopathology into a single hierarchical model. However, this organization is not without its critics who suggest the analogy of the ‘p’ factor of psychopathology with the ‘g’ factor of ability is incorrect and not helpful (Watts et al., 2023).
Furthermore, that measures of personality and psychopathology can be described as formative rather than reflective indicators (Jonas & Markon, 2016) has major implications for their use. For if they are formative, our latent variables are just descriptive summaries of the items rather than causal (Bollen, 2002; Howell et al., 2007).

6. Prediction

But how much did these debates about personality structure help our understanding of the causes and consequences of personality? Science is about prediction and understanding. The use of latent variables which are factorially pure supposedly helps us understand our variables and further our theories. But how well do these latent variables actually help us predict real criteria? The distinction between prediction and understanding is not new, for it has been raised before (e.g., Möttus et al., 2020; Yarkoni & Westfall, 2017), but it is worth reminding those of us who were seduced by latent variables that there are important alternatives to theory driven approaches.
Prediction of real world phenomena is hard and effect sizes tend to be small (but important). In their extensive review of the power of personality to predict meaningful criteria (life span, occupational attainment, and divorce) Roberts et al. (2007) showed robust, but small effects. They point out, however, that these effects are equivalent in magnitude to the effects of Social Economic Status or cognitive ability. Although it is not clear what specific trait theories predict that prudent and conscientious people tend to live longer and have more stable marital relationships, these results are important. They are, however, more descriptive than theory driven findings. They do show that there is something about the aggregation of items assessing prudent behavior that enhances prediction.
Unfortunately, in reviewing the power of personality to predict real outcomes, Roberts and his colleagues ignored an important part of personality: interests. People spend most of their lives working. Knowing what influences their choice of occupation is not just the Big Few or even the Facets or Nuances of traditional personality instruments (Anni et al., 2023). Impressive as the analysis of 263 occupations in terms of personality profiles (Anni et al., 2023) is, they continue in the unfortunate tradition in academic personality research to ignore interests, perhaps because they are seen as too practical and useful.
Seemingly less known to most academic personality researchers is a substantial literature in counseling as well as industrial-organizational psychology that discusses the power of interests to predict job choice (Armstrong et al., 2004; Donnay & Borgen, 1996; Su et al., 2019). Much of this work is “dustbowl empiricism” inspired by Strong Jr. (1927) who spent a lifetime developing scales that predicted satisfaction with jobs. A fairly common organization of the Strong scales (Donnay et al., 2005) is the Realistic, Investigative, Artistic, Social, Enterprising and Conventional (RIASEC) model of Holland (1996) which suggested that the six personality “types” flourish in appropriate environments. The six types are said to be summarized in a circumplex with the axes of ideas versus data and people versus things. An alternative representation of the axes is that of Hogan (1982) who posited sociability and prudence as the primary axes. Su et al. (2019) point out that “Interests have also been shown to have incremental validity over cognitive ability and personality traits in predicting job performance” (p. 1) and then went beyond the traditional six clusters of the RIASEC to introduce an eight dimensional model (SETPOINT) based upon factor analysis of interest items. Their work is an example of the seductive beauty of latent variables for they go beyond simple empirically derived scales in their attempt at finding a clean CFA structure.

Fig. 5. Predicting 8 criteria from the spi data set. The values shown are the cross validated multiple correlations from five higher order factors, 27 lower level factors, and the bestScales solutions. N derivation = 2000; N cross validation = 2000.

¹ For a history of the “dark ages of personality,” see Revelle et al. (2011, chap. 1).


Table 3
Various estimates of internal structure for 5 “Big Few” and 27 lower level scales from the spi dataset. For a list of the items and scoring keys for these scales, see the help
page for the spi dataset in the psychTools package. Calculations done using the reliability function in the psych package. The first three columns are the traditional
measures of internal consistency, the next three represent three measures of unidimensionality, the next two are results of split half analyses and represent the best and
worst split half reliabilities. The final three columns report the mean and median inter-item correlations and the number of items per scale.
Variable ωh α ωt Uni τ ρp Max split Min split r Median r N items

Agree 0.55 0.87 0.89 0.69 0.80 0.86 0.91 0.66 0.32 0.25 14
Consc 0.58 0.86 0.88 0.75 0.84 0.90 0.91 0.70 0.30 0.27 14
Neuro 0.61 0.90 0.92 0.84 0.90 0.94 0.94 0.75 0.40 0.36 14
Extra 0.66 0.89 0.91 0.82 0.89 0.92 0.94 0.77 0.38 0.34 14
Open 0.47 0.84 0.86 0.68 0.77 0.88 0.89 0.62 0.27 0.22 14
Compassion 0.80 0.88 0.89 0.99 0.99 1.00 0.87 0.82 0.59 0.58 5
Trust 0.80 0.87 0.89 0.99 0.99 1.00 0.87 0.81 0.58 0.58 5
Honesty 0.71 0.81 0.84 0.96 0.97 0.99 0.83 0.70 0.46 0.46 5
Conservatism 0.56 0.78 0.85 0.82 0.90 0.91 0.84 0.61 0.41 0.35 5
Authoritarianism 0.63 0.81 0.86 0.89 0.93 0.95 0.85 0.63 0.46 0.46 5
EasyGoingness 0.45 0.68 0.76 0.90 0.92 0.98 0.73 0.58 0.29 0.29 5
Perfectionism 0.34 0.70 0.74 0.82 0.83 0.99 0.72 0.53 0.31 0.33 5
Order 0.62 0.81 0.85 0.92 0.94 0.99 0.83 0.66 0.46 0.42 5
Industry 0.72 0.84 0.86 0.99 0.99 1.00 0.84 0.76 0.52 0.50 5
Impulsivity 0.72 0.87 0.90 0.98 0.98 1.00 0.87 0.80 0.58 0.58 5
SelfControl 0.49 0.76 0.83 0.90 0.94 0.96 0.80 0.60 0.39 0.36 5
EmotionalStability 0.65 0.85 0.89 0.98 0.98 1.00 0.84 0.76 0.52 0.50 5
Anxiety 0.83 0.90 0.91 0.99 0.99 1.00 0.89 0.83 0.64 0.62 5
Irritability 0.78 0.89 0.91 0.98 0.99 0.99 0.89 0.79 0.61 0.60 5
WellBeing 0.80 0.90 0.92 0.99 0.99 1.00 0.90 0.81 0.63 0.63 5
EmotionalExpressiveness 0.73 0.80 0.83 0.92 0.93 0.99 0.83 0.68 0.45 0.43 5
Sociability 0.66 0.85 0.89 0.97 0.98 0.99 0.85 0.75 0.53 0.50 5
Adaptability 0.62 0.80 0.84 0.92 0.93 0.99 0.82 0.68 0.44 0.42 5
Charisma 0.67 0.82 0.86 0.94 0.96 0.98 0.84 0.72 0.47 0.43 5
Humor 0.68 0.78 0.82 0.91 0.92 0.99 0.81 0.64 0.42 0.40 5
AttentionSeeking 0.80 0.88 0.90 0.92 0.93 0.99 0.89 0.77 0.58 0.67 5
SensationSeeking 0.77 0.86 0.89 0.97 0.98 0.99 0.87 0.77 0.55 0.54 5
Conformity 0.67 0.82 0.87 0.89 0.93 0.96 0.85 0.67 0.47 0.47 5
Introspection 0.56 0.78 0.84 0.92 0.93 0.99 0.81 0.68 0.41 0.41 5
ArtAppreciation 0.68 0.80 0.83 0.89 0.90 0.99 0.81 0.65 0.44 0.46 5
Creativity 0.70 0.85 0.86 0.97 0.97 1.00 0.85 0.77 0.52 0.53 5
Intellect 0.81 0.86 0.87 0.99 0.99 1.00 0.84 0.78 0.54 0.52 5

Table 4
Descriptive statistics for the eight criteria used in the examples from the spi dataset. The trimmed mean represents the mean with the top and bottom 10 % removed.
The Mad is the median absolute difference from the median. For a discussion of the estimates of skewness and kurtosis see the help pages for describe in the psych
package.
Variable Vars n Mean SD Median Trimmed MAD Min Max Range Skew Kurtosis SE
health 1 3536 3.51 0.98 4 3.54 1.48 1 5 4 −0.25 −0.42 0.02
p1edu 2 3051 4.72 2.39 5 4.77 4.45 1 8 7 −0.11 −1.33 0.04
p2edu 3 2896 4.33 2.32 5 4.28 4.45 1 8 7 0.09 −1.33 0.04
education 4 3330 4.10 2.21 3 4.00 1.48 1 8 7 0.41 −1.04 0.04
wellness 5 3311 1.54 0.50 2 1.55 0.00 1 2 1 −0.17 −1.97 0.01
exer 6 3310 3.57 1.60 4 3.60 1.48 1 6 5 −0.35 −1.06 0.03
smoke 7 3348 2.19 2.04 1 1.70 0.00 1 9 8 1.83 2.19 0.04
ER 8 3347 1.16 0.48 1 1.03 0.00 1 4 3 3.42 12.74 0.01

In a practical sense, the question about the utility of theory versus prediction has been answered by the success of companies that develop proprietary instruments to predict employee success. Rather than adopt factorially pure instruments with high construct validity, these companies emphasize scales that discriminate successful from unsuccessful workers. Criteria of interest include absenteeism, theft, malicious behaviors and general dishonesty or lack of integrity (Hogan et al., 1996; Hogan & Sherman, 2020). Predictive validity is shown for truck drivers, service dispatchers, or machine operators. The success of this approach may be seen by the number of companies that use these proprietary instruments. Their instruments are broadly theory relevant, e.g., socioanalytic theory suggests that we should study the interpersonal challenges of getting along, getting ahead and finding meaning in life (Gottlieb et al., 2021; Hogan, 1982; Hogan & Blickle, 2018) and emphasize predictive rather than factorial validity. Combining multiple dimensions is better than any single dimension. Thus Hogan et al. (1994) in their review of personality and leadership effectiveness cite literature that surgency, emotional stability, and conscientiousness predict better leadership performance.
The debate about scale construction procedures between those favoring latent variable models, those favoring theory driven models, and those using criterion oriented scales was addressed by Hase and Goldberg (1967) who reached the conclusion that all of these procedures were about equally effective when predicting a variety of criteria. In a monumental followup which also addressed basic scale construction principles, Goldberg (1972) came to somewhat different conclusions,


Table 5
Standardized β weights for 5 and 27 predictors of 8 criteria. Also shown are the multiple R values for the derivation sample (N = 2000) and cross validation sample (N
= 2000). Although values r > 0.075 have Bonferroni adjusted probabilities of < 0.01, I highlight (in bold) those β > 0.1. Calculations done with the lmCor and
crossValidation functions in the psych package.
Variable p1edu p2edu ER wellness smoke exer education health
(Intercept) 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Agree 0.02 0.01 −0.03 0.03 −0.10 −0.03 0.11 0.02
Consc −0.02 −0.04 0.01 0.11 −0.06 0.15 0.04 0.16
Neuro −0.04 −0.03 0.12 0.02 0.06 −0.15 −0.14 −0.27
Extra 0.05 0.07 0.04 0.11 0.07 0.11 −0.09 0.14
Open 0.09 0.10 −0.03 0.00 0.08 0.05 0.13 0.04
R-derivation 0.13 0.14 0.12 0.17 0.17 0.28 0.24 0.41
R-cross valid 0.06 0.07 0.14 0.16 0.19 0.25 0.28 0.40
(Intercept) 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Compassion 0.04 −0.02 0.05 0.05 0.00 −0.02 0.03 −0.03
Trust 0.03 0.07 −0.06 −0.02 −0.09 0.01 0.04 0.03
Honesty −0.10 −0.06 0.01 −0.04 0.02 0.00 0.10 −0.03
Conservatism 0.02 −0.01 0.04 0.04 0.00 0.02 −0.03 −0.01
Authoritarianism −0.02 −0.04 −0.01 0.05 −0.16 −0.08 −0.09 −0.01
EasyGoingness −0.08 −0.05 −0.04 −0.07 0.05 −0.17 −0.05 −0.10
Perfectionism 0.03 0.05 0.01 0.02 −0.02 0.01 −0.03 0.02
Order −0.06 −0.05 −0.04 −0.02 0.00 0.09 0.04 0.05
Industry −0.06 −0.05 0.00 0.01 0.11 −0.02 0.03 −0.01
Impulsivity −0.05 −0.04 −0.01 0.00 0.04 −0.03 −0.02 −0.05
SelfControl 0.05 0.07 0.03 0.01 −0.18 0.08 −0.10 0.14
EmotionalStability −0.07 −0.04 −0.04 0.00 0.06 −0.06 0.05 −0.08
Anxiety −0.01 0.04 0.06 0.01 0.03 −0.08 −0.11 −0.12
Irritability −0.08 −0.11 −0.01 0.03 0.01 −0.04 0.00 −0.05
WellBeing 0.10 0.05 −0.02 0.05 −0.05 0.09 0.04 0.29
EmotionalExpressiveness −0.02 −0.03 −0.02 0.05 0.07 −0.07 0.13 −0.03
Sociability 0.07 0.06 −0.03 0.00 −0.03 0.10 −0.14 0.05
Adaptability −0.03 −0.05 −0.12 −0.06 −0.04 −0.02 0.09 0.00
Charisma −0.07 −0.07 0.05 0.07 0.15 0.04 −0.03 −0.04
Humor 0.01 0.05 0.04 0.07 −0.06 0.05 −0.14 0.04
AttentionSeeking 0.02 0.09 −0.01 −0.04 −0.01 −0.07 0.10 0.03
SensationSeeking −0.04 −0.02 0.14 0.04 0.01 0.10 −0.18 0.11
Conformity −0.04 −0.04 0.04 0.06 0.04 0.01 0.07 0.02
Introspection −0.01 0.05 −0.06 −0.02 0.03 0.03 0.09 0.08
ArtAppreciation 0.06 0.02 −0.04 0.05 0.02 −0.01 0.04 −0.05
Creativity 0.04 0.00 0.09 −0.01 0.02 −0.01 0.00 −0.06
Intellect 0.06 0.06 −0.08 0.04 −0.02 0.00 0.11 0.01
R-derivation 0.23 0.25 0.24 0.24 0.32 0.37 0.41 0.49
R-cross valid 0.13 0.15 0.17 0.23 0.29 0.33 0.40 0.46


[Figure 6: two Manhattan plots, “Manhattan Plot of health” and “Manhattan Plot of exer”. The vertical axes show the correlations with health and with exer (0.0 to 0.3); the horizontal axes list the 5 higher order scales (Agree, Consc, Neuro, Extra, Open) followed by the 27 lower order scales (Compassion through Intellect).]
Fig. 6. Manhattan plots organize individual item validities by 5 higher order (Agree … Open) and 27 lower order factors. The data are the derivation sample from the spi. N = 2000. The dashed line represents the Bonferroni adjusted level of significance at the p < 0.01 level.

Table 6
20 spi items that best predict exercise. The last two columns identify items that are markers (if they are) of the five higher order factors and then the 27 lower level
factors. The item numbers correspond to those from Condon (2019). The item validities are the means of 10 folds. Estimates of internal consistency: ωh = 0.62, α =
0.88, ωt = 0.90, u = 0.69, rexercise = 0.33.
Variable Mean r Item B5 L27
q_1024 −0.24 Hang around doing nothing. EasyGoingness
q_1052 −0.23 Have a slow pace to my life. EasyGoingness
q_811 −0.21 Feel a sense of worthlessness or hopelessness. Neuro WellBeing
q_1662 0.20 Seek adventure. SensationSeeking
q_1505 −0.20 Panic easily. Neuro Anxiety
q_1371 0.19 Love life. WellBeing
q_808 −0.19 Fear for the worst. Neuro Anxiety
q_1452 −0.19 Neglect my duties. Consc Industry
q_2765 0.18 Am happy with my life. WellBeing
q_4249 −0.18 Would call myself a nervous person. Neuro Anxiety
q_312 −0.18 Avoid company. Extra Sociability
q_1444 −0.18 Need a push to get started. Consc Industry
q_56 0.18 Am able to control my cravings. SelfControl
q_820 0.18 Feel comfortable with myself. WellBeing
q_254 0.17 Am skilled in handling social situations. Extra Charisma
q_578 −0.17 Dislike myself. Neuro WellBeing
q_1254 −0.16 Leave a mess in my room. Consc Order
q_1483 −0.16 Often forget to put things back in their proper place. Consc Order
q_1979 0.16 Work hard. Consc Industry
q_1201 0.16 Keep things tidy. Consc Order


Table 7
20 spi items that best predict health. The last two columns identify items that are markers (if they are) of the five higher order factors and then the 27 lower level
factors. The item validities are the means of 10 folds. Estimates of internal consistency: ωh = 0.64, α = 0.90, ωt = 0.92, u = 0.37, rhealth = 0.43.
Variable Mean r Item B5 L27
q_820 0.38 Feel comfortable with myself. WellBeing
q_578 −0.35 Dislike myself. Neuro WellBeing
q_811 −0.35 Feel a sense of worthlessness or hopelessness. Neuro WellBeing
q_2765 0.35 Am happy with my life. WellBeing
q_1371 0.33 Love life. WellBeing
q_808 −0.28 Fear for the worst. Neuro Anxiety
q_1505 −0.27 Panic easily. Neuro Anxiety
q_4249 −0.27 Would call myself a nervous person. Neuro Anxiety
q_56 0.26 Am able to control my cravings. SelfControl
q_4252 −0.26 Am a worrier. Neuro Anxiety
q_1989 −0.25 Worry about things. Neuro Anxiety
q_1452 −0.25 Neglect my duties. Consc Industry
q_1024 −0.24 Hang around doing nothing. EasyGoingness
q_254 0.23 Am skilled in handling social situations. Extra Charisma
q_39 0.22 Adjust easily. Adaptability
q_312 −0.21 Avoid company. Extra Sociability
q_1444 −0.21 Need a push to get started. Consc Industry
q_979 −0.21 Get overwhelmed by emotions. Neuro EmotionalStability
q_952 −0.21 Get angry easily. Irritability
q_1052 −0.21 Have a slow pace to my life. EasyGoingness

showing how factorially based scales worked better on easy to predict For each of these eight criteria, Fig. 5 shows the cross validated
criteria, but that criterion oriented techniques were better with harder to multiple correlations for scales representing the Big Few, the “little 27”,
predict criteria. Hase and Goldberg (1967); Goldberg (1972) examined as well as scales formed from finding the best cross validated items using
468 unique items taken from the CPI to predict 13 different criteria for a the bestScales function. Although all the β values for the 5 and 27 pre
total sample of just 152 subjects. Being firm believers in the need to cross dictors on the 8 criteria are shown in Table 5, for conciseness, I just
validate their results, the derivation and cross validation samples had discuss self ratings of wellness and reported exercise. The three largest β
just 76 participants. Using much larger samples, my colleagues and I weights suggest that Exercise is done more by people who are high on
have found that empirical item level and lower level factor scales conscientious, emotional stability and more extraverted. These same
dominate high level factor based prediction (Revelle et al., 2021). Here I three factor based scales predict self ratings of health, but with a bigger
elaborate on those findings. effect for emotional stability and an overall larger R. When examining
these relationships in more detail, by looking at the lower level factor/
scales, we see that Exercise is associated with not being easy going, but
6.1. Examples of prediction at the scale level being sociable and a seeking stimulation. Health is also associated with
not being easy going, but is particularly associated with well being, low
At a more micro level, I have already used the example of predicting anxiety, self control and sensation seeking.
gender from various stereotypical gender items (Table 2, Fig. 3) to show
that increasing internal consistency does not necessarily lead to in
creases in validity. In fact, there is a well known (but forgotten) tradeoff 6.2. Prediction at the item level
between the two. I now consider a more complicated example which
uses dimensions that are commonly seen in personality research and In addition to using higher level and lower level factors/scales, it is
examine predicting a set of 8 criteria using three levels of analysis also possible to use the items themselves. A graphical demonstration of
(Fig. 5). how subsets of items from each of these higher level or lower level
For reproducibility of my results, I use data from the spi dataset in factors relate to the criteria is shown as a pair of “Manhattan” plots
the psychTools package and include the relevant R code in Appendix A. (Fig. 6). These two plots show the zero order correlations for each item
The spi dataset was collected as part of the SAPA project discussed in each scale with the criteria. Thus, although Neuroticism correlates
earlier and includes 135 items from Condon (2018). These 135 were − 0.27 with health, we can see that this is due to about seven of the 14
carefully curated from a larger set of 696 items which in turn were taken items in the scale and the high correlation of well being with health
from the more than 2000 items in the International Personality Item Pool (Goldberg et al., 2006). Of these 135 items, 70 may be formed into 5 higher level composites representing the Big Few, while all 135 items can be scored for 27 different lower level item composites. Conventional estimates of internal consistency (ωh, α, ωt) as well as various measures of unidimensional structure (Revelle & Condon, 2023) are shown in Table 3. As expected (Widaman & Revelle, 2023a, 2023b), scale scores found by unit weighting of the keyed items match factor score estimates with all correlations > 0.97 (Table 4).

Because of the well known need to cross validate any empirical finding (Cureton, 1950), all analyses were done on a randomly chosen 50% of the data and then the resulting β weights were applied to the other 50% of the sample. With the sample sizes I am using (derivation N = 2000, cross validation N = 2000), the amount of shrinkage in the cross validation samples was minimal (compare the multiple R values for the derivation and cross validation samples in Table 5).

reflects the high correlations of all of the items in that short scale.

A more detailed pattern for exercise and health is found by looking at the items that are most descriptive. A simple “machine learning” algorithm, implemented in the bestScales function, identifies those items which are most related to a criterion in each of 10 “folds” of the data. K-fold cross validation splits the data into k folds, and treats N*(k-1)/k participants as the derivation sample and N/k as the cross validation sample. Pooled cross validation coefficients are then used to choose the “best” items. We have compared bestScales to more conventional techniques such as LASSO regression and found that it performs about as well (Elleman et al., 2020). The advantage of bestScales is that it is completely transparent and produces a list of the best items for any criterion. Given that SAPA data normally has a high degree of missingness (by design) and that bestScales works on both raw data and covariance matrices, we have found bestScales to be particularly useful.
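The fold-wise item selection described above can be illustrated with a minimal sketch. This is not the psych package's bestScales code: the function names (`pearson`, `kfold_best_items`), the simulated data, and the pooling rule (rank items by how often they are selected across folds, breaking ties by mean derivation correlation) are all assumptions made for illustration. In each fold, N*(k-1)/k cases serve as the derivation sample and the remaining N/k cases as the cross validation sample.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def kfold_best_items(items, criterion, k=10, n_items=2, seed=1):
    """Hypothetical helper: pick the items most related to a criterion via k folds.

    items: list of rows, one row of item scores per participant.
    criterion: list of criterion scores, one per participant.
    Returns (indices of the chosen items, holdout validity of each fold's scale).
    """
    n, n_cols = len(items), len(items[0])
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[f::k] for f in range(k)]      # k roughly equal holdout sets
    times_chosen = [0] * n_cols
    mean_r = [0.0] * n_cols
    holdout_r = []
    for hold in folds:
        hold_set = set(hold)
        train = [i for i in order if i not in hold_set]   # derivation sample
        y_train = [criterion[i] for i in train]
        # item-criterion correlations in the derivation sample
        r = [pearson([items[i][j] for i in train], y_train) for j in range(n_cols)]
        chosen = sorted(range(n_cols), key=lambda j: abs(r[j]), reverse=True)[:n_items]
        # unit-weighted sum score, keyed by the sign of each correlation
        scores = [sum((1 if r[j] >= 0 else -1) * items[i][j] for j in chosen)
                  for i in hold]
        holdout_r.append(pearson(scores, [criterion[i] for i in hold]))
        for j in chosen:
            times_chosen[j] += 1
        for j in range(n_cols):
            mean_r[j] += r[j] / k
    # assumed pooling rule: most frequently selected items, ties by mean |r|
    best = sorted(range(n_cols),
                  key=lambda j: (times_chosen[j], abs(mean_r[j])),
                  reverse=True)[:n_items]
    return best, holdout_r

# Simulated example: only items 0 and 1 are related to the criterion.
rng = random.Random(0)
items = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
criterion = [row[0] + row[1] + 0.5 * rng.gauss(0, 1) for row in items]
best, holdout_r = kfold_best_items(items, criterion, k=10, n_items=2)
print(sorted(best), round(sum(holdout_r) / len(holdout_r), 2))
```

The result is a short list of items plus a cross validated correlation for the unit-weighted sum of those items, which is what makes this kind of procedure transparent: the scale is simply the sum of the listed items.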


Based upon the zero order correlations, we see that Extraverts exercise more (r = 0.13) or that the linear regression of Extraversion + Conscientiousness combines the need for stimulation with the belief that exercise is healthy (R = 0.22). Or we can use lower level constructs that suggest people with a high sense of well being, who are not easygoing and are high in industriousness, exercise more (R = 0.33). Finally, we can find (and cross validate) the items that actually predict exercising (R = 0.33) (Table 6) or health (R = 0.43) (Table 7). All of these are reasonable levels of understanding and prediction. It is important to point out that the multiple regressions done with the little 27 were based upon 135 items (5 items per scale), whereas the bestScales results were based upon just the 20 items most related to each criterion.

7. Discussion and conclusions

The tension between theory and prediction has been with us for many years. Empirically based scale construction using items to predict outcomes is not a new idea (e.g., Hathaway & McKinley, 1943; Stewart et al., 2022; Strong Jr., 1927, 1947) although it seems to have been forgotten by those who prefer constructs and latent variables. The elegance of the arguments for construct validity (Cronbach & Meehl, 1955; Loevinger, 1957) and the sheer pleasure of successfully doing a factor analysis or structural equation model have seduced us from the path towards predicting outcomes.

With the advent of very large data bases and recognizing the need for cross validation, the empirical approach has become popular in other fields. For knowing how to add (find sum scores) is, after all, the basic principle of polygenic risk scores used in Genome Wide Association Studies (GWAS) or in risk scores for medical outcomes. GWAS identifies the single SNPs correlated with outcomes as diverse as height or years of education, which are then summed to produce a single score (the PRS). The effectiveness of PRS is evaluated by correlation with the criterion variable. While the effect of each SNP is trivial (but reliable given the sample sizes used), the combined scores have much larger effects. Thus Lee and his colleagues formed a PRS for years of education that could explain 11% of the variance (Lee et al., 2018) from the composite score of 1271 unrelated SNPs. Not using GWAS, but just combining unrelated predictors, is seen in the Environmental Risk Scores for psychosis (Vassos et al., 2020) or the Environment Wide Association Studies to quantify general health risks of environmental pollutants (Park et al., 2014). All of these studies are using SNPs as items in formative measures of risk. They do not posit a latent variable causing the SNPs.

Although most users of SEM think of the items as reflective indicators of latent variables, the alternative is to recognize that many of our latent variables are just formative sums of independent items. I am not denying the power of aggregation to form better measures, I am just suggesting that our measures need to be recognized for what they are: sums of independent items which do not necessarily, and frequently do not, have anything in common. That is, to think of a scale as more than a simple sum and to reify it as some latent variable is to mislead ourselves. With a finite number of items, factor score estimates are not latent variables, they are merely weighted sum scores. Focusing on measures of internal consistency at the cost of focusing on predictive validity is a mistake.

An alternative to the simple factor model of scale construction was proposed by McCrae (2014) in his distinction between scales as the intersection of items versus the union of items. Reconceptualizing our scales as formed from the union of multiple items that carry unique information makes problems in Differential Item Functioning (DIF) and factorial invariance less challenging than thinking of homogeneous scales all meant to measure one latent construct. Consider the case of sex differences in depression. Items measuring depression (e.g., “In the past week I have felt downhearted or blue” or “In the past week I felt hopeless about the future”) have roughly equal endorsement characteristics for males and females. But the item “In the past week I have cried easily or felt like crying” has a much higher threshold for men than for women (Schaeffer, 1988; Steinberg & Thissen, 2006), indicating a much higher level of depression for men who endorse the item. Similarly, lack of factorial invariance across cultures is not a reason to reject a scale, but is a reason to more carefully investigate the pattern of item differences across these cultures. Discussions of DIF in terms of relative versus absolute measurement help clarify the need to examine the meaning of items before leaping to conclusions about factor invariance at the scale level (Borsboom et al., 2002).

7.1. Conclusions

In the preceding pages I have taken the somewhat radical position that our emphasis upon latent variables and construct validity as an attempt to understand the structure of personality has been done at the cost of showing that personality is actually useful. Although it is much easier (and more enjoyable) to talk about theories of Extraversion and Neuroticism (Eysenck, 1967) or Impulsivity and Anxiety (Gray, 1981, 1987), to use these higher level dimensions in predicting real outcomes is difficult. For to predict specific outcomes it is better to resort to short, non-homogeneous tests made up of the specific items that actually work. Such scales are formative measures that do not reflect some underlying latent cause, but are merely the observed sums of observed variables. We should stop believing in the Easter Bunny.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

I would like to thank David Condon, David Funder, Kayla Garner, Lew Goldberg, Robert Hogan, René Mõttus, and Daniel Ozer for their comments and suggestions.

Appendix A. R code for analyses


References

Anni, K., Vainik, U., & Mõttus, R. (2023). Personality profiles of 263 occupations. psyarxiv/ajvg2. https://osf.io/preprints/psyarxiv/ajvg2.
Armstrong, P. I., Smith, T. J., Donnay, D. A., & Rounds, J. (2004). The Strong ring: A basic interest model of occupational structure. Journal of Counseling Psychology, 51, 299–313. https://doi.org/10.1037/0022-0167.51.3.299
Athenstaedt, U. (2003). On the content and structure of the gender role self-concept: Including gender-stereotypical behaviors in addition to traits. Psychology of Women Quarterly, 27, 309–318. https://doi.org/10.1111/1471-6402.00111
Bainbridge, T. F., Ludeke, S. G., & Smillie, L. D. (2022). Evaluating the big five as an organizing framework for commonly used psychological trait scales. Journal of Personality and Social Psychology, 122, 749–777. https://doi.org/10.1037/pspp0000395
Bartholomew, D., Deary, I., & Lawn, M. (2009). A new lease of life for Thomson’s bonds model of intelligence. Psychological Review, 116, 567–579. https://doi.org/10.1037/a0016262
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Bollen, K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53, 605–634. https://doi.org/10.1146/annurev.psych.53.100901.135239
Borg, I. (2018). A note on the positive manifold hypothesis. Personality and Individual Differences, 134, 13–15. https://doi.org/10.1016/j.paid.2018.05.041
Borsboom, D. (2006). The attack of the psychometricians. Psychometrika, 71, 425–440. https://doi.org/10.1007/s11336-006-1447-6
Borsboom, D., & Mellenbergh, G. J. (2004). Why psychometrics is not pathological. Theory & Psychology, 14, 105–120. https://doi.org/10.1177/0959354304040200
Borsboom, D., Mellenbergh, G. J., & Heerden, J. V. (2002). Different kinds of DIF: A distinction between absolute and relative forms of measurement invariance and bias. Applied Psychological Measurement, 26, 433–450. https://doi.org/10.1177/014662102237798
Bravais, A. (1844). Analyse mathématique sur les probabilités des erreurs de situation d’un point. Memoires Presentees al’Academie Royale des Sciences de L’Institut de France.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105. https://doi.org/10.1037/h0046016
Carroll, J. B. (1952). Ratings on traits measured by a factored personality inventory. The Journal of Abnormal and Social Psychology, 47, 626.
Cattell, R. B., & Stice, G. (1957). Handbook for the Sixteen Personality Factor Questionnaire. Champaign, Ill: Institute for Ability and Personality Testing.
Comrey, A. L. (2008). The Comrey Personality Scales. In G. J. Boyle, G. Matthews, & D. H. Saklofske (Eds.), Vol. II. Sage handbook of personality theory and testing: Personality measurement and assessment (pp. 113–134). London: Sage.
Condon, D. M. (2018). The SAPA Personality Inventory: An empirically-derived, hierarchically-organized self-report personality assessment model. PsyArXiv. https://doi.org/10.31234/osf.io/sc4p9
Condon, D. M. (2019). Database of Individual Differences Survey Tools. Harvard Dataverse. https://doi.org/10.7910/DVN/T1NQ4V
Condon, D. M. (2022). RetestReliability = f(Stability,Memory,Personality) + ε. Presented at symposium in honor of Sarah Dubrow.
Condon, D. M. (2023). In osf.o/da59z (Ed.), Big five replicability. ARP.
Condon, D. M., & Revelle, W. (2014). The International Cognitive Ability Resource: Development and initial validation of a public-domain measure. Intelligence, 43, 52–64. https://doi.org/10.1016/j.intell.2014.01.004
Costa, P. T., & McCrae, R. R. (1992). Four ways five factors are basic. Personality and Individual Differences, 13, 653–665. https://doi.org/10.1016/0191-8869(92)90236-I
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334. https://doi.org/10.1007/BF02310555
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302. https://doi.org/10.1037/h0040957
Cudeck, R., & MacCallum, R. C. (2007). Factor analysis at 100: Historical developments and future directions. Mahwah, N.J: Lawrence Erlbaum Associates.
Cureton, E. E. (1950). Validity, reliability, and baloney. Educational and Psychological Measurement, 10, 94–96. https://doi.org/10.1177/001316445001000107
Cutler, A., & Condon, D. M. (2023). Deep lexical hypothesis: Identifying personality structure in natural language. Journal of Personality and Social Psychology, 125, 173–197. https://doi.org/10.1037/pspp0000443
Dawis, R. V. (1992). The individual differences tradition in counseling psychology. Journal of Counseling Psychology, 39, 7–19. https://doi.org/10.1037/0022-0167.39.1.7
Deary, I. J. (2001). Intelligence: A very short introduction. OUP Oxford.
Deary, I. J. (2009). Introduction to the special issue on cognitive epidemiology. Intelligence, 37, 517–519. https://doi.org/10.1016/j.intell.2009.05.001
Digman, J. M. (1990). Personality structure: Emergence of the five-factor model. Annual Review of Psychology, 41, 417–440. https://doi.org/10.1146/annurev.ps.41.020190.002221
Digman, J. M. (1997). Higher-order factors of the big five. Journal of Personality and Social Psychology, 73, 1246–1256. https://doi.org/10.1037/0022-3514.73.6.1246
Donnay, D., Morris, M., Schaubhut, N., & Thompson, R. (2005). Strong interest inventory manual (rev. ed.). Palo Alto: Consulting Psychologists Press, Inc.
Donnay, D. A. (1997). E.K. Strong’s legacy and beyond: 70 years of the Strong interest inventory. Career Development Quarterly, 46, 2–22. https://doi.org/10.1002/j.2161-0045.1997.tb00688.x
Donnay, D. A., & Borgen, F. H. (1996). Validity, structure, and content of the 1994 strong interest inventory. Journal of Counseling Psychology, 43, 275–291 (doi: 0022-0167/96).
Dworak, E. M., Revelle, W., Doebler, P., & Condon, D. M. (2021). Using the International Cognitive Ability Resource as an open source tool to explore individual differences in cognitive ability. Personality and Individual Differences, 169. https://doi.org/10.1016/j.paid.2020.109906
Eagly, A. H., & Revelle, W. (2022). Understanding the magnitude of psychological differences between women and men requires seeing the forest and the trees. Perspectives on Psychological Science, 17, 1339–1358. https://doi.org/10.1177/17456916211046006
Elleman, L. G., McDougald, S., Revelle, W., & Condon, D. (2020). That takes the BISCUIT: A comparative study of predictive accuracy and parsimony of four statistical learning techniques in personality data, with data missingness conditions. European Journal of Psychological Assessment, 36, 948–958. https://doi.org/10.1027/1015-5759/a000590
Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8, 341–349. https://doi.org/10.1037/1040-3590.8.4.341
Embretson, S. E., & Hershberger, S. L. (1999). The new rules of measurement: What every psychologist and educator should know. Mahwah, N.J: L. Erlbaum Associates.
Eysenck, H. J. (1944). Types of personality: A factorial study of seven hundred neurotics. The British Journal of Psychiatry, 90, 851–861. https://doi.org/10.1192/bjp.90.381.851
Eysenck, H. J. (1952). The scientific study of personality. London: Routledge & K. Paul.
Eysenck, H. J. (1953). Uses and abuses of psychology. London, Baltimore: Penguin Books.
Eysenck, H. J. (1964). Sense and nonsense in psychology. Baltimore: Penguin Books.
Eysenck, H. J. (1965). Fact and fiction in psychology. Baltimore: Penguin Books.
Eysenck, H. J. (1967). The biological basis of personality. Springfield: Thomas.
Eysenck, H. J. (1990). Biological dimensions of personality. In L. A. Pervin (Ed.), Handbook of personality: Theory and research (pp. 244–276). New York, NY: Guilford Press.
Eysenck, H. J., & Eysenck, M. W. (1985). Personality and individual differences: A natural science approach. New York: Plenum.
Eysenck, H. J., & Eysenck, S. B. G. (1964). Eysenck Personality Inventory. San Diego, California: Educational and Industrial Testing Service.
Eysenck, H. J., & Himmelweit, H. T. (1947). Dimensions of personality; a record of research carried out in collaboration with H.T. Himmelweit [and others]. London: Routledge & Kegan Paul.
Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932 to 1978. Psychological Bulletin, 95, 29–51. https://doi.org/10.1037/0033-2909.95.1.29
Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101, 171–191. https://doi.org/10.1037/0033-2909.101.2.171
Forbes, M. K., Sunderland, M., Rapee, R. M., Batterham, P. J., Calear, A. L., Carragher, N., Ruggero, C., Zimmerman, M., Baillie, A. J., Lynch, S. J., Mewton, L., Slade, T., & Krueger, R. F. (2021). A detailed hierarchical model of psychopathology: From individual symptoms up to the general factor of psychopathology. Clinical Psychological Science, 9, 139–168. https://doi.org/10.1177/2167702620954799
Galton, F. (1888). Co-relations and their measurement. Proceedings of the Royal Society. London Series, 45, 135–145.
Garner, K. M. (2024). The forgotten trade-off between internal consistency and validity (abstract). In Multivariate behavioral research (in press).
Goldberg, L. R. (1972). Parameters of personality inventory construction and utilization: A comparison of prediction strategies and tactics. In Multivariate behavioral research monographs. no 72-2 7.
Goldberg, L. R. (1990). An alternative “description of personality”: The big-five factor structure. Journal of Personality and Social Psychology, 59, 1216–1229. https://doi.org/10.1037/0022-3514.59.6.1216
Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84–96. https://doi.org/10.1016/j.jrp.2005.08.007
Gottlieb, T., Furnham, A., & Klewe, J. B. (2021). Personality in the light of identity, reputation and role taking: A review of socioanalytic theory. Psychology, 12, 2020–2041. https://doi.org/10.4236/psych.2021.1212123
Gough, H. G. (1957). Manual for the California Psychological Inventory.
Gough, H. G. (1960). The adjective check list as a personality assessment research technique. Psychological Reports, 6, 107–122. https://doi.org/10.2466/pr0.1960.6.1.107
Gough, H. G. (1965). Conceptual analysis of psychological test scores and other diagnostic variables. Journal of Abnormal Psychology, 70, 294–302. https://doi.org/10.1037/h0022397
Gray, J. A. (1981). A critique of Eysenck’s theory of personality. In H. J. Eysenck (Ed.), A model for personality (pp. 246–277). Berlin: Springer.
Gray, J. A. (1987). Perspectives on anxiety and impulsivity: A commentary. Journal of Research in Personality, 21, 493–509. https://doi.org/10.1016/0092-6566(87)90036-5
Gruber, F. M., Distlberger, E., Scherndl, T., Ortner, T. M., & Pletzer, B. (2020). Psychometric properties of the multifaceted gender-related attributes survey (GERAS). European Journal of Psychological Assessment, 36, 612–623. https://doi.org/10.1027/1015-5759/a000528
Guilford, J. P. (1940). Inventory of factors STDCR. Beverly Hills, Calif: Sheridan Supply Co.
Guilford, J. P. (1954). Psychometric methods (2nd ed.). New York: McGraw-Hill.
Guilford, J. P. (1956). The structure of intellect. Psychological Bulletin, 53, 267–293. https://doi.org/10.1037/h0040755
Guilford, J. P. (1975). Factors and factors of personality. Psychological Bulletin, 82, 802–814. https://doi.org/10.1037/h0077101
Gulliksen, H. (1950). Theory of mental tests. John Wiley & Sons, Inc.


Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10, 255–282. https://doi.org/10.1007/BF02288892
Hase, H. D., & Goldberg, L. R. (1967). Comparative validity of different strategies of constructing personality inventory scales. Psychological Bulletin, 67, 231–248. https://doi.org/10.1037/h0024421
Hathaway, S., & McKinley, J. (1943). Manual for administering and scoring the MMPI.
Henry, S., Thielmann, I., Booth, T., & Mõttus, R. (2022). Test-retest reliability of the HEXACO-100 - And the value of multiple measurements for assessing reliability. PLoS One, 17, 1–14. https://doi.org/10.1371/journal.pone.0262465
Herrnstein, R. J., & Murray, C. (2010). The Bell Curve: Intelligence and class structure in American life. Simon and Schuster.
Hogan, R. (1982). A socioanalytic theory of personality. In Nebraska Symposium on Motivation (pp. 55–89). University of Nebraska Press.
Hogan, R. (2009). John Holland. URL: https://www.hoganassessments.com/blog/john-holland/.
Hogan, R., & Blickle, G. (2018). Socioanalytic theory: Basic concepts, supporting evidences, and practical implications. In V. Zeigler-Hill & T. K. Shackelford (Eds.), The SAGE handbook of personality and individual differences: The science of personality and individual differences. Sage reference Vol. 1, pp. 110–129. https://doi.org/10.4135/9781526451163.n5
Hogan, R., Curphy, G. J., & Hogan, J. (1994). What we know about leadership: Effectiveness and personality. American Psychologist, 49, 493–504. https://doi.org/10.1037/0003-066X.49.6.493
Hogan, R., & Hogan, J. (1995). The Hogan personality inventory manual (2nd ed.). Tulsa, OK: Hogan Assessment Systems.
Hogan, R., Hogan, J., & Roberts, B. W. (1996). Personality measurement and employment decisions: Questions and answers. American Psychologist, 51, 469–477. https://doi.org/10.1037/0003-066X.51.5.469
Hogan, R., & Sherman, R. A. (2020). Personality theory and the nature of human nature. Personality and Individual Differences, 152, Article 109561. https://doi.org/10.1016/j.paid.2019.109561
Holland, J. L. (1959). A theory of vocational choice. Journal of Counseling Psychology, 6, 35–45. https://doi.org/10.1037/h0040767
Holland, J. L. (1996). Exploring careers with a typology: What we have learned and some new directions. American Psychologist, 51, 397–406. https://doi.org/10.1037/0003-066X.51.4.397
Howell, R. D., Breivik, E., & Wilcox, J. B. (2007). Reconsidering formative measurement. Psychological Methods, 12, 205–218. https://doi.org/10.1037/1082-989X.12.2.205
Jensen, A. R. (1969). How much can we boost IQ and scholastic achievement. Harvard Educational Review, 39, 1–123. https://doi.org/10.17763/haer.39.1.l3u15956627424k7
Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.
Jensen, A. R., & Weng, L. J. (1994). What is a good g? Intelligence, 18, 231–258. https://doi.org/10.1016/0160-2896(94)90029-9
Johnson, W., Brett, C. E., & Deary, I. J. (2010). The pivotal role of education in the association between ability and social class attainment: A look across three generations. Intelligence, 38, 55–65. https://doi.org/10.1016/j.intell.2009.11.008
Jonas, K. G., & Markon, K. E. (2016). A descriptivist approach to trait conceptualization and inference. Psychological Review, 123, 90–96. https://doi.org/10.1037/a0039542
Jöreskog, K. G. (1978). Structural analysis of covariance and correlation matrices. Psychometrika, 43, 443–477. https://doi.org/10.1007/BF02293808
Kovacs, K., & Conway, A. R. (2019). A unified cognitive/differential approach to human intelligence: Implications for IQ testing. Journal of Applied Research in Memory and Cognition, 8, 255–272. https://doi.org/10.1016/j.jarmac.2019.05.003
Kovacs, K., & Conway, A. R. A. (2016). Process overlap theory: A unified account of the general factor of intelligence. Psychological Inquiry, 27, 151–177. https://doi.org/10.1080/1047840X.2016.1153946
Krueger, R. F., & Markon, K. E. (2006a). Reinterpreting comorbidity: A model-based approach to understanding and classifying psychopathology. Annual Review of Clinical Psychology, 2, 111–133. https://doi.org/10.1146/annurev.clinpsy.2.022305.095213
Krueger, R. F., & Markon, K. E. (2006b). Understanding psychopathology: Melding behavior genetics, personality, and quantitative psychology to develop an empirically based model. Current Directions in Psychological Science, 15, 113–117. https://doi.org/10.1111/j.0963-7214.2006.0041
Kuder, G., & Richardson, M. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151–160. https://doi.org/10.1007/BF02288391
Lee, J. J., Wedow, R., Okbay, A., et al. (2018). Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nature Genetics, 50, 1112–1121. https://doi.org/10.1038/s41588-018-0147-3
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports Monograph Supplement, 9(3), 635–694. https://doi.org/10.2466/pr0.1957.3.3.635
Markon, K. E., Krueger, R. F., & Watson, D. (2005). Delineating the structure of normal and abnormal personality: An integrative hierarchical approach. Journal of Personality and Social Psychology, 88, 139–157. https://doi.org/10.1037/0022-3514.88.1.139
Marschak, J. (1954). Probability in the social sciences. In P. Lazarsfeld (Ed.), Mathematical thinking in the social sciences (pp. 166–215). Free Press.
McCrae, R. R. (2014). A more nuanced view of reliability: Specificity in the trait hierarchy. Personality and Social Psychology Review, 19, 97–112. https://doi.org/10.1177/1088868314541857
McCrae, R. R., Kurtz, J. E., Yamagata, S., & Terracciano, A. (2011). Internal consistency, retest reliability, and their implications for personality scale validity. Personality and Social Psychology Review, 15, 28–50. https://doi.org/10.1177/1088868310366253
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, N.J: L. Erlbaum Associates.
Mõttus, R., Wood, D., Condon, D. M., Back, M. D., Baumert, A., Costantini, G., Epskamp, S., Greiff, S., Johnson, W., Lukaszewski, A., Murray, A., Revelle, W., Wright, A. G., Yarkoni, T., Ziegler, M., & Zimmermann, J. (2020). Descriptive, predictive and explanatory personality research: Different goals, different approaches, but a shared need to move beyond the big few traits. European Journal of Personality, 34, 1175–1201. https://doi.org/10.1002/per.2311
Musek, J. (2007). A general factor of personality: Evidence for the big one in the five-factor model. Journal of Research in Personality, 41, 1213–1233. https://doi.org/10.1016/j.jrp.2007.02.003
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Park, S. K., Tao, Y., Meeker, J. D., Harlow, S. D., & Mukherjee, B. (2014). Environmental risk score as a new tool to examine multi-pollutants in epidemiologic research: An example from the NHANES study using serum lipid levels. PLoS One, 9, Article e98632. https://doi.org/10.1371/journal.pone.0098632
Peabody, D. (1967). Trait inferences: Evaluative and descriptive aspects. Journal of Personality and Social Psychology, 7, 1–18. https://doi.org/10.1037/h0025230
Pearson, K. (1896). Mathematical contributions to the theory of evolution. III. Regression, heredity, and panmixia. Philosophical Transactions of the Royal Society of London. Series A, 187, 254–318. https://doi.org/10.1098/rsta.1896.0007
Plato (n.d.). Plato The Republic: The complete and unabridged Benjamin Jowett translation (1892) (3rd ed.). Oxford: Oxford University Press.
R Core Team. (2023). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. URL: https://www.R-project.org/.
Reise, S. (1999). Personality measurement issues viewed through the eyes of IRT. In S. E. Embretson, & S. L. Hershberger (Eds.), The new rules of measurement: What every psychologist and educator should know (pp. 219–241). Mahwah, N.J: Lawrence Erlbaum Associates.
Revelle, W. (1983). Factors are fictions, and other comments on individuality theory. Journal of Personality, 51, 707–714. https://doi.org/10.1111/1467-6494.ep7380795
Revelle, W. (1989). Personality theory is alive and well and living in Europe. Contemporary Psychology: APA Review of Books, 34, 235–236. https://doi.org/10.1037/027760
Revelle, W. (2023a). psych: Procedures for psychological, psychometric, and personality research (2.3.9 ed.). Evanston: Northwestern University. https://CRAN.r-project.org/package=psych (R package version 2.3.9).
Revelle, W. (2023b). psychTools tools to accompany the psych package for psychological research. Evanston: Northwestern University (psychTools. R package version 2.3.9).
Revelle, W., & Condon, D. (2023). Using unidim rather than omega in estimating unidimensionality (submitted).
Revelle, W., Dworak, E. M., & Condon, D. M. (2020). Cognitive ability in everyday life: The utility of open source measures. Current Directions in Psychological Science, 29, 358–363. https://doi.org/10.1177/0963721420922178
Revelle, W., Dworak, E. M., & Condon, D. M. (2021). Exploring the persome: The power of the item in understanding personality structure. Personality and Individual Differences, 169. https://doi.org/10.1016/j.paid.2020.109905
Revelle, W., & Elleman, L. G. (2016). Factors are still fictions [peer commentary on “towards more rigorous personality trait–outcome research,” by R. Mõttus]. European Journal of Personality, 30, 324–325.
Revelle, W., & Garner, K. M. (2023). Measurement: Reliability, construct validation, and scale construction. In T. Harry, T. W. Reis, & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (in press).
Revelle, W., & Wilt, J. (2013). The general factor of personality: A general critique. Journal of Research in Personality, 47, 493–504. https://doi.org/10.1016/j.jrp.2013.04.012
Revelle, W., Wilt, J., & Condon, D. (2011). Individual differences and differential psychology: A brief history and prospect. In T. Chamorro-Premuzic, A. Furnham, & S. von Stumm (Eds.), Handbook of individual differences (pp. 3–38). Oxford: Wiley-Blackwell.
Roberts, B. W., Kuncel, N. R., Shiner, R., Caspi, A., & Goldberg, L. R. (2007). The power of personality: The comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspectives on Psychological Science, 2, 313–345. https://doi.org/10.1111/j.1745-6916.2007.000
Royce, J. R. (1983). Personality integration: A synthesis of the parts and wholes of individuality theory. Journal of Personality, 51, 683–706. https://doi.org/10.1111/j.1467-6494.1983.tb00874.x
Schaeffer, N. C. (1988). An application of item response theory to the measurement of depression. Sociological Methodology, 18, 271–307. URL: http://www.jstor.org/stable/271051.
Sijtsma, K. (2009a). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74, 107–120. https://doi.org/10.1007/s11336-008-9101-0
Sijtsma, K. (2009b). Reliability beyond theory and into practice. Psychometrika, 74, 169–173. https://doi.org/10.1007/s11336-008-9103-y
Slaney, K. (2017). Historical precursors and early testing theory. London: Palgrave Macmillan UK. https://doi.org/10.1057/978-1-137-38523-9_2
Spearman, C. (1904a). “General Intelligence,” objectively determined and measured. American Journal of Psychology, 15, 201–292. https://doi.org/10.2307/1412107
Spearman, C. (1904b). The proof and measurement of association between two things. The American Journal of Psychology, 15, 72–101. https://doi.org/10.2307/1412159
Steinberg, L., & Thissen, D. (2006). Using effect sizes for research reporting: Examples using item response theory to analyze differential item functioning. Psychological Methods, 11, 402–415. https://doi.org/10.1037/1082-989X.11.4.402


Stewart, R. D., Mõttus, R., Seeboth, A., Soto, C. J., & Johnson, W. (2022). The finer details? The predictability of life outcomes from big five domains, facets, and nuances. Journal of Personality, 90, 167–182. https://doi.org/10.1111/jopy.12660
Strong, E. K., Jr. (1927). Vocational interest test. Educational Record, 8, 107–121.
Strong, E. K., Jr. (1947). Vocational interests of men and women. Stanford University Press.
Su, R., Tay, L., Liao, H. Y., Zhang, Q., & Rounds, J. (2019). Toward a dimensional model of vocational interests. Journal of Applied Psychology, 104, 690. https://doi.org/10.1037/apl0000373
Thomson, G. H. (1916). A hierarchy without a general factor. British Journal of Psychology, 8, 271–281.
Thomson, G. H. (1935). The definition and measurement of “g” (general intelligence). Journal of Educational Psychology, 26, 241–262. https://doi.org/10.1037/h0059873
Thurstone, L. L. (1934). The vectors of mind. Psychological Review, 41, 1. https://doi.org/10.1037/h0075959
Thurstone, L. L. (1935). The vectors of mind: Multiple-factor analysis for the isolation of primary traits. Chicago: Univ. of Chicago Press.
Vassos, E., Sham, P., Kempton, M., Trotta, A., Stilo, S. A., Gayer-Anderson, C., … Morgan, C. (2020). The Maudsley environmental risk score for psychosis. Psychological Medicine, 50, 2213–2220.
Watts, A. L., Greene, A. L., Bonifay, W., & Fried, E. I. (2023). A critical evaluation of the p-factor literature. PsyArXiv 7yrnp/. https://doi.org/10.31234/osf.io/7yrnp
Webb, E. (1915). Character and intelligence: An attempt at an exact study of character. The British Journal of Psychology, Monograph Supplements I.
Widaman, K. F., & Revelle, W. (2023a). Thinking thrice about sum scores, and then some more about measurement and analysis. Behavior Research Methods, 55, 788–806. https://doi.org/10.3758/s13428-022-01849-w
Widaman, K. F., & Revelle, W. (2023b). Thinking about sum scores yet again, maybe the last time, we don’t know, oh no…: A comment on McNeish (2023). Educational and Psychological Measurement, 0, Article 00131644231205310. https://doi.org/10.1177/00131644231205310
Wiley, D. E. (1973). The identification problem for structural equation models with unmeasured variables. In A. S. Goldberger & O. D. Duncan (Eds.), Structural equation models in the social sciences (pp. 69–83). New York: Seminar Press.
Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12, 1100–1122. https://doi.org/10.1177/1745691617693393
Zola, A., Condon, D. M., & Revelle, W. (2021). The convergence of self and informant reports in a large online sample. Collabra: Psychology, 7. https://doi.org/10.1525/collabra.25983

