the metaphysics of latent variables in psychology

(Washington, DC) The search for “latent variables” is so common in psychology that I would almost call it definitive of the discipline today. Other disciplines also study people’s thoughts and actions, but the distinctive contribution of psychology seems to be the use of variables that are not directly observed but rather inferred from data. Latent variables have been “so useful … that they pervade … psychology and the social sciences” (Bollen, 2002, p. 606).

But what are they? This is a metaphysical question, in the sense that contemporary, professional, Anglophone philosophers use the word “metaphysics.” It doesn’t mean that latent variables are spooky or illusory, but rather that it’s worth trying to figure out what kinds of things they are and how they relate to other sorts of things, such as beliefs, observations, numbers, mental states, processes, physical brains, etc. (Cf. “why social scientists should pay attention to metaphysics.”)

It turns out (Mulaik, 1987) that the main tools of psychometrics were invented by early-20th-century thinkers who were explicitly interested in philosophical issues. For instance, Karl Pearson, who invented P-values, chi-square tests, and Principal Components Analysis (and who first used a histogram), wrote a book about philosophy of science before he developed these tools in order to implement his philosophy. He sounds like an awful man, an active proponent of racism, but that doesn’t invalidate his contributions to statistics. The origin of these tools in his philosophical thought does, however, reinforce the point that latent factors need a philosophical explanation.

In very general terms, a latent variable is a number derived from several direct observations (the manifest variables) and used to say something meaningful about the subject. A history test provides a simple example. The student’s answers to each question are manifest variables. The student’s grade is derived from them, usually by just calculating the percentage of correct answers, and it is supposed to measure “knowledge of history,” which is latent. Only if the test is designed according to the best statistical principles is the overall grade indeed a valid measure of knowledge.
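
A minimal sketch of that arithmetic, with a made-up answer sheet and a function name chosen only for illustration. It assumes every question counts equally, which is itself a design decision rather than a statistical fact:

```python
# Hypothetical answer sheet: True means the student answered that question
# correctly. These ten values are the manifest variables.
answers = [True, True, False, True, False, True, True, True, False, True]


def percent_correct(item_results):
    """Return the share of correct answers as a percentage (0-100)."""
    if not item_results:
        raise ValueError("need at least one answered question")
    return 100.0 * sum(item_results) / len(item_results)


# The single number that is supposed to stand for "knowledge of history."
grade = percent_correct(answers)
print(grade)  # 70.0
```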

The same example can be used to illustrate a more sophisticated tool, factor analysis. Suppose that any student’s chance of correctly answering a given question on the history test can be predicted fairly well by a function of several measured variables (the student’s family income, the teacher’s background, the amount of time studying history, etc.) plus X, plus Y. X and Y correlate with the answers, but X and Y are not correlated with each other, and they remain constant for each student.

That much might be a mathematical result: a function that roughly matches the actual data. The question then arises: what do X and Y mean? Suppose that X has a very strong correlation with students’ performance on questions that involve difficult reading assignments, such as original source material. And Y has a very strong correlation with students’ performance on questions that involve concrete factual information, such as the dates of the Civil War. Assuming that X and Y are not correlated, we can conclude that history test scores involve two “factors”: reading ability and memorization of concrete factual information. That interpretation would likely be presented as a meaningful finding, with implications for how educators should teach history.*
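
To make the setup concrete, here is a rough simulation of the kind of data described above. Everything in it is an assumption for illustration: the logistic form, the coefficients, the single measured variable (hours of study), and the two invented questions. Unlike a real study, the simulation knows X and Y in advance, so it shows only the pattern of correlations that the interpretation rests on, not factor analysis itself.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def p_correct(study_hours, x, y, load_x, load_y):
    """Chance of a correct answer: a logistic function of one measured
    variable plus the two latent terms. All coefficients are invented."""
    logit = -0.5 + 0.1 * study_hours + load_x * x + load_y * y
    return 1.0 / (1.0 + math.exp(-logit))


n_students = 2000
X = [random.gauss(0, 1) for _ in range(n_students)]  # latent, fixed per student
Y = [random.gauss(0, 1) for _ in range(n_students)]  # latent, drawn independently of X
hours = [random.uniform(0, 10) for _ in range(n_students)]

# One source-reading question (depends on X only) and one dates question
# (depends on Y only). Each answer is 1 if correct, 0 otherwise.
source_q = [
    int(random.random() < p_correct(h, x, y, 2.0, 0.0))
    for h, x, y in zip(hours, X, Y)
]
dates_q = [
    int(random.random() < p_correct(h, x, y, 0.0, 2.0))
    for h, x, y in zip(hours, X, Y)
]

print("source question vs X:", round(pearson(source_q, X), 2))
print("source question vs Y:", round(pearson(source_q, Y), 2))
print("dates question vs X: ", round(pearson(dates_q, X), 2))
print("dates question vs Y: ", round(pearson(dates_q, Y), 2))
print("X vs Y:              ", round(pearson(X, Y), 2))
```

In a run like this, the source-reading question tracks X and the dates question tracks Y, while X and Y barely correlate with each other. Calling X “reading ability” and Y “memorization” is still a naming decision that the numbers do not make for us.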

I don’t disagree. I am involved in this kind of research myself (albeit usually contributing less than my fair share of the math). But what kind of a thing is “reading ability” or “memorization of concrete factual information” in this example?

They are not exactly causes of the students’ actual answers to questions, for four reasons.

First, it is often (always?) possible to describe any given set of data with multiple functions.
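
In factor analysis this point has a well-known form: rotating a set of factors gives different loadings that account for the data equally well. The sketch below uses invented loadings for four questions on two uncorrelated factors and checks that a 30-degree rotation leaves the implied covariances among the questions unchanged. The numbers and function names are hypothetical.

```python
import math

# Hypothetical loadings of four test questions on two uncorrelated factors.
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
]


def rotate(matrix, degrees):
    """Rotate each row of a two-column loading matrix by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[a * c - b * s, a * s + b * c] for a, b in matrix]


def implied_common_covariance(matrix):
    """Loadings times their transpose: the covariances among the questions
    that the factors account for."""
    return [
        [sum(p * q for p, q in zip(row_i, row_j)) for row_j in matrix]
        for row_i in matrix
    ]


original = implied_common_covariance(loadings)
rotated_loadings = rotate(loadings, 30)
rotated = implied_common_covariance(rotated_loadings)

# The loadings changed, but the covariances they imply did not.
same_fit = all(
    math.isclose(o, r, abs_tol=1e-12)
    for row_o, row_r in zip(original, rotated)
    for o, r in zip(row_o, row_r)
)
print(rotated_loadings[0])  # no longer [0.8, 0.1]
print(same_fit)  # True
```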

Second, even if a mathematical function describes a given set of data well (such as the kids’ specific answers to Mr. Brown’s AP history test), it doesn’t follow that the same factors would also describe another set of data. The next 10 kids who took Mr. Brown’s test might not fit the function at all. This is an example of the general problem of induction.

Third, we can often switch the direction of the explanatory arrow. Instead of using the student’s latent ability in reading to explain or predict her answers to specific test questions, we could use her answers to those questions to explain or predict her reading ability. If you can switch the direction of an explanation, it doesn’t seem like a causal thesis.

Finally, we don’t usually describe a “cause” as something that is derived mathematically from the effects. A student’s family income might be postulated as a cause of her test scores–although it would require an experiment to assess this hypothesis–but a variable that is derived from the test data itself doesn’t seem to be a cause of it. Mulaik (p. 300) writes, “causes generally are not strictly determinate from effects but rather must be distinct from what they explain.”

If you are a strict inductive empiricist, in the tradition of David Hume, you don’t believe that anything is real except for direct observations. That means there are no causes. But it is possible to generalize based on what you have observed so far. Statistics is just a more refined toolkit for the kind of generalizations that we perform naturally when we observe, for instance, that kids tend to perform better on a test if they study for it. This is one way to make sense of a latent variable. It is a sophisticated version of ordinary induction. However, pure inductivism has been criticized on numerous grounds.

A different view is that some kind of mental process or activity causes people to do things like score well on a given history test question. For instance, memorizing dates increases your odds of correctly answering questions on a history test. We can tell a causal story: the information enters the brain, is stored, and is then retrieved to answer the question. The latent variable that correlates with test scores is an indication of this process. (But see Robert Epstein arguing in Aeon against the storage metaphor for human memory.)

In any case, the mathematics of factor analysis would not explain that this is what’s going on. It would only very roughly suggest a phenomenon that requires causal explanation. And although it is fairly straightforward to infer a causal relationship in this case (you should study in order to do well on a test), it is much less plausible that other factors are causal. For instance, do the Big Five personality traits “cause” answers to concrete questions about emotions and behavior? In 1939, Wilson and Worcester (quoted in Mulaik) asked, “Why should there be any particular significance psychologically to that vector of the mind which has the property that the sum of squares of the projections of a set of unit vectors (tests) along it be maximum?”
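
The property in that quotation can be shown in a toy case. The sketch below draws four hypothetical “tests” as unit vectors in a plane, searches by brute force for the direction that maximizes the sum of squared projections, and compares it with the closed-form first principal axis in two dimensions. The angles are invented, and real analyses work in many more dimensions.

```python
import math

# Hypothetical "tests" drawn as unit vectors in a plane, given by their angles.
angles_deg = [10, 25, 40, 70]
tests = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in angles_deg]


def sum_sq_projections(direction_deg):
    """Sum of squared projections of the test vectors onto a unit direction."""
    t = math.radians(direction_deg)
    ux, uy = math.cos(t), math.sin(t)
    return sum((vx * ux + vy * uy) ** 2 for vx, vy in tests)


# Brute force: try every direction from 0 to 179.9 degrees in 0.1-degree steps.
best_deg = max((d / 10 for d in range(1800)), key=sum_sq_projections)

# Closed form for the same direction in two dimensions: the first principal
# axis of the matrix of summed outer products of the test vectors.
sxx = sum(vx * vx for vx, _ in tests)
syy = sum(vy * vy for _, vy in tests)
sxy = sum(vx * vy for vx, vy in tests)
axis_deg = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy)) % 180

print(best_deg, round(axis_deg, 2))
```

The two answers agree to within the search step. That the direction can be computed is not in doubt; Wilson and Worcester’s question is why it should mean anything psychologically.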

Another level of challenge is that the data for any latent variable come from observations that someone has designed and selected. For instance, that history test could have included entirely different questions. Or we could give tests on reading but not on history. The resulting factors would look different. Some conception of what’s important underlies the design of the test in the first place.

This is what I’m inclined to propose: latent variables are numbers inferred from data. We give them names that refer to actual things that are very heterogeneous, metaphysically. Sometimes latent variables suggest causal theories, although causation requires other kinds of evidence to test. Sometimes they are descriptions of patterns in the accumulated data that are not causal at all. Sometimes they are just tools that are useful for practical reasons–for instance, a kid needs one grade in history instead of a whole bunch of numbers. Whether that grade is appropriate is partly a question of fairness, partly a question about what is valuable to learn, and partly a question of the pragmatic consequences (e.g. does this kind of test cause kids to learn well?). It is only partly a statistical question.

*The example I am informally describing here involves exploratory factor analysis. You identify factors based on pure math and name them based on a theory. On the other hand, in confirmatory factor analysis, you hypothesize a relationship based on a theory and look for patterns in the data that support or reject it. The math is somewhat different, as is the theoretical framework. I don’t want to go too deeply into that contrast because my topic here is broader than factor analysis. I am interested in uses of all latent variables.

Sources: Bollen, Kenneth A. 2002. “Latent Variables in Psychology and the Social Sciences.” Annual Review of Psychology 53: 605-634; Mulaik, Stanley A. 1987. “A Brief History of the Philosophical Foundations of Exploratory Factor Analysis.” Multivariate Behavioral Research 22(3): 267-305.



About Peter

Associate Dean for Research and the Lincoln Filene Professor of Citizenship and Public Affairs at Tufts University's Tisch College of Civic Life. Concerned about civic education, civic engagement, and democratic reform in the United States and elsewhere.