{"id":17118,"date":"2016-07-12T10:53:31","date_gmt":"2016-07-12T14:53:31","guid":{"rendered":"http:\/\/peterlevine.ws\/?p=17118"},"modified":"2016-07-12T11:06:37","modified_gmt":"2016-07-12T15:06:37","slug":"the-metaphysics-of-latent-variables-in-psychology","status":"publish","type":"post","link":"https:\/\/peterlevine.ws\/?p=17118","title":{"rendered":"the metaphysics of latent variables in psychology"},"content":{"rendered":"<p>(Washington, DC)\u00a0The search for &#8220;latent variables&#8221; is so common in psychology that I would almost call it definitive of the discipline today. Other disciplines also study people&#8217;s thoughts and actions, but the distinctive contribution of\u00a0psychology seems to be the use of variables that are not directly observed but rather inferred from data.\u00a0Latent variables\u00a0have been \u201cso useful \u2026 that they pervade \u2026 psychology and the social sciences\u201d (Bollen, 2002, p. 606).<\/p>\n<p>But what\u00a0<em>are<\/em> they? This\u00a0is a metaphysical question, in the sense that contemporary, professional, Anglophone philosophers use the word &#8220;metaphysics.&#8221; It doesn&#8217;t mean that latent variables\u00a0are spooky\u00a0or illusory, but rather that it&#8217;s worth trying to figure out what kinds of things they are and how they relate to other sorts of things, such as beliefs, observations, numbers, mental states, processes, physical brains, etc. (Cf.\u00a0<a href=\"http:\/\/peterlevine.ws\/?p=15620\" rel=\"bookmark\">why social scientists should pay attention to metaphysics<\/a>.)<\/p>\n<p>It turns out (Mulaik, 1987) that the main tools of psychometrics were invented by early-20th century thinkers who were explicitly interested in philosophical issues. 
For instance, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Karl_Pearson\">Karl Pearson<\/a>, who invented P-values, chi-square tests, and Principal Components Analysis&#8211;and who first used a histogram&#8211;wrote a <a href=\"https:\/\/archive.org\/stream\/grammarofscience00pearrich#page\/n9\/mode\/2up\">book<\/a>\u00a0about philosophy of science before he developed these tools in order to implement his philosophy. He sounds like an awful man&#8211;an active proponent of racism&#8211;but\u00a0that doesn&#8217;t invalidate his contributions to statistics. Their origin\u00a0in his philosophical thought\u00a0does, however, reinforce the point that latent factors need a\u00a0philosophical explanation.<\/p>\n<p>In very general terms, a latent variable is a number\u00a0derived from several direct observations (the manifest variables) and used to say something meaningful about the subject. A history test provides a\u00a0simple\u00a0example. The student&#8217;s answers to each question are manifest variables. The student&#8217;s grade\u00a0is derived from them, usually by just calculating the percentage of\u00a0correct\u00a0answers, and it is supposed to measure &#8220;knowledge of history,&#8221; which is latent. Only if the test is designed according to the best statistical principles is the overall grade\u00a0indeed a valid measure of\u00a0knowledge.<\/p>\n<p>The same example can be used to illustrate a more sophisticated tool, factor analysis. Suppose that any student&#8217;s chance of answering a given question on the history test can be predicted fairly well by a function of several measured variables (the student&#8217;s family income, the teacher&#8217;s background, the amount of time studying history, etc.) plus X, plus Y. X and Y correlate with the answers, but X and Y are not correlated with each other, and they remain constant for each student.<\/p>\n<p>That much might be a mathematical result: a function that roughly matches the actual data. 
The question then arises: what do X and Y mean? Suppose that X has a very strong correlation with\u00a0students&#8217; performance on questions that involve difficult\u00a0reading assignments, such as original source material. And Y has a very strong correlation with\u00a0students&#8217; performance on questions that involve concrete factual information, such as the dates of the Civil War. Assuming that X and Y are not correlated, we can conclude that history test scores involve two &#8220;factors&#8221;: reading ability and memorization of concrete factual information. That interpretation would likely be presented as a meaningful finding, with implications for how educators\u00a0should teach history.*<\/p>\n<p>I don&#8217;t disagree. I\u00a0am involved in this\u00a0kind of research myself (albeit usually contributing less than my fair share of the math). But what kind of a thing is &#8220;reading ability&#8221; or &#8220;memorization of concrete factual information&#8221; in this example?<\/p>\n<p>They are not\u00a0exactly <em>causes<\/em> of the students&#8217; actual answers to questions, for four\u00a0reasons.<\/p>\n<p>First, it is often (always?) possible to describe any given set of data with multiple functions.<\/p>\n<p>Second, given a mathematical function that well describes a given set of data&#8211;such as the\u00a0kids&#8217; specific answers to Mr. Brown&#8217;s AP history test&#8211;it doesn&#8217;t follow that the same factors\u00a0would also describe another set of data. The next 10 kids who took Mr. Brown&#8217;s\u00a0test might not fit the function at all. This is an example of the general <a href=\"https:\/\/en.wikipedia.org\/wiki\/Problem_of_induction\">problem of induction<\/a>.<\/p>\n<p>Third, we can often switch the direction of the explanatory\u00a0arrow. 
Instead of using the student&#8217;s latent ability in reading to explain or predict her answers to specific test questions, we could use her answers to those questions to explain or predict her\u00a0reading ability. If you can switch the direction of an\u00a0explanation, it doesn&#8217;t seem like a causal thesis.<\/p>\n<p>Finally, we don&#8217;t usually describe a &#8220;cause&#8221; as something that is derived mathematically from the effects. A\u00a0student&#8217;s family income might be postulated as a cause of her test scores&#8211;although it would\u00a0require an experiment to assess this hypothesis&#8211;but a variable\u00a0that is derived from the test data itself doesn&#8217;t seem to be a <em>cause<\/em> of it. Mulaik (p. 300) writes,\u00a0&#8220;causes generally are not strictly determinate from effects but rather must be distinct from what they explain.&#8221;<\/p>\n<p>If you are a strict inductive empiricist, in the tradition of David Hume, you don&#8217;t believe that anything is real except for direct observations. That means there are no causes. But it is possible to generalize based on what you have observed so far. Statistics is just a more refined toolkit for the kind of generalizations that we perform naturally when we observe, for instance, that kids tend to perform better on a test if they\u00a0study for it. This\u00a0is one way to make sense of a latent variable. It is a sophisticated version of ordinary induction. However,\u00a0pure inductivism has been criticized on numerous grounds.<\/p>\n<p>A different view is that some kind of mental process\u00a0or activity <em>causes<\/em> people\u00a0to\u00a0do things like score well on a given history test question. For instance, memorizing dates increases your odds of correctly answering questions on a history test. We can tell a causal story: the information enters\u00a0the brain, is stored, and is then\u00a0retrieved to answer the question. 
The latent variable that correlates with test scores\u00a0is an indication of this process. (But see Robert Epstein <a href=\"https:\/\/aeon.co\/essays\/your-brain-does-not-process-information-and-it-is-not-a-computer\">arguing in Aeon<\/a> against the storage metaphor for human memory.)<\/p>\n<p>In any case, the mathematics of factor analysis would not establish that this is what&#8217;s going on. It would only very roughly suggest a phenomenon that requires causal\u00a0explanation. And although it is fairly straightforward to infer a causal relationship in this case&#8211;you should study in order to do well\u00a0on a test&#8211;it is much less plausible that other factors are causal. For instance, do the Big Five personality traits &#8220;cause&#8221; answers to concrete questions about emotions and behavior? In 1939, Wilson and Worcester (quoted in Mulaik) asked, &#8220;Why should there be any particular significance psychologically to that vector of the mind which has the property that the sum of squares of the projections of a set of unit vectors (tests) along it be maximum?&#8221;<\/p>\n<p>Another level of challenge is that the data for any latent variable come from observations that someone has designed and selected. For instance, that history test could have included\u00a0entirely different questions. Or we could give tests on reading but not on history. The resulting factors would look different. Some conception of what&#8217;s important underlies the design of the test in the first place.<\/p>\n<p>This is what I&#8217;m inclined to propose: latent variables are numbers inferred from data. We give them names that refer to actual things that are very heterogeneous, metaphysically. Sometimes latent variables\u00a0suggest causal theories, although causation\u00a0requires other kinds of evidence to test. Sometimes they are descriptions of patterns in the accumulated data that are not causal at all. 
Sometimes they are just tools that are\u00a0useful for practical reasons&#8211;for instance, a\u00a0kid needs one grade in history instead of a whole bunch of numbers. Whether that grade is appropriate is partly a question of fairness, partly a question about what is valuable to learn, and partly a question of the pragmatic consequences (e.g. does this kind of test cause kids to learn well?). It is only partly a statistical question.<\/p>\n<p>&#8212;<\/p>\n<p>*The example I am informally describing here involves\u00a0exploratory factor analysis. You\u00a0identify\u00a0factors based on pure math and\u00a0name them based on a theory. On the other hand, in confirmatory factor analysis, you\u00a0hypothesize a relationship based on a theory and look for patterns\u00a0in the data that support or reject it. The math is somewhat different, as is the theoretical framework. I don&#8217;t want to go too deeply into that contrast because my topic here is broader than factor analysis. I am interested in uses of all latent variables.<\/p>\n<p>Sources: Bollen, Kenneth A. 2002. &#8220;Latent Variables in Psychology and the Social Sciences.&#8221; <em>Annual Review of Psychology<\/em> 53: 605-634; Mulaik, Stanley A. 1987. &#8220;A Brief History of the Philosophical Foundations of Exploratory Factor Analysis.&#8221; <em>Multivariate Behavioral Research<\/em> 22 (3): 267-305.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(Washington, DC)\u00a0The search for &#8220;latent variables&#8221; is so common in psychology that I would almost call it definitive of the discipline today. 
Other disciplines also study people&#8217;s thoughts and actions, but the distinctive contribution of\u00a0psychology seems to be the use of variables that are not directly observed but rather inferred from data.\u00a0Latent variables\u00a0have been \u201cso [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-17118","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"acf":[],"_links":{"self":[{"href":"https:\/\/peterlevine.ws\/index.php?rest_route=\/wp\/v2\/posts\/17118","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/peterlevine.ws\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/peterlevine.ws\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/peterlevine.ws\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/peterlevine.ws\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=17118"}],"version-history":[{"count":14,"href":"https:\/\/peterlevine.ws\/index.php?rest_route=\/wp\/v2\/posts\/17118\/revisions"}],"predecessor-version":[{"id":17154,"href":"https:\/\/peterlevine.ws\/index.php?rest_route=\/wp\/v2\/posts\/17118\/revisions\/17154"}],"wp:attachment":[{"href":"https:\/\/peterlevine.ws\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=17118"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/peterlevine.ws\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=17118"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/peterlevine.ws\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=17118"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}