Why have some states seen many more deaths from COVID-19 than others? Do differences in state policies matter? Is it mostly about demographics? Or what about factors like climate and population density, which could influence whether and when people congregate indoors?
To explore these questions, I made a spreadsheet with 58 salient variables about the 50 states, drawing most of the data from the Senate Joint Economic Committee or the Kaiser Family Foundation. I then went fishing for variables that could predict cumulative death rates from COVID-19. I use this “fishing” metaphor with irony, because there is a danger of obtaining spurious results when you explore too many variables at once. Still, the following results might suggest tighter research questions.
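The danger of "fishing" can be made concrete with a small simulation. The sketch below is not part of the original analysis: it generates a purely random outcome for 50 hypothetical states and 58 purely random predictors, then counts how many predictors correlate with the outcome at roughly the p < .05 level. The function names, the seed, and the cutoff of 0.279 (the approximate two-tailed critical value of Pearson's r for 50 observations) are assumptions made for illustration, and simple pairwise correlations stand in for the multiple regressions described in this post.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_states=50, n_vars=58, r_crit=0.279, seed=0):
    """Count noise predictors whose |r| with a noise outcome exceeds r_crit.

    r_crit = 0.279 is an approximate two-tailed p < .05 cutoff for n = 50
    (an assumption for this sketch, not a value from the post).
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_states)]
    hits = 0
    for _ in range(n_vars):
        predictor = [rng.gauss(0, 1) for _ in range(n_states)]
        if abs(pearson_r(predictor, outcome)) > r_crit:
            hits += 1
    return hits


# Repeat the experiment on many simulated "spreadsheets" and average.
trials = 100
average_hits = sum(count_spurious(seed=s) for s in range(trials)) / trials
print(f"Average 'significant' noise predictors per spreadsheet: {average_hits:.2f}")
```

Because each noise predictor has about a 5% chance of clearing the cutoff, the average should land near 58 × 0.05, or roughly three apparently "significant" variables per spreadsheet that mean nothing at all.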
Below, I describe nine ordinary least squares (OLS) regression models, each with a different thematic focus, arranged from the least to the most variance in the states’ COVID-19 mortality that they seem to explain. (I report adjusted r-square statistics, which penalize extra predictors and so should allow the models to be compared despite differences in the number of variables.)
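For readers unfamiliar with the adjustment, here is a minimal sketch of the standard formula for adjusted r-square. The function name and the example numbers are hypothetical and are not results from this post; the point is only to show how the same raw fit is discounted more heavily as predictors are added.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical illustration: an unadjusted r-square of .50 with 50
# observations, for models with 2, 4, and 7 predictors.
for p in (2, 4, 7):
    print(p, round(adjusted_r_squared(0.50, 50, p), 3))
# prints:
# 2 0.479
# 4 0.456
# 7 0.417
```

With only 50 observations the penalty is noticeable, which is why a larger model needs a meaningfully better raw fit to come out ahead on the adjusted statistic.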
In summary: the state policies that I measured and the partisanship of governors were not significant predictors of mortality, but the proportion of people who voted for Trump was. That relationship was not explained by demographics, which I controlled for.
Variables that mattered in many of my models included the percentage of the population that was already in poor health, the GOP vote share in 2020, Black/White residential segregation, and the Gini coefficient (a measure of inequality). A model with just those four components could explain 71% of the variance in COVID deaths (unadjusted r-square = .715).
- A politics and policy model. Variables: party of the state governor, percent of the state’s 2020 popular vote for Republicans, whether the state required masks indoors for some people in February 2022, whether the state required, allowed, or banned local vaccine requirements, and state/local spending per capita. The only statistically significant correlate of the mortality rate: the GOP vote share in 2020. Adjusted r-square = .203, meaning that this model offers little insight.
- A geography model. Variables: population density, percentage rural, average commuting time, mean daily temperature. Statistically significant correlates: none. Adjusted r-square = .240 (again, a poor fit).
- A sociability model. Variables: average number of close friends, percent of neighbors who regularly do favors, number of nonprofits per 1,000 people, percentage who worked with neighbors to fix/improve something. Statistically significant correlate: working with neighbors (related to lower mortality). Adjusted r-square = .415.
- A comorbidities model. Variables (all measured pre-pandemic): percent in poor health, premature mortality rate, mortality from suicide/drug overdose, percent disabled, percent with diabetes, percent obese, and percent who smoke. Statistically significant correlates: general poor health and disabilities. Adjusted r-square = .451.
- A political participation model. Variables: percent who participated in a demonstration, attended a public meeting, served on a committee, and voted in 2012 and 2016. Statistically significant correlate: attending a public meeting (related to lower mortality). Adjusted r-square = .483.
- An economics model. Variables: unemployment, incarceration, poverty, Gini coefficient, college graduation rate, internet access at home. Statistically significant correlates: worse inequality, higher incarceration, fewer people with BAs. Adjusted r-square = .623.
- An inequality model. Variables: Black/White residential segregation, Gini coefficient, college graduation rate, incarceration rate. Statistically significant correlates: racial segregation, Gini coefficient. Adjusted r-square = .646.
- A politics and demographics model. Variables: party of the state governor, percent of the state’s 2020 vote for Trump, and the racial demographics and median age of the state. Statistically significant correlates: higher GOP vote, more African Americans, more Latinos, a higher median age. Adjusted r-square = .647.
- A model that explains most of the variance. Variables: percent in poor health before the pandemic, GOP vote share, Black/White segregation, Gini coefficient, percent over age 65, incarceration rate, college graduation rate. Statistically significant correlates: the first three. Adjusted r-square = .699. (Unadjusted r-square = .735.)
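To show mechanically what fitting one of these models involves, here is a minimal sketch of an OLS fit using only the Python standard library. It is an illustration under stated assumptions, not the software or data behind the results above: the data are synthetic, the two predictors `x1` and `x2` are hypothetical stand-ins, and the helper names `solve_linear_system` and `fit_ols` are invented for this example. It estimates coefficients and the unadjusted r-square only; it does not compute significance tests.

```python
import random


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular (no perfectly collinear predictors).
    """
    n = len(b)
    # Augmented matrix so that row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return (coefficients, r_squared); coefficients[0] is the intercept."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


# Synthetic data for 50 hypothetical states: y = 1 + 2*x1 - 0.5*x2 + noise.
rng = random.Random(0)
rows = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
y = [1.0 + 2.0 * x1 - 0.5 * x2 + rng.gauss(0, 0.1) for x1, x2 in rows]

beta, r_squared = fit_ols(rows, y)
print("intercept and slopes:", [round(b, 2) for b in beta])
print("unadjusted r-square:", round(r_squared, 3))
```

In practice a statistics package would handle this fit and also report standard errors and p-values, which are what the "statistically significant" labels in the list above depend on.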
My dataset also included some variables that I have not mentioned here, including several measures of trust (in other people and in institutions) and other types of civic and political participation. None seemed to be influential in any of the models I tried.