The 2016 Election: Analyzing a Survey of 64k Americans.

Summary: Why did Clinton lose? This analysis of CCES data suggests that a combination of anti-immigration and racially conservative Democrats defecting to Trump, an inability to maintain 2012 turnout levels among black voters, and tepid support from (younger) Bernie voters left her with a coalition too small to win the Electoral College.

Intro: The 2016 Cooperative Congressional Election Study is a national survey of ~64,600 adults. The following is broken into 1. a summary of a regression analysis on vote choice between Clinton and Trump and 2. plots visualizing differences among Clinton/Trump voters, Vote Switchers (e.g. Obama to Trump voters), and various subgroups like gender and age.

As for variable definitions: vote choice refers to stated vote choice and race is broken into hispanic, non-hispanic white, non-hispanic black etc.  Additionally, many of the issue questions ask similar questions on the same topic (e.g. abortion, racial attitudes etc.). For ease of comparisons I created a summary variable for these questions. In essence, I centered each question to mean 0 (i.e. converted to z-scores) and averaged the answers to get a single measure of the questions. For example, a score of 1 on the prochoice variable refers to being on average 1 standard deviation more prochoice on the set of abortion questions. Lastly, all responses have been appropriately weighted.
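
The summary variable described above can be sketched roughly as follows. This is only an illustration: the function names and the toy answers are hypothetical, survey weights are ignored, and each question is assumed to be coded so that higher values point in the same direction.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one question: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def composite_index(questions):
    """Average the z-scored answers across questions, one score per respondent."""
    standardized = [zscores(q) for q in questions]
    # zip(*...) regroups the per-question lists into per-respondent tuples.
    return [mean(per_respondent) for per_respondent in zip(*standardized)]

# Made-up answers from four respondents to two abortion questions
# (higher = more prochoice); these are not CCES values.
q1 = [1, 2, 3, 4]
q2 = [0, 0, 1, 1]
prochoice = composite_index([q1, q2])
```

Averaging z-scores puts questions with different answer scales on a common footing, which is why a score of 1 can be read as roughly one standard deviation above the average respondent.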

Regression Analysis

Analysis of the 2016 CCES is broadly consistent with the findings from my county-level analysis as well as previous research. I obtain primary results by running logits on individual responses with state-level fixed effects and survey weights, along with various robustness checks. I focus on the role that racial attitudes and support for immigration have on vote choice, which, for example, add as much as 31% explained variance in vote choice to a demographics-only model. Rendered code can be found here.

A notable distinction is that race and education are much more predictive at the county level than at the individual level. This suggests that living in areas with lots of college graduates or lots of racial and ethnic minorities is more important in predicting vote preferences than individual-level demographics. This is consistent with the theory that right-wing populism is driven in large part by those living in socially and culturally isolated or segregated areas. Furthermore, this interpretation is reinforced by the expected behavior that education loses nearly all its predictive power when controlling for racial attitudes (see DeSante & Smith 2017 for the relevance of racial attitude questions).


1. 2016 Stated Two-Way Vote: by 2012 vote choice, 2016 Obama approval, stated primary vote, ideology, age, and ideology*age.

Note that “NA” includes those who stated they did not vote or voted for a third party. Also, net votes refers to how many more votes Clinton received than Trump among a particular group.

2. Those who voted for Obama in 2012 but voted for neither Clinton nor Trump in 2016. This group predominantly has a somewhat to strongly favorable opinion of Obama. Demographically they are disproportionately black and younger than 30. They are more likely to have not voted in the primary or to have voted for Bernie Sanders (in fact, only ~75% of Bernie voters stated they voted for Clinton in the general election). While ideologically they primarily identify as moderate or somewhat liberal, their stances on policy are virtually indistinguishable from other Democrats.

In short, their lack of regular voting, not their policy positions, is what distinguishes this group from Obama-Clinton voters.

3. Those who voted for Obama and then Trump. The main message is that while Trump won over many people who voted for Obama, these voters had already soured on Obama in terms of favorability and differ significantly on policy from other Obama voters. They are primarily centrist on issues while moderate to somewhat conservative in identity. The most distinguishing feature is that they are most conservative on race and immigration issues. In many ways they are the last of the Reagan Democrats or Blue Dogs: racial conservatives with more centrist positions on economics, but who are not themselves economically disadvantaged.

4. Those who voted for Romney and then Clinton. This group is much smaller than the others. However, I’ve included two relevant plots suggesting Romney to Clinton voters are moderate on ideological identity, center to center-left on policy questions, and were disproportionately likely to have voted for a non-Trump or non-Cruz Republican during the primary.


6. Legislative Questions: df refers to all respondents, df2 to Obama to Trump voters, df3 to Obama to Neither voters. 1 corresponds to 100% support for the proposed legislation, 0 to 0% support. Interestingly, support for highway funding is the only question which has virtually no association with vote choice among Clinton and Trump supporters.



Subgroup Plots:

  1. Marriage & Gender. One talking point which has emerged since the election is the observation that “more white women voted for Trump than Clinton”. However, this ignores splits by education and marriage. Here it’s evident that white women without a college degree who’ve been married at least once are the only group of women that were more pro-Trump than pro-Clinton. For men, nonwhite men across the board and white men with a degree who have never been married are more supportive of Clinton than Trump. However, at the individual level, marital status appears to be associated with vote preferences only indirectly, as it’s indicative of age and living in a non-urban setting. Nonetheless, a county’s percentage of residents who’ve never been married is one of the best single-variable predictors of both the 2012 and 2016 results (correlation coefficient of .7).
  2. New Voters. Here are respondents who stated they voted for one of the top two candidates in the 2016 election but not the 2012 election. Recall the first plot that showed Clinton and Trump pulling virtually the same number of voters from this group. It’s apparent that Clinton netted more new voters among college-aged whites and nonwhites regardless of age, while Trump’s new voters were more likely to be middle-aged and were almost exclusively white. These differences support the narrative that Trump was particularly attractive to irregular “working class” white voters concentrated in the Midwest.
The 2016 Election: A County-Level Analysis

What explains the 2016 presidential results at the county level? Running OLS and Spatial regressions on a dataset constructed from roughly 250 potential covariates, I find that support for Trump is strongly consistent with the burgeoning research on right-wing populism and is strongly inconsistent with the thesis that Trump’s electoral successes were driven by economic hardship. Furthermore, a follow-up analysis of the 2016 CCES (survey of 64k Americans) yields strongly corroborative results.

Right-wing populism can be explained as cultural backlash to perceived out-group threats (e.g. contemporary demographic change in North America and Europe), especially among persons with authoritarian-prone personalities and ethnocentric beliefs. Notably, Inglehart and Norris provide this summary:

[C]ultural values, combined with several social and demographic factors, provide the most consistent and parsimonious explanation for voting support for populist parties; their contemporary popularity in Europe is largely due to ideological appeals to traditional values which are concentrated among the older generation, men, the religious, ethnic majorities, and less educated sectors of society…. [In contrast,] evidence for the economic insecurity thesis, the results of the empirical analysis are mixed and inconsistent…. Populism, [a loose set of ideas that share three core features: anti-establishment, authoritarianism, and nativism] is a standard way of referring to this syndrome, emphasizing its allegedly broad roots in ordinary people; it might equally well be described as xenophobic authoritarianism.

This is to say, culture (specifically views on immigration, race, and gender) as opposed to economics appears to be the dominant cleavage in explaining Trump’s electoral successes. The following analysis is consistent with this view. Left for further investigation is the role which Chinese trade competition and sociological isolation may have played in causing support for Trump.

Section I looks specifically at support during the general and primary for former Secretary of State Hillary Clinton and businessman Donald Trump. As described above, results are consistent with “cultural backlash” theories of support. Section II looks at third party support. Specifically, results are consistent with the concern that support for Senator Bernie Sanders led to greater third party defection. Section III looks briefly at turnout and ticket splitting. In particular, turnout among African American areas was decisively lower than 2012 while the sharp turnout decline in Wisconsin is suggestive of a negative effect from the 2016 voter identification law.

Regression tables and rendered code can be found: OLS and Spatial.

I Two Party Vote Margin & Primary Shares

Unsurprisingly, race and education are the most dominant explanatory variables. As was true of the 2012 and 2004 elections, after controlling for race and education, lower quality of life (poverty, drug mortality, unemployment, lack of health insurance, obesity, single parenthood, population stagnation) is associated with voting Democratic. While household income is negatively correlated with Clinton’s vote margin, it becomes insignificant as broader quality of life measures are included. As expected for candidates from incumbent parties, Clinton performed worse in counties with higher growth in unemployment or uninsurance rates. However, the effect sizes from these growth variables are too small to have affected the election outcome and may have even been negative when considering effects from the static QoL variables. In the base models, heavy alcohol consumption rates positively correlate with Clinton’s vote margin, but this is more likely an indicator of local culture and norms than economic hardship. Moreover, alcohol use is insignificant in the model which corrects for geographic dependence. Similarly, while injury mortality (a proxy for gun ownership and culture) is negatively correlated with Clinton’s margin, the effect size and significance are greatly mitigated by correcting for geographic dependence. Variables indicative of increased ethnic tension are predictive of Trump’s performance. Specifically, racial diversity, religious diversity, income inequality, foreign-born share, and growth in the percentage of non-whites are all associated with Trump’s electoral performance. Furthermore, in the spatial model, the county proportion of African-Americans is associated with lower Clinton performance in bordering counties, which is tentatively suggestive of partisan self-sorting or “white flight”.

Primary election covariates for Trump are similar to the general election. In particular, and more so than in the general, Trump performed well in Catholic/Orthodox communities as well as older communities. As in the general election, Trump’s primary share is associated with smoking, lack of exercise, injury mortality and lower population density. These factors in conjunction with education greatly mitigate any negative relationship between Trump’s electoral successes and life expectancy. Curiously, unemployment is associated with Trump’s primary share (unlike the general) while other markers of hardship have either a zero or negative relationship (like the general). Finally, Trump does better in areas with a greater share of Caucasians as well as higher growth in the proportion of the non-white population.

II Third Party Voting.

Clinton’s primary share is negatively correlated with Stein’s vote share after controlling for observables. This is consistent with the concern that support for Bernie Sanders led to greater third party defection.

It is unlikely that Clinton’s performance in the primaries is acting as a proxy for county-level ideology. Individual polling found only modest differences between Clinton’s and Sanders’s supporters. Additionally, at the county level, Clinton’s primary share is positively correlated with her two-party margin in the general. Moreover, covariates such as exercise, alcohol consumption, Trump’s primary share, and geography would mitigate the influence of ideology if it were a confounding variable. Lastly, outside the consideration of statistical models, the idea that the primary drove third party defection is consistent with the tens of thousands of write-in votes cast for Sanders.

III Turnout

Preliminary analysis of county level turnout is consistent with the general finding that black turnout subsided from 2012, and is suggestive that Asian areas saw increased turnout. Unsurprisingly, turnout was lower in Mormon areas. Turnout in counties with high alcohol use was also down, a variable which is highly geographically dependent and speculatively could be related to union membership rates or some other factor related to the Great Lakes and North Great Plains regions.

It’s been observed that Wisconsin’s turnout was abnormally low compared to earlier years. Even after controlling for state fixed effects, the counties of Racine and Milwaukee stood out in particular as outliers on both turnout and margin. While not definitive, this strongly motivates the question of how and why WI’s strict voter identification law affected turnout. In the past, support for the effect of ID laws has been mixed. Thus, it is worth examining what makes WI’s ID law different and how much of any effect is temporary.

As for ticket splitting, Clinton performed better than her Senate counterparts in ethnic minority and Protestant areas, and in counties with higher population density. She performed worse in areas with higher rates of intermediate educational attainment and higher rates of disability. These findings motivate the question of how best to leverage supporters both up and down the ballot.

Ranked Choice Voting: A Story of Seven Historical Elections.

This post is broken into an introduction to ranked choice voting followed by an attempt to answer how ranked choice voting could have changed past U.S. elections. Feel free to skip the introduction if you’re already familiar with the topic.

Introduction to Ranked Choice Voting:

For those unaware, ranked choice voting (also known as instant runoff voting (IRV) or the alternative vote (AV)) is a simple-to-understand method of holding elections in place of plurality voting (aka first past the post). Voters rank the candidates in order of preference. When the election is counted, the candidate with the fewest votes is dropped and those votes go to those voters’ next choices, and this repeats until one of the candidates has more than 50% of the votes. It’s akin to automatically running a series of runoff votes until one candidate has a majority. The reason this method is seen as better than the U.S. and U.K. method of having only one round is that no one wins without the support of a majority of voters. Moreover, voting for a party outside the main two will not spoil an election, so it lends itself more readily to multiple parties, which are more likely to collaborate with one another. See more information via Fairvote.

Actual Post:

Now that everyone understands ranked choice voting, I want to pose the question: which U.S. elections that have already occurred would it have changed?


Of the soon-to-be 55 competitive U.S. presidential elections, 18 (very possibly 19 come November 9th) were won by a candidate who did not secure a popular vote majority. It is thus tempting to ask which of those elections would have changed under a different voting system (namely directly electing the president with ranked choice voting). Of these 18, I estimate that 6 selected a different winner than the majority-preferred candidate (1844, 1876, 1880, 1888, 1912 and 2000): 4 in part because of a spoiler effect (the presence of third parties allowing candidates to win with less than 50% support) and 2 because of the Electoral College (1876 and 2000). Note that if these elections were instead decided by plurality vote, only 1876, 1888 and 2000 would have changed. Additionally, the four-way election of 1824 would have selected the plurality winner but not the majority winner.


So how did I make these estimates? While we have only the results of the first-choice ballots, it’s possible to make inferences based on historical information on how these minor parties behaved: party actions, candidate statements, prior behavior among supporters and platform similarities. Naturally, we can’t know what would have changed had these elections explicitly been conducted via ranked choice voting, but we can at least make a ceteris paribus estimate of what would have happened.


Twice the United States failed to elect the majority-preferred candidate because of the Electoral College. The first is the election of 2000, where Al Gore failed to best George Bush despite being, by my estimate, the preferred candidate of 51.3% of voters (Gore won 48.4% of first-choice votes). The second is the election of 1876, where Republican Hayes won by one electoral vote despite the (Anti-Reconstruction) Democrat Samuel Tilden winning 50.9% of the vote.

This leaves us with four elections that I believe would have changed with ranked choice voting. In 1844, the newly minted Liberty Party peeled enough anti-slavery northerners away from Whig Henry Clay that Democrat James Polk won with 49.54% of the vote.

From 1880 to 1892, several minor parties existed. These parties mostly consisted of non-Southern populists who were affiliated with the Democrats but broke away because of disputes over soft money policies (the Wizard of Oz debate between those who wanted a weaker currency (silver backed and greenback/fiat) and those who wanted a stronger currency (gold backed)). For example, the Greenback party commonly and consistently ran fusion tickets with the Democrats. The Greenback party (and later the Populist party) ran Democrat-affiliated James Weaver for president. Therefore, I believe it’s reasonable to assume the majority of supporters of these populist parties (i.e. Greenback, Union-Labor and Populist parties) would have ranked the Democratic candidates as their second choices. They often ran on the same ticket! And so the Greenbacks (3.32%) prevented Winfield Scott Hancock (48.25%) from gaining more votes than James Garfield (48.27%) in 1880. Next, in 1888, the Union Labor party (1.31%) could have added to the popular vote lead of Grover Cleveland (48.63%) over Republican Benjamin Harrison (47.80%). However, this would not have changed the electoral results, as only in Indiana did the Democrat lose by a smaller margin than the third party vote share.
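
The 1880 claim comes down to simple arithmetic, sketched below using the national vote shares quoted above. The function name and the 60/40 split of Greenback second choices are hypothetical, chosen only to illustrate the calculation; the sketch also looks only at the national popular vote, not the Electoral College.

```python
def final_shares(first, second, third, to_first, to_second):
    """Reallocate a third party's vote share between the top two candidates.

    to_first and to_second are the fractions of third-party voters ranking
    each major candidate next; any remainder is treated as exhausted ballots.
    """
    return first + third * to_first, second + third * to_second

# National popular vote shares (percent) quoted in the text for 1880.
garfield, hancock, greenback = 48.27, 48.25, 3.32

# Hypothetical split: 60% of Greenback voters rank Hancock next, 40% Garfield.
g_total, h_total = final_shares(garfield, hancock, greenback, 0.40, 0.60)

# Smallest net advantage among Greenback voters that flips the popular vote.
break_even = (garfield - hancock) / greenback
```

Under that hypothetical split Hancock finishes ahead of Garfield, and the break-even figure shows why the estimate is not sensitive to the exact split: Hancock needs a net advantage of well under one percentage point among Greenback voters.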

Fourth and last, in 1912 former President Roosevelt ran as a third party candidate against Republican Taft and Democrat Wilson. Roosevelt could have easily overtaken Wilson’s 41.84% vote share in both popular and electoral votes with only the help of Taft’s supporters, let alone the minor parties.

There are other very close elections, like 1960’s defeat of Nixon or 1884’s first win by Grover Cleveland. However, my best estimates suggest that the actual winner would also have narrowly won in a ranked choice voting format. Finally, there is the election of 1824, where Jackson won a plurality of popular and electoral votes but John Quincy Adams was selected by the House. It seems that had Quincy Adams run head-to-head against Jackson, without other candidates like Clay, Adams would have defeated Jackson. And so the House incidentally chose the majority-preferred, if not plurality-preferred, candidate (an outcome that is unlikely to repeat itself).


I wanted to highlight these historical elections to show where ranked choice voting could have changed the presidential outcome. It is because of a combination of the Electoral College and the presence of third parties that a probable six times in U.S. history the candidate without majority support became president. Additionally, the Electoral College does not favor either Republicans or Democrats for more than a few elections at a time, and third parties regardless of ideology (Libertarians and Greens alike) are capable of spoiling an election. It was simply Al Gore’s narrow electoral loss which gives too many the false impression that the Electoral College intrinsically favors the modern Republican Party. Lastly, ranked choice winners historically do not mean better presidents (see 1876!), only, I would argue, presidents selected by fairer and more democratic elections. Hence if there is a call to action from this post, it’s for electoral reform, because it does have practical benefits.

A Workable Economic Strategy

When presented with the choice of voting for Democratic candidates or staying home, working class, young and non-white Americans disproportionately choose the latter. Part of what drives this voting gap (detailed here) is the relative difficulty of voting in the United States. However, part of it is a sense of disempowerment and distrust of what Democratic candidates offer. Therefore, the question I want to present is: what economic platform could actually raise turnout rates among these left-leaning groups?

While Democrats are generally good at proposing policy supported by research, their platform has clearly failed to energize voters. Of course, many proposals that would help the working class never become law. There is also ample room to improve on the effectiveness of Democratic policies. However, beside the unhelpful advice that Democrats and progressives should simply win more elections, there are concrete changes to be made.  In particular:

1. Labor: prioritize workers and wage growth.

2. Student Debt: address the financial anxieties of Millennials.

3. Rural Poverty: run socially moderate candidates in non-urban districts.

4. A Better Narrative: it’s always the economy.

Prioritize Workers

Labor should not be treated as a special interest whose only role is to run GOTV programs in exchange for the status quo of labor law. Yes, Democrats seemingly offer several good policies for workers from increasing wages (EITC, minimum wage, etc.) to increased job training to new labor standards (card check, overtime, sick leave etc.). However, when it comes to spending political capital to achieve these goals, Democrats have generally been hesitant. There should exist a national blueprint weighed by evidence to improve labor outcomes for all workers and to be vigorously pursued at both the state and federal level.

Confront Student Debt

It’s no secret that younger voters tend to be more economically left-leaning, less likely to vote and more worried by the cost of higher education and youth unemployment than the general public. The Sanders campaign has been able to shift the Democratic Party and the Clinton campaign to the left on higher education, leading them to embrace expanding tuition-free college and promoting student debt relief. However, Democrats down ticket and in off-year elections must also run on making college affordable, creating career/educational paths for those without a four-year degree, and promoting debt relief for the financially strained. In short, dedicated and clearly communicated plans to address the financial anxieties widespread among Millennials will help boost turnout.

Address Rural Poverty

There is a massive partisan divide between urban and rural voters in the United States. However, rural poverty still exists and it’s not as though Republican policies are helping the rural poor. While it may be hopeless to ask voters to elect socially liberal candidates in Appalachia (more generally, areas that are predominantly white and low-income), Democrats should be willing to run socially moderate yet economically left-leaning candidates. Any effective party in the United States cannot rely solely on urban voters.

Create a Better Narrative (it’s always the economy)

Most Americans are not experts on economic policy and could hardly care less about the never-ending sea of 12-point plans. This is precisely why I’ve emphasized issues in this post rather than policies. Progressives need a narrative that makes focal the concerns and anxieties of the working class and makes believable its commitment to grow wages and employment opportunities for all Americans. An example:

40 years of economic gains have disproportionately gone to the top. Many suffer because of a lack of suitable employment, adequate health care and access to education or paths of economic improvement. By investing in health, education and infrastructure we can improve society at large. When workers, students and patients are left to fend for themselves, they are taken advantage of because of their lack of bargaining power. By revitalizing collective bargaining and economic solidarity, many of the disaffected can finally claim their fair share of the gains of growth. Workers have a right to organize and they have a right to a decent living. Hence, we must be a party of workers’ rights. This means a higher minimum wage. This means tax relief for the working poor and working parents across the country. This means overtime and paid paternity leave. This means the right to collectively bargain and more opportunities for job training and education. Simply put, it’s an economy that grows for everyone and not merely the powerful. We must be a party that stands by working Americans and that gives voice to the voiceless and power to the powerless.


In essence, what is needed is not a major overhaul of Democratic economic policy. Rather it is giving priority to improving the labor market for both high and low skill workers. It’s the ability to build not just social coalitions but also economic ones when fielding candidates, especially if Democrats ever want to win the House. Yes, there are issues such as global warming and tax reform where bolder policies could improve policy debates.

However, it is the lack of clear communication and legislative priority towards the disaffected which ultimately drives the voter gap. Any economic strategy for a more inclusive economy must work in tandem with an electoral strategy for a more inclusive democracy.

2016: Rethinking Favorability

Let’s start with a basic claim about the 2016 presidential election: Both Donald Trump and Hillary Clinton have historically low favorability ratings for any presidential nominee since modern polling. If you’re unfamiliar here is a 538 article on the topic.

Does this provide any useful information about the fall? Probably not.

Morning Consult has provided estimates of approval ratings for both candidates by state. Based on those numbers, here is a map of where Clinton or Trump has a higher net favorability rating:


Note, only South Carolina flips if we restrict ourselves to favorability rather than net favorability.

The more fundamental question is: does anyone actually believe this map accurately reflects current voting preferences? Simply compare the map above with what Sam Wang’s poll aggregator calculates:


So what is going on? We would hope that a candidate who has a higher net favorability rating would defeat one who has a lower rating. Most of the time yes, but not always.

The problem is that we are making an implicit assumption about the preferences of people who don’t strictly favor one candidate over the other: namely, that those who are unsure about both candidates, like both candidates or dislike both will split their votes evenly. This is often unreasonable.

It’s likely that evangelical voters in South Carolina and Georgia have an unfavorable view of both candidates. However, it’s unlikely that these former Ted Cruz voters will support Clinton at the same rate they support Trump. The reverse holds for the left-leaning independents of New Hampshire and Iowa who voted for Sanders. Hence Clinton is better liked than Trump in Georgia but not in New Hampshire. We’re not fully capturing the preferences of people who dislike both candidates.
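
To see how a favorability lead can coexist with a polling deficit, consider this small sketch. The function name and every number in it are made up for illustration; none of them come from the Morning Consult data.

```python
def two_way_margin(only_a, only_b, no_strict_favorite, rest_share_to_a):
    """Candidate A's vote margin (in points) over candidate B.

    only_a / only_b: percent of voters who strictly favor one candidate.
    no_strict_favorite: percent who like both, dislike both, or are unsure.
    rest_share_to_a: fraction of that last group who end up voting for A.
    """
    a = only_a + no_strict_favorite * rest_share_to_a
    b = only_b + no_strict_favorite * (1 - rest_share_to_a)
    return a - b

# Hypothetical state: A is strictly favored by 35%, B by 30%, and the
# remaining 35% have no strict favorite.
even_split = two_way_margin(35, 30, 35, 0.5)    # the implicit assumption
uneven_split = two_way_margin(35, 30, 35, 0.3)  # the rest break 70/30 for B
```

With an even split, A’s 5-point favorability edge carries straight through to a 5-point lead. If the voters without a strict favorite instead break 70/30 for B, A trails by 9 points despite being the better-liked candidate.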

This is similar to the more common problem that candidates like Elizabeth Warren or Jeff Sessions might face where they poll better than more well known candidates who are actually running for office. Both Senators are popular among voters who are already predisposed to like them, while those who have never heard of them are likely to dislike them. This is an early polling challenge that faces any candidate who is well known by the party faithful but not the general public.

While net favorability is a measure that should positively correlate with voting preferences, the relationship is not strict.

In short, when it’s not possible to examine horse race polls, favorability ratings can offer a decent estimate. However, they don’t add any new information on top of simply asking people how they will vote.

