Democratic primary prediction: Clinton 3 – Sanders 3

As social media takes center stage in this year’s US presidential election, we have run several analyses measuring social media engagement levels, as well as the impact of the election debates on that engagement.

Throughout the nomination process, Donald Trump dominated the discussion on social media. He has mastered its use to drive engagement among his supporters, which has inarguably helped him clinch the Republican nomination.

Clinton and Sanders, on the other hand, have used social media to a lesser extent than Trump, with Sanders having the more engaged supporters of the two. Both Democratic candidates are making their last stand tonight in six primaries, which may very well settle the Democratic nomination.

Let’s take a deep dive into the geographic composition of Clinton’s and Sanders’s followers in the six states holding primary races tonight: New Jersey, Montana, South Dakota, North Dakota, New Mexico, and California.

In terms of follower count, Clinton has a commanding lead in the two largest of the six states, with more than double Sanders’s followers. As stated in our previous analysis, this is to be expected given that Clinton was a well-known political figure long before the current presidential cycle.

Clinton and Sanders California and New Jersey Followers

The situation is slightly different, however, in the smaller states, where Sanders closes the gap significantly.

Clinton-Sanders Followers By State

But what about momentum?

Our previous analysis of the Indiana primary showed that Sanders had significant momentum going into the vote. Despite his lagging in overall follower count, we predicted a high possibility of an upset based on that momentum, and he did pull one off. For today’s analysis we measured the excitement level on Twitter over the past 24 hours by looking for positive hashtags related to each candidate. Sanders’s supporters continue to show more excitement than Clinton’s, but the momentum has slowed since our previous analysis: he leads Clinton 58% to 42%, compared with 65% to 35% during the Indiana primary. One likely reason is that multiple media outlets announced last night that Clinton has secured enough delegates to potentially win the nomination at the convention. The activity timeline below gives a good picture: Clinton mentions peaked around 9pm on Monday, right when mainstream media started making projections.
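To make the 58% to 42% figure concrete, here is a minimal sketch of how an excitement share could be computed from counts of positive-hashtag tweets. The function name and the counts are our own placeholders for illustration, not the actual pipeline or the real tweet volumes.

```python
def excitement_share(positive_counts):
    """Return each candidate's percentage share of positive-hashtag tweets."""
    total = sum(positive_counts.values())
    if total == 0:
        raise ValueError("no positive tweets counted")
    return {name: 100.0 * n / total for name, n in positive_counts.items()}


# Placeholder counts chosen only to reproduce the 58/42 split quoted above;
# they are NOT the real tweet volumes.
shares = excitement_share({"Sanders": 580, "Clinton": 420})
print(shares)
```

Expressing the result as a share of the combined total makes it comparable across states and across days, regardless of how many tweets were collected.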

Clinton supporters Activity Timeline

Here is the breakdown of the engagement level by state:

Also worth noting is the gender gap among excited supporters: Clinton performs much better with women than with men, while Sanders performs better with men than with women.

Clinton Excitement by Gender

Sanders Excitement by Gender


This is in contrast to their overall follower bases, where both accounts have a similar gender distribution.

So what will happen at the six primaries tonight?

Social media excitement doesn’t correlate 100% with overall excitement among supporters. But since we love algorithms, we looked at historical data on social media excitement levels and actual election results, and created a formula that maps social media excitement levels to overall sentiment. Our algorithm tells us that Sanders and Clinton will split tonight’s contests, winning three each, but that Clinton will win the delegate-rich states of California and New Jersey, extending her lead in the overall delegate count.
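We are not publishing the formula itself, but the general idea of fitting past excitement levels to past results can be sketched with an ordinary least-squares line. Everything in this example is an assumption made for illustration: the linear form, the `fit_line` helper, and especially the numbers, which are invented and are not real contest data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up (excitement share %, vote share %) pairs for past contests,
# used purely to show the mechanics; they are NOT real results.
past_excitement = [70.0, 65.0, 60.0, 55.0]
past_vote_share = [56.0, 53.0, 50.0, 47.0]

slope, intercept = fit_line(past_excitement, past_vote_share)

# Apply the fitted line to a new excitement share.
predicted = slope * 58.0 + intercept
print(slope, intercept, predicted)
```

A slope below 1 in a fit like this would mean that a lead in online excitement translates into a smaller lead at the ballot box, which is one way an excitement lead and a narrow loss can coexist.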

Here is what social media engagement data tells us by state:

  • Clinton will have a major win in New Jersey
  • Clinton will win California with a razor-thin lead, but Sanders has a good chance of an upset here
  • Sanders will have a big win in Montana*
  • Sanders will pull an upset in New Mexico
  • Clinton and Sanders will split North and South Dakota*

*It is worth noting that Montana, South Dakota, and North Dakota have smaller sample sizes, so their data has a higher margin of error.
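The link between sample size and margin of error can be illustrated with the standard normal-approximation formula for a proportion. This is a rough sketch: the sample sizes below are hypothetical, and the formula assumes a simple random sample, which tweets are not.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion (normal approximation)."""
    return z * math.sqrt(share * (1.0 - share) / sample_size)


# Hypothetical sample sizes for a small state and a large state.
small = margin_of_error(0.58, 200)
large = margin_of_error(0.58, 20000)
print(small, large)
```

Because the margin shrinks with the square root of the sample size, a state with 100 times fewer tweets has a margin of error about 10 times wider.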