That's it, the campaign has started. How do we know? Take a look at your Twitter feed or a cable news channel: at some point, there is a good chance you will see a familiar headline: "EU elections: a poll puts the National Front ahead of Macron’s En Marche". Sometimes the headline will highlight two polls - that's one more poll, after all.
Until recently, these headlines were quite rare. But all good things come to an end - probably because the number of polls increases as the campaign progresses. That's exactly why we created PollsPosition, so instead of complaining, let's try to debunk the most common myths and draw a portrait of the balance of power that we hope will be more accurate.
The current conventional wisdom: “LREM is in trouble in the polls. The RN is even in the lead in some of them!”
First observation: Macron’s LREM is not in trouble. Its median support is stable, at around 22.5%, with a 5-in-6 chance interval (83%) between 19.5% and 25%. Similarly, the far-right RN is stable, with a 5-in-6 chance of getting between 18.5% and 24% of the vote and a median of 21.5%.
5-in-6-chance interval of the popular vote
Solid lines represent the median share of the popular vote for each party. Shaded areas show the range in which the true popular vote lies, with a 5-in-6 chance (83%). So a hypothetical range from 20% to 25% with a 22.5% median means the party has a 5-in-6 chance of getting 20% to 25% of the popular vote, with a median share of 22.5%.
Why take a 5-in-6 chance as a benchmark? Think of it as the probability of rolling anything but a 6 with a fair die. Hover over the chart to see the details; you can hide or display a party by clicking on its name in the legend.
So arguing that En Marche suffered particularly last week, or conversely that the RN has surged, is not factual - it is cherry-picking. It is always possible to find isolated surveys showing the RN progressing, but that is precisely what we often repeat: isolated data points are subject to random variations, to which the average is less sensitive because the variations tend to cancel each other out - one goes up, the other goes down. The average reduces statistical noise.
In other words, if the trend outlined by a particular poll does not hold true in the average, it is likely that this poll was an outlier rather than the first sign of a large pivot. Don't worry: if the majority of polls start showing an increasing RN, the average will reflect this, and the model will pick up on it.
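A quick simulation illustrates why an average of polls is more stable than any single poll. The numbers below (a 22.5% true vote share, 1,000 respondents per poll, 10 polls per week) are illustrative assumptions for the sketch, not parameters of our model:

```python
import numpy as np

rng = np.random.default_rng(42)

true_support = 0.225   # hypothetical true popular-vote share
sample_size = 1000     # assumed respondents per poll
n_polls = 10           # polls aggregated in a weekly average
n_simulations = 10_000

# Each poll estimates a proportion from `sample_size` respondents,
# so its sampling noise has standard deviation sqrt(p*(1-p)/n).
polls = rng.binomial(sample_size, true_support,
                     size=(n_simulations, n_polls)) / sample_size

single_poll_sd = polls[:, 0].std()
average_sd = polls.mean(axis=1).std()

print(f"sd of a single poll:  {single_poll_sd:.4f}")
print(f"sd of a 10-poll mean: {average_sd:.4f}")
# The average's noise shrinks by roughly a factor of sqrt(10) ≈ 3.2.
```

The average is about three times less noisy than any one poll, which is exactly why an isolated "RN surge" in a single survey carries so little information on its own.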
But then, why is the RN leading in some polls if it has not gained ground? For the same reasons, but let me put it differently: since the model estimates that the RN has a 3-in-10 chance of finishing first, it is not surprising to see this party in the lead in several polls - if I had to bet, I would say in about 30% of them.
Distribution of the gap between RN and LREM
In fact, out of the 10 surveys released last week, the RN came out on top in... 3 of them. Full disclosure: I did the calculation with my 1-in-3 prior already in mind, and I must admit I didn't think it would match so well! Check it for yourself with the table at the bottom of the forecast page (Harris April 28 to Harris May 4). In short, the election is tight, so it is normal for the winner to change from poll to poll - the opposite would be surprising.
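That back-of-the-envelope check can be made precise. If we treat the 10 weekly polls as independent draws in which the RN leads with probability 0.3 - an assumption, since real polls share methodologies and field dates and are therefore correlated - the binomial distribution tells us how surprising "3 out of 10" really is:

```python
from math import comb

p, n = 0.3, 10  # assumed per-poll lead probability, number of polls

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

prob_exactly_3 = binom_pmf(3, n, p)
prob_3_or_more = sum(binom_pmf(k, n, p) for k in range(3, n + 1))

print(f"P(RN leads in exactly 3 of 10 polls) = {prob_exactly_3:.2f}")  # ~0.27
print(f"P(RN leads in 3+ of 10 polls)        = {prob_3_or_more:.2f}")  # ~0.62
```

"Exactly 3 of 10" is in fact the single most likely outcome under the 1-in-3 prior, so the week's polls are entirely consistent with a stable RN.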
But if you read us regularly, you are not learning anything new - that is what the model has been saying for the past two months: the election has been close between the two favorites since early March, and it still is today. In March, LREM and RN had respectively a 5-in-9 and a 3-in-10 chance of finishing first - exactly the same level as today.
If En Marche declines further, then Nathalie Loiseau may have cause for worry. But with more than two weeks to go before the election, her list can also rise or, most likely, remain at the level it has held for the past two months.
What are the other interesting fights?
After the sermon on averaging, here is the sermon on correlation: when you want to compare two parties, calculate the difference between the two, do not just compare their respective intervals or distributions.
For example, if you take the first graph of our forecasts, superimposing the distributions of the far-left LFI and the green EELV will not give you the distribution of the difference between the two. If you want the difference, you have to compute it directly. Why? Because parties are correlated - and with threshold effects, what affects one party does not have a linear impact on the others, as we wrote last time.
This effect is all the more noticeable for parties close to the 0-seat threshold:
Distribution of the gap between the Socialists (PS) and the green party (EELV)
It is clear here that the distribution of the difference between the two parties is not at all the same as the distribution of each of them (see the first graph of our forecasts). The lesson: the green party is now clearly favoured over the Socialists, whereas the two were tied in November 2018.
The Green Party is now playing on an equal footing with Mélenchon’s LFI in what is the closest duel of this election:
Distribution of the gap between far-left LFI and EELV
In broad terms, each party has a 2-in-5 chance of finishing ahead of the other, and there is a 1-in-5 chance of a tie. If nothing changes, this election will mark a sharp drop for LFI (-11 points compared to the 2017 presidential election), whether it ends up ahead of EELV or not.
At the other end of the spectrum, you have the duels involving the Republicans (LR), which are interesting for their certainty rather than their uncertainty. The right-wing party is virtually assured - according to our model and current data - of finishing ahead of EELV and LFI, but behind RN and LREM.
Here is the most "uncertain" of these duels; you will find the other three on the forecast page:
Distribution of the gap between RN and LR
LR therefore has a 7% chance of doing as well as or better than the RN. This remains unlikely, but we are no longer in the realm of the negligible: we are approaching a 1-in-10 chance, which allows us to imagine scenarios where, for various reasons, the polls are seriously mistaken in favor of LR.
Out of the 800 polls and 15 elections in our database, pollsters were off by an average of 1.5 points (in one direction or the other), all parties combined. They also overestimated the far right by an average of 2 points - an error in the direction LR needs - a trend strongly influenced by the last five elections.
Of course, since the median gap between the two parties is 5 points, all the planets would have to align in LR's favor for these scenarios to occur - which is why the model indicates a probability that is low but clearly greater than 0. In comparison, the 6+ points separating the median support of LREM and LR would seem to require a much larger than usual polling error. Hence the 3% chance of doing as well as or better than En Marche - still not impossible, but very unlikely.
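To make that reasoning concrete, here is a rough sketch - emphatically not the actual PollsPosition model - of how a polling-error scenario turns a 5-point median gap into a small but non-negligible flip probability. The normal error with a 3.4-point spread is an assumption chosen purely for illustration, so that the result lands near the model's 7%:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Illustrative numbers: a 5-point median gap between RN and LR, and an
# assumed aggregate error on that gap (loosely combining the ~1.5-point
# average per-party error and the ~2-point far-right overestimation
# mentioned above; the 3.4 value is chosen for illustration only).
median_gap = 5.0
error_sd = 3.4

simulated_gaps = rng.normal(median_gap, error_sd, size=n)
prob_lr_catches_up = (simulated_gaps <= 0).mean()

print(f"P(LR does as well or better than RN) ≈ {prob_lr_catches_up:.1%}")
```

The same arithmetic shows why the 6+ point LREM-LR gap yields a much smaller probability: the required error sits further out in the tail of the same distribution.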
Focusing on ties throws away information
You may have noticed the grey bar in the duel graphs. It represents the probability that the two parties under consideration will obtain the same number of seats: the higher the bar, the more likely the tie. This seems useful to us because each duel is in fact a "truel" - party 1 wins, party 2 wins, or both parties end up equal - so we must not forget this scenario, which is generally quite likely.
But there is another side to the coin: focusing only on this case - noticing only the polls that show LREM and RN tied, for example, just as you see more red cars on the street when you actively look for them. In doing so, you discard much of the available information.
Take the LFI-EELV duel, the tightest of this election and the one where the "tied" scenario is most likely, with nearly a 1-in-5 chance. By focusing on that 1 chance in 5, we forget what happens 4 times out of 5 - we throw away 80% of the available information.
That's why we try not to focus on one very specific scenario, and every time we see analyses focusing on a poll where two parties are tied, we ask ourselves: why focus on this particular survey? Do the other polls also show these parties in a tie? What confirmation biases can lead the author to focus on this scenario?
Do polls and the model take turnout into account?
We often get this question. For polls, it really depends on the pollster and I will be careful not to speak on their behalf. In broad terms, pollsters ask a question to registered voters to distinguish between those who will vote and those who won’t - this question changes with the pollster. Then they ask those who think they have a chance to vote - even a small chance - for their voting intention. So, voting intentions do take turnout into account - although to varying degrees. This seems logical: it is difficult to imagine pollsters asking people who are sure they will not vote for their voting intention.
So I think what we are really being asked is: "Do polls take into account the uncertainty implied by turnout?" The answer is no: polls report averages, and the margins of error that accompany them represent only sampling error - the error due to surveying only a part of the population - and not the other sources of error.
To quantify that uncertainty, you need a model. The methods we use are designed precisely for this purpose. For that matter, the error the model simulates is not conceptually restricted to turnout: it can be due to a methodological error by pollsters, a media event on the eve of the election, a terrorist attack, etc. But the result is always the same: these variables generate uncertainty.
This leads me to the conclusion: dozens of variables other than turnout influence the results of an election, so beware of the "red car effect" (if you focus too much on turnout, you may forget the other variables). Especially since turnout's black-swan potential is doubtful: historically, it rarely changes much from one election to the next - it is far more likely to fall from 60% to 55% than from 60% to 20%, for example. There are also floor and ceiling effects - voters who always or never vote, in any election. All of this limits turnout's disruptive potential.
Finally, turnout may not be the most scientifically interesting variable. It does not explain why some voters do not vote: it is an aggregate of other variables that potentially explain the absence of voting - weather on Election Day, lack of interest in public life, difficulties with voter registration, misunderstanding of the election, lack of representativeness of ideas... These may be more interesting to focus on.
Do you have more questions about the model or the campaign? Send them to us - we'll be delighted to answer them!