Are GOP-Leaning Pollsters Biasing The Averages? (No.)

One question we keep getting is whether Republican-leaning polls are biasing our poll averages. The criticism goes like this: since the averages are only as good as the data that goes into them, a flood of polls from GOP-leaning pollsters would meaningfully bias the polling outlook rightward. This is what many call “flooding the zone”.

In reality, this is not why polls and models show a close race. Our averages control very heavily for quality, and we also adjust and down-weight partisan-affiliated polls to account for traditional bias. Under any lens, this election looks really close right now, and the nonpartisan polls simply do not look any better for Democrats than the full average does.
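To make that concrete, below is a minimal sketch of how down-weighting and adjusting partisan-affiliated polls can work inside an average. The shift and weight values are illustrative assumptions, not the actual parameters behind our averages.

```python
# Minimal sketch of partisan down-weighting in a poll average. The shift and
# weight values are illustrative assumptions, not actual model parameters.

def adjust(margin, partisan=None, shift=1.5, partisan_weight=0.5):
    """Return (adjusted_margin, weight) for one poll.

    margin: Dem margin in points (positive = Dem lead).
    partisan: None, "D", or "R", based on the poll's sponsor or affiliation.
    """
    if partisan == "R":
        return margin + shift, partisan_weight  # shift R-affiliated polls blue, down-weight
    if partisan == "D":
        return margin - shift, partisan_weight  # shift D-affiliated polls red, down-weight
    return margin, 1.0  # nonpartisan polls enter at full weight

def average(polls):
    """polls: list of (margin, partisan) tuples."""
    adjusted = [adjust(m, p) for m, p in polls]
    return sum(m * w for m, w in adjusted) / sum(w for _, w in adjusted)

# A flood of R-affiliated polls moves this average far less than a raw mean would.
polls = [(+1.0, None), (+0.5, None), (-2.0, "R"), (-3.0, "R"), (-2.5, "R")]
print(round(average(polls), 2))  # 0.0, versus a raw mean of -1.2
```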

For proof, the chart below shows the Split Ticket polling averages today, as well as what they’d be like if we used only nonpartisan, high-quality polls (with a rating of at least 2.0/3.0 stars on FiveThirtyEight). In every single case, this would leave the aggregates unchanged or move them to the right.

The “red interference” narrative simply does not hold up in 2024 — good polling aggregators exert strong controls for both quality and partisanship, and so they were never truly impacted by firms like Patriot Polling and McLaughlin in the first place.

A Harris overperformance is very possible. Our model gives her roughly a 1-in-4 chance of sweeping all of the battleground states. But if that happens, it won’t be because of the “right-wing polls”. It’d just be because the industry, as a whole, underestimated her support in this scenario. It wouldn’t have to do with anyone “flooding the zone”.

The reason everyone is examining this angle again seems to be rooted in 2022, when some awful punditry and analysis, paired with some terrible polls, created a false narrative of impending Republican hegemony. This lens pins the analytical failures of an industry on the “red wave polls” that duped an entire cohort of media pundits and forecasters, who were expecting a Democratic wipeout that never came to pass.

But it’s worth looking a bit more into what really happened that year, and what we should learn from it for 2024.

WHAT HAPPENED WITH 2022 POLLS?

In mid-October of 2022, Democrats went into the midterms looking surprisingly strong in the Senate, polling well ahead of Republicans in enough races to hold a majority of the chamber. But many Democratic leads collapsed in the final weeks of the campaign, with Democrats even falling narrowly behind in polls of Georgia and Pennsylvania. Republicans crowed and Democrats panicked, as the “red wave” had finally come.

Pundits were ready even before the collapse. Pre-mortems were already written about Democrats “peaking too early”, and insiders braced for Republicans to sweep. Hindered by inflation and an unpopular president, Democrats were destined for disaster at the polls, having unwisely focused on election denial and abortion. At least, that was the narrative the pundits foretold.

But then the results came in, and the pundits went into shock. In one state after another, Senate Democrats outran their polls. In many, they finished close to where the polls had stood before the collapse.

Poll watchers already had a culprit in mind. The final weeks saw polls flood in from GOP-aligned firms, with methodologies seemingly designed to show redder results. Thus, Democrats accused these pollsters of “flooding the zone”, skewing poll averages redder and causing them to miss. They also pilloried poll aggregators for letting this occur, particularly in races like Pennsylvania Senate, where it caused averages to wrongly show Republicans ahead.

The line was consistent, the culprit was clear, and the fallacy was obvious in everyone’s eyes: “If Republican pollsters hadn’t flooded the zone, we all would’ve done just fine.”

Everything above is the narrative as recounted, and the quoted assertion seems to be what some folks took away. So as “flooding the zone” returns to the discourse, and polls remain too close for some, an easy target and an easy correction present themselves to solve both problems at once.

As usual, the truth is more nuanced than the narrative, which makes the partisan pollsters both bogeyman and scapegoat at once. There’s a real problem with what happened in 2022, but the true culprit may be all of us.

REEXAMINING 2022

To start, let’s examine what the 2022 poll averages actually said. The two main poll aggregators are FiveThirtyEight and RealClearPolitics, with FiveThirtyEight using a time-and-quality-adjusted average, and RealClearPolitics using a simple average of some of the most recent (and typically the most Republican-favoring) polls. One prominent narrative is that before GOP-aligned pollsters sharply moved the averages rightward, the Senate poll averages were largely accurate. However, the truth is quite a bit more mixed.
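To illustrate the difference between the two approaches, here is a minimal sketch of each style of average. Neither site publishes its exact formula here, so the half-life, quality scaling, and recency window below are illustrative assumptions.

```python
from datetime import date

# Sketch of the two aggregation styles described above. The half-life,
# quality scaling, and window are assumptions, not either site's formula.

def time_quality_average(polls, today, half_life_days=14.0):
    """FiveThirtyEight-style: every poll counts, discounted by age and quality.

    polls: list of (margin, end_date, stars) tuples, stars on a 0-3 scale.
    """
    weights = [
        (stars / 3.0) * 0.5 ** ((today - end).days / half_life_days)
        for _, end, stars in polls
    ]
    return sum(m * w for (m, _, _), w in zip(polls, weights)) / sum(weights)

def recent_flat_average(polls, today, window_days=14):
    """RealClearPolitics-style: an unweighted mean of recent polls only."""
    recent = [m for m, end, _ in polls if (today - end).days <= window_days]
    return sum(recent) / len(recent)

polls = [
    (+2.0, date(2022, 10, 20), 2.8),  # older, high-quality poll
    (-1.0, date(2022, 11, 1), 1.2),   # recent, low-quality poll
    (+0.5, date(2022, 11, 3), 2.5),   # recent, high-quality poll
]
today = date(2022, 11, 7)
print(round(time_quality_average(polls, today), 2))  # keeps the older poll in the mix
print(round(recent_flat_average(polls, today), 2))   # driven entirely by the recent polls
```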

First, it’s unquestionably true that polls swung rightward in the final weeks, by ~3.5% on average. Both FiveThirtyEight’s and RealClearPolitics’s averages swung right by the same amount, with RealClearPolitics’s averages being a bit redder overall.

Second, it’s also true that the earlier Senate polls were more accurate than the final ones. However, it wasn’t by that much — FiveThirtyEight’s final Senate averages were only 0.5% less accurate, while RealClearPolitics’s were 1.8% less. This is because some averages started off relatively accurate and swung redder (Arizona, New Hampshire, Pennsylvania, and Washington), while others started off too blue, and became more accurate after the red-shift (Florida, Iowa, and Ohio).

Of course, the reason people bring this up is that they claim “flooding the zone” caused the late shift right. As such, if GOP-aligned pollsters hadn’t flooded the zone, there would have been no rightward shift, and thus the polls would have been more accurate. So it’s worth examining the claim: would the polls in the last two weeks of the 2022 election have been more accurate if we only had nonpartisan polls?

Below are Senate poll averages over the last two weeks of the 2022 campaign: one column is the FiveThirtyEight average of all polls in the state, and the other is a flat average of all nonpartisan polls. Interestingly, the nonpartisan polls were generally more accurate, and the partisan polls did shift the averages redder in several key races. But it was neither to the degree anyone seems to remember — the maximum impact topped out at 2.4% — nor did it bias the averages overall, with a statistically insignificant average impact of R+0.3%.
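That comparison is simple to reproduce. Here is a minimal sketch that computes an all-polls average, a nonpartisan-only average, and the implied partisan impact for each race; for simplicity it uses flat averages on both sides, and the poll margins are placeholders rather than the actual 2022 data.

```python
# Sketch of the table's comparison: an all-polls average versus a
# nonpartisan-only average. Margins below are placeholders, not 2022 data.

def mean(values):
    return sum(values) / len(values)

def partisan_impact(polls):
    """polls: list of (dem_margin, is_partisan) tuples."""
    all_avg = mean([m for m, _ in polls])
    nonpartisan_avg = mean([m for m, is_partisan in polls if not is_partisan])
    return all_avg, nonpartisan_avg, all_avg - nonpartisan_avg

races = {
    "Pennsylvania": [(+2.0, False), (+1.5, False), (-1.0, True), (-2.0, True)],
    "Ohio": [(-7.0, False), (-8.0, True), (-7.5, False)],
}
for race, polls in races.items():
    all_avg, np_avg, impact = partisan_impact(polls)
    print(f"{race}: all {all_avg:+.1f}, nonpartisan {np_avg:+.1f}, impact {impact:+.1f}")
```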

So we can conclude that the poll averages of some key races, like New Hampshire and Pennsylvania, did move rightward as a result of partisan polls. But even the nonpartisan pollsters undershot Democrats in many key Senate races, and blaming partisan pollsters obscures the fact that there was a small polling miss in the industry overall that overestimated Republicans in the Senate.

Yet despite all of that, it’s important to stress that these polls were still quite good. The average error across chambers was significantly lower than that of most recent cycles, and polls fared way better than they did in the disastrous 2020 cycle. If you consulted only this data, the Democratic overperformance would not have been a surprise — no matter which set of polls you consulted, you would have known that a 51-seat Democratic Senate majority was an extremely plausible outcome.

The only state with a divergence in outcome between the nonpartisan polling average and FiveThirtyEight’s was Pennsylvania, where the impact of partisan polls was real and fairly noticeable — the nonpartisan average saw Fetterman lead by 1.9, while the FiveThirtyEight average had Oz up 0.5. But this, on its own, doesn’t explain away the red wave narrative that enveloped the media, primed pundits for large GOP congressional majorities, and saw Democratic donors abandon Mandela Barnes in Wisconsin (where the nonpartisan aggregate was actually worse than FiveThirtyEight’s).

What, then, was responsible for this narrative? Here, it’s worth looking at the most critical component of analysis: just as important as the data is the way in which it is presented. And it’s clear that this is where everyone’s collective intellect was nowhere to be found.

A CASE STUDY IN PUNDIT PSYCHOLOGY

It’s worth noting that the polls were never that bad for Democrats – something pointed out at the time by many electoral analysts. FiveThirtyEight’s final generic ballot average only had Republicans leading by 1.2%. “Flooding the zone” or not, the polls were never showing a red wave to begin with. So why did media coverage and punditry turn so sharply against Democrats in the final weeks? And why was the red wave narrative so prevalent?

There were two main culprits.

Memories of 2016/2020

After polls severely underestimated Republicans in 2016 and 2020, pundits clearly adopted an assumption that this would repeat in 2022. A common scene at the time was Republicans gleefully parading polls in which they trailed by a few points, absurdly claiming them as evidence that they would win the race in the end. As the assumption went, if the data was good for Republicans, the data was accurate. If the data was good for Democrats, it was clearly suffering from another poll error. This laid the groundwork for extraordinary confirmation bias.

The data said that a Democratic majority in either chamber was only a hair less likely than a GOP majority. But in order to believe that data, one would have had to accept the fact that polling error could cut either way, which nobody seemed to want to do after the misses of 2016 and 2020. Even professional analysts were not immune from this: the forecasters at Sabato’s Crystal Ball wrote in their final writeup that “…our belief (is) that the eventual Republican margin in the overall House vote is being a little bit understated by the House generic ballot polls”.

National polling bias is not predictable or consistent. Intuitively, this makes sense: if it were as easy as just saying “add three points to the Republicans every time”, wouldn’t pollsters simply do that?

In reality, there are a variety of societal and methodological factors that can create polling error, and very few of them are obvious before the election actually happens. But pollsters do try to make adjustments for their last misses so as not to repeat them, and this does mean that error is not predictable.

Unfortunately, pundits never learned this lesson. Instead, they all spent the 2022 election fighting the last war, convinced that there was a hidden Republican vote that they were all missing once again.

Red Wave Confirmation Bias

Democrats believed a red wave was coming. Republicans believed a red wave was coming. History suggested a red wave was coming. But the data did not show that this red wave was coming.

This didn’t stop pundits from treating a red wave as inevitable. After all, most presidents experience a massive backlash in their first midterm, and the historically unpopular Biden had already gotten one in 2021. Additionally, with many pundits based in New York, Florida, and California (which had much redder elections than the nation), they could see a red wave seemingly forming right in their backyard.

Thus, as soon as polls moved redder, pundits jumped to confirm their priors. It didn’t matter that, again, polls showed merely a slightly-red year. Pundits believed a red wave was coming, and interpreted nearly everything as supporting their position, rationalizing anything that said otherwise by assuming a GOP-leaning poll error. Despite what the data showed, news coverage consistently followed this assumption.

One of the most notable examples of this was from the New York Times. In late October 2022, they released a set of four swing district House polls, with Democrats not only leading or tied in all of them, but winning handily in a Trump district, and by a landslide in another. Yet the initial headline was “Polls in Four Swing Districts Show G.O.P.’s Strength in Midterms”, a decision universally lampooned by election data analysts.

When pundits were shocked by the 2022 results, the easy scapegoat was the polls. But the truth was always there, right in front of their eyes. There was never evidence of a national red wave. Everyone simply saw what they wanted to see, and fit the data to match their theories.

Some of the prominent data aggregators did not help their case here. For example, RealClearPolitics added a “point-in-time adjustment”, where their (already-selective) polling average was further adjusted in the direction of the average polling miss across 2016, 2018, and 2020. This essentially amounted to a comical double-unskew: the site was already emphasizing and including a heavier diet of Republican-leaning polls in its averages, and then it was also adding a substantial GOP-leaning error on top of that.
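In arithmetic terms, the compounding looks something like the sketch below. Every number in it is hypothetical; the point is only that a selective average plus a historical-miss adjustment applies the same lean twice.

```python
# Hypothetical illustration of the "double unskew": a selective average that
# already leans red, plus an adjustment toward the average past miss.

neutral_average = +1.0    # what a full, quality-weighted average might show (Dem margin)
selective_average = -0.5  # an average built from a redder-leaning diet of polls

# Signed poll misses from past cycles (negative = polls overestimated Democrats).
past_misses = [-3.0, +1.0, -4.0]  # illustrative stand-ins for 2016, 2018, 2020
point_in_time_adjustment = sum(past_misses) / len(past_misses)  # -2.0

adjusted = selective_average + point_in_time_adjustment
print(adjusted)  # -2.5: the red lean is applied twice relative to the neutral average
```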

But that’s not a problem with polling as much as it is a problem with amateur data analysis and confirmation bias. A method that is designed to take only the most Republican numbers will obviously lack the benefit of a true average and will be prone to larger misses. Anyone who looked only at the FiveThirtyEight polling averages and listened to zero punditry would have had a very good set of predictions in the House, where the generic ballot polling average was almost exactly correct.

By our analysis, “un-flooding the zone” would have only “fixed” polling in one gubernatorial or Senate race: Pennsylvania’s Senate election, where there really was a big problem with GOP-aligned polling biasing the average. In every other race, it was a failure of punditry rather than a failure of polling — using polls to tease out a half-point difference between candidates is like using a chainsaw to cut a grape.

The last point is critical: it is always easy to scapegoat the polls, and there are a number of unscrupulous pollsters that do deserve criticism for methodology and presentation alike. But even the best surveys are simply not designed for the purpose people wish to use them for, especially when the average poll is still off by 5%. They cannot allow anyone to confidently infer a clear advantage from a one-point lead in the averages.
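A quick back-of-the-envelope simulation illustrates why. Modeling the polling error as normally distributed around a 1-point lead with a 5-point standard deviation is a simplifying assumption, but under it, the polling leader wins only about 58% of the time.

```python
import random

# How often does a 1-point lead in the averages survive a typical polling
# error of about 5 points? The normal-error model is a simplifying assumption.

random.seed(0)
lead, error_sd, trials = 1.0, 5.0, 100_000
wins = sum(1 for _ in range(trials) if lead + random.gauss(0, error_sd) > 0)
print(f"Leader wins in {wins / trials:.0%} of simulations")  # roughly 58%
```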

The lesson people seem to have taken, however, is that the “red wave polls flooding the zone” caused the failure in analysis in 2022, and that this is repeating itself in 2024. But this is simply untrue. The rigorous polling averages never pointed to a red wave, even after including the “red wave pollsters”. Moreover, in 2024, an average of only high-quality, nonpartisan pollsters is actually no better for Democrats than a full polling average would be.

The failure to recognize this is a failure of analysis rather than a failure of polling. Whether you consulted the FiveThirtyEight averages or simply looked at the nonpartisan polls, the only correct way to approach the 2022 election was to view it with a lot of uncertainty and to avoid applying unreasonable priors.

Similarly, under any lens, this election is a tossup for both the House and the presidency. Even in the Senate, a Democratic victory shouldn’t come as a huge surprise — Democrats have roughly the same odds of keeping the chamber that Trump had of winning the presidency in 2016. And a comfortable victory for either candidate is entirely possible — a three-point Harris or Trump win in Wisconsin or Pennsylvania is well within the scope of modeled outcomes. If any of this happens, the ingredients were right there all along.

It’s very easy to blame the data for analytical failures. But doing so will only serve to mislead, and it’s important to be upfront when we, as an industry, fail to see something coming. Because it is a lot easier to fix our own analytical lens than it is to demand that polls do something they were never supposed to do.


I’m a computer scientist who has an interest in machine learning, politics, and electoral data. I’m a cofounder and partner at Split Ticket and make many kinds of election models. I graduated from UC Berkeley and work as a software engineer.


Just another election data guy from New Jersey. Proudly competent at predicting the midterms. You can find me tracking special elections and other election-related data/nonsense on Twitter at @ECaliberSeven.