Friday, November 9, 2012

How hard is it to beat Nate Silver?
Apparently, not that hard at all.

I've been doing my election snapshot system, which I call Confidence of Victory, for three presidential elections now: 2004, 2008 and 2012. I have made small adjustments to my system over that time, but always in the spirit of mathematics, trying to correct problems in the simplest, most elegant way possible. Many other people do this, too, but they use the statistical method, which usually involves adding "fudge factors" to calculations. Even Sam Wang at Princeton, whose method most resembles mine, does this. Statisticians do this all the time, and mathematicians, especially someone like me trained as a pure mathematician, avoid that kind of fiddling with numbers like the plague.

To make a comparison, it would be like journalists changing quotes. It's dishonest and unethical. Even worse, the best mathematical system can avoid this nonsense and make better predictions, so it's also completely unnecessary.

A fellow named Dean Chambers did a lot more fudge factor work and came up  People paid attention to him. His method was praised by that great mathematician and physicist, Governor Rick Perry of Texas.  That should have been an early clue this guy had his head up his ass, but conservatives really did buy this, as I found out in a short back and forth with a conservative Facebook friend of my good buddy Padre Mickey, now Doctor Padre Mickey. El Doctor Padre is not conservative. As he used to say, he is only a registered Democrat because Sandinistas don't get to vote in the primaries.

I started my system in 2004. I didn't have a blog then, but I posted online to a website no longer available. My last prediction online was recorded by the Wall Street Journal and I thought Kerry had a 74% chance on Sunday because both Ohio and Florida were barely in his camp. Monday things changed a little toward Bush as Florida moved into his column but I didn't post it and it was past the Journal's deadline.

I did remember it, though. I made sure my last prediction would always include the last polls.

While my system has been refined, a great advantage is how much more data I have to work with now. The worst thing that can happen is to rely on a single poll in a difficult to determine contest. In battleground states now, there is no way they will be "underpolled".

In 2008, I had a blog. I only posted numbers on my blog on Sunday because by October there was no excitement in that race, unlike this year, when the press in general and the conservative press in particular thought this thing was neck and neck, but I did update with a final post. I had it at 353-174, with Indiana a flat footed tie in my system.

In 2008, Nate Silver was "just a blogger", but he was actually connected to Daily Kos. As you can see, his last guess had Indiana a light red, and his final numbers were 353-185.

The real result was 365-173.  Everything I put a number on was right, and Nate messed up in Indiana. We both missed District 2 in Omaha for that last electoral vote, but I didn't do the work on it and he did and got it wrong.

My final result. 50-0 with one too close to call.

Nate's final result. 54-2. He got right everything I got right, and he was right in the Maine districts and two Nebraska districts, but he screwed up in Omaha and Indiana. Where we both said something, he went 50-1-0 and I went 50-0-1, where the order of the numbers is wins, losses and abstentions.

I didn't do the Senate that year.

So here are the totals for the two elections. I made predictions in 135 races and went 133-1-1. My system said Florida was too close to call this year, but I went all stupid and tried to be a pundit. I guessed Romney and now it's Saturday morning in Oakland and I was wrong. My system said no comment, but I made a comment and my word is my bond. Nate Silver's system said slight advantage Obama, so he beat me there.

In those 135 races, Silver never abstained and went 132-3. That's two more mistakes and one fewer correct pick than I had.

Nate's system works really well, but mine works better. We almost never disagree, but when we do this is what happens. If I abstain, it's because the poll in the middle says the race is a flat footed tie. If I make a statement one way or the other, it's because a majority of the recent polls I counted favored either Obama or Romney.

When Silver's system disagrees with mine, there is some poll in the minority that is an outlier, screaming at the top of its lungs a result almost no one else agrees with, at least in degree.  Because I use the median, the screaming outlier just gets a single vote like everyone else.  Nate's system gets pulled away from the opinion of the majority and into the minority.
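Here's a minimal sketch of the "one poll, one vote" idea in Python. To be clear about what's assumed: the poll numbers are made up, the function name call_race is just for illustration, and the plain average is only a stand-in for any method that lets a big outlier pull the result. It is not Nate's actual model, and it isn't my full Confidence of Victory calculation either.

```python
from statistics import mean, median

def call_race(margins):
    """Call a race from the median poll margin (hypothetical helper).

    margins: Obama-minus-Romney leads in percentage points, one per poll.
    A median of exactly zero means the middle poll is a tie, so abstain.
    """
    mid = median(margins)
    if mid > 0:
        return "Obama"
    if mid < 0:
        return "Romney"
    return "too close to call"

# Made-up polls: four show a small Obama lead, one outlier screams Romney +9.
polls = [1, 2, 1, 2, -9]

print(median(polls))     # the outlier counts as one vote among five
print(mean(polls))       # the outlier drags the average below zero
print(call_race(polls))  # the median still sides with the majority
```

With these numbers the median is Obama +1 while the average comes out at Romney +0.6, so the one loud poll flips the averaged answer but not the median one. Feed call_race something like [2, 0, -2] and it abstains, because the poll in the middle is a tie.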

That might be to his advantage sometime, but that's not the way to bet.  In the long run, the Law of Large Numbers is on my side.  A casino doesn't get lucky when a roulette player finally goes broke. The way a casino gets lucky is when the roulette player takes out the wallet and decides to play. In the long run, the house wins, and my system is the house and Nate's isn't.

Math loves beauty as well as truth. Stats tinkers and fidgets and gets fooled by randomness way too often. So it was in the beginning and unless statisticians wake up, so it shall be at the end.

Here endeth the lesson.

1 comment:

Fran said...

There are many reasons to love the power of the Matty-based numbers! Not to mention that I knew and trusted Matty long before the name Nate Silver entered my head. And after. As someone who sucks at the maths, I need you and your careful analysis. Thus endeth the comment!