Wednesday Math, Vol. 28: Margin of Error and Confidence of Victory

A few weeks ago, I discussed the normal distribution and the bell-shaped curve. The idea is that you can take a raw score from a test, find the average and the standard deviation, and create a z-score that corresponds to the raw score. From there you can see how that z-score relates to the rest of the population. For instance, from the picture on the left, we see the light blue area from -1 to 1 standard deviations accounts for 68.2% of the population, so about two thirds of the data in a normally distributed set will have z-scores between -1 and 1. The picture on the right says that if a raw score turns into a z-score of 1.18, about 88.1% of the population will be below that z-score and 11.9% will have a z-score higher than that.
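For readers who would rather check those percentages than take the pictures on faith, here is a minimal sketch using Python's standard library. The raw score, average and standard deviation (81.8, 70 and 10) are hypothetical numbers of mine, picked only so the z-score comes out to 1.18; they are not from any real test.

```python
from statistics import NormalDist

# The standard normal curve: average 0, standard deviation 1.
standard_normal = NormalDist()

def z_score(raw, average, std_dev):
    """Convert a raw score to a z-score."""
    return (raw - average) / std_dev

# Hypothetical test: average 70, standard deviation 10, raw score 81.8.
z = z_score(81.8, 70, 10)

# Share of the population with z-scores between -1 and 1.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

# Share of the population below and above a z-score of 1.18.
below = standard_normal.cdf(1.18)
above = 1 - below

print(f"z-score: {z:.2f}")
print(f"between -1 and 1: {within_one_sd:.1%}")
print(f"below 1.18: {below:.1%}, above 1.18: {above:.1%}")
```

The middle figure comes out a hair above the 68.2% in the picture, which is just rounding.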

This is also the idea behind the margin of error in polling data. If a poll says the margin of error is +/- 3.2%, this has to do with the idea of a confidence interval. If the polling number says 46%, that's the best estimate given that particular sample, but the data says that 38 times out of 40 the true number is between (46-3.2)% = 42.8% and (46+3.2)% = 49.2%. The other 2 times out of 40 are evenly split: 1 time the poll will be too high and 1 time too low. 38 out of 40 is usually written as 95%. This is the confidence level of the interval, though few newspapers take the time to explain this, the New York Times being a major exception. This is measuring the center, like the multi-colored picture on the left.
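Here is a rough sketch of that arithmetic in Python. The interval endpoints come straight from the numbers above. The two helper functions are my own illustration and assume the textbook formula for a simple random sample, with the cautious choice p = 0.5 that many pollsters report; real polls often adjust for weighting, so treat the implied sample size as a ballpark figure, not something the poll stated.

```python
from math import sqrt
from statistics import NormalDist

def critical_z(confidence=0.95):
    """z-score that leaves (1 - confidence)/2 in each tail of the curve."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def margin_of_error(n, p=0.5, confidence=0.95):
    """Textbook margin of error for a simple random sample of size n."""
    return critical_z(confidence) * sqrt(p * (1 - p) / n)

def implied_sample_size(margin, p=0.5, confidence=0.95):
    """Sample size that would produce the given margin (same assumptions)."""
    return (critical_z(confidence) / margin) ** 2 * p * (1 - p)

estimate, margin = 0.46, 0.032
low, high = estimate - margin, estimate + margin
n = implied_sample_size(margin)

print(f"95% confidence interval: {low:.1%} to {high:.1%}")
print(f"z for 95% confidence: {critical_z():.2f}")
print(f"implied sample size, if simple random sampling: about {n:.0f}")
```

Under those assumptions, a +/- 3.2% margin goes with a sample of somewhere around 940 people.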

I came up with a different way to look at the data in a two-way contest that I call the Confidence of Victory number, which measures a left section and a right section of the normal curve. It works best when the top two vote-getters are pulling in 95% or more of the votes, but is still useful even if they are getting at least 85% of the vote. Let me do an example from a recent Obama-McCain poll in Virginia.

Obama out-polled McCain 45% to 44% in a poll of 500 likely voters, with 11% undecided or voting for some other candidate. Since the two numbers add to 89%, we can use the Confidence of Victory method. Multiplying the percents by 500, there were 225 Obama voters and 220 McCain voters, or 445 decided voters in all. In this new smaller sample of the decided, Obama has a lead of 50.6% to 49.4%, and the standard deviation is the square root of (50.6% * 49.4%/445) = 2.4%. This means Obama's z-score is (50.6% - 50%)/2.4% = .24. A z-score of .24 corresponds to a probability of .5948, or about 59.5%. This says that if the election were held today, this poll result, as close as it is, still lets the pollsters say with 59.5% confidence that Obama will win Virginia's 13 electoral votes, while McCain's confidence of victory number is 40.5%.
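The same steps can be sketched in a few lines of Python. The function name confidence_of_victory is just my label for the method described here, and the code assumes what the method assumes: a two-way race, with the decided voters treated as a simple random sample.

```python
from math import sqrt
from statistics import NormalDist

def confidence_of_victory(leader_votes, trailer_votes):
    """Chance the poll leader is truly ahead among decided voters."""
    decided = leader_votes + trailer_votes
    share = leader_votes / decided                  # leader's share of the decided
    std_dev = sqrt(share * (1 - share) / decided)   # standard deviation of that share
    z = (share - 0.5) / std_dev                     # distance from a 50-50 tie
    return NormalDist().cdf(z)                      # area of the curve below z

sample_size = 500
obama = round(0.45 * sample_size)    # 225 voters
mccain = round(0.44 * sample_size)   # 220 voters

obama_cov = confidence_of_victory(obama, mccain)
mccain_cov = 1 - obama_cov

print(f"Obama: {obama_cov:.1%}, McCain: {mccain_cov:.1%}")
```

Because the code carries every decimal place instead of rounding the z-score to .24 along the way, its answer for Obama lands a touch below the 59.5% worked out by hand, at roughly 59.4%.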

After both conventions, if there isn't a serious third-party candidate pulling in a lot of votes, I'll be keeping track of the confidence of victory numbers in all 50 states and Washington, D.C. to keep my loyal readers abreast of how the Electoral College battle is going, since that's what a presidential race is all about. But today, I'm just 'splainin' a particular math idea I had, because this week, It's All About Me™!

Karlacita! said...

Me likes - but I still have a question about this polling data, or any polling data in this age of the Do Not Call list.

I've been in a telephone political poll exactly once in my entire life. I've never been polled in any other way, and I don't know anyone else who has.

Who is answering these polls? Is it just people who don't know about Do Not Call lists? If so, are they on the waaaaaaay left side of the curve? If so, how would one correct for that?

And is there data comparing the polls to the actual outcome?

Nonhypothetical Question Asker wants to know!

FranIAm said...

karlacita- as someone who may or may not have worked for a company that may or may not have done research over the phone let me say this about the do not call lists.

they don't apply to research or polling companies... just telemarketers.

also companies that you have done business with do retain some "rights" to be able to call you.

maybe.

Distributorcap said...

Matty Boy said...

Take it from FranIAm, sister mine.

Or don't. No pressure.

There is the matter of cell phones. I've heard that polling companies were forbidden from calling the cellies, which means the available public would skew towards the ALL YOU KIDS GET OFF MY LAWN crowd.

Like me.

Maybe FranIAm could enlighten us. Or not. Don't want to breach any confidentiality agreements. That she may or may not have.

