A few weeks ago, I discussed the normal distribution and the bell-shaped curve. The idea is that you can take a raw score from a test, find the average and the standard deviation of all the scores, and create a z-score that corresponds to the raw score. From there you can see how that z-score relates to the rest of the population. For instance, from the picture on the left, we see the light blue area from -1 to 1 standard deviations accounts for 68.2% of the population, so about two thirds of the data in a normally distributed set will have z-scores between -1 and 1. The picture on the right says that if a raw score turns into a z-score of 1.18, meaning 1.18 standard deviations above the average, about 88.1% of the population will be below that z-score and 11.9% will be above it.
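For readers who like to check figures like these, here is a minimal Python sketch of the z-score arithmetic. The function names and the sample test (average 70, standard deviation 10, raw score 81.8) are made up for illustration and chosen only so that the z-score comes out to 1.18. The normal curve itself is computed with `math.erf` from the standard library.

```python
import math

def z_score(raw, mean, std_dev):
    """How many standard deviations a raw score sits above the average."""
    return (raw - mean) / std_dev

def fraction_below(z):
    """Fraction of a normally distributed population with a z-score below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical test results, invented for this example
z = z_score(81.8, 70.0, 10.0)
below = fraction_below(z)
within_one = fraction_below(1.0) - fraction_below(-1.0)

print(f"z-score: {z:.2f}")
print(f"below: {below:.1%}   above: {1 - below:.1%}")
print(f"between -1 and 1: {within_one:.1%}")
```

The last figure may differ from the picture's 68.2% in the final digit, since the picture appears to round each half of the light blue area separately.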
This is also the idea behind the margin of error in polling data. If a poll says the margin of error is +/- 3.2%, it is describing a confidence interval. If the polling number says 46%, that's the best estimate given that particular sample, but the data says that 38 times out of 40 the true number is between (46-3.2)% = 42.8% and (46+3.2)% = 49.2%. The other 2 times out of 40 are evenly split: 1 time the poll's estimate will be too high and 1 time too low. 38 out of 40 is usually written as 95%. This is the confidence level that goes with the interval. Few newspapers take the time to explain this, the New York Times being a major exception. The margin of error measures the center of the curve, like the multi-colored picture on the left.
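The poll arithmetic can be sketched the same way. The helper name below is made up for illustration, and the 1.96 figure is the standard textbook number of standard deviations on either side of the center that captures the middle 95% of a normal curve. It is an assumption added here, not something taken from any particular poll.

```python
import math

def fraction_below(z):
    """Fraction of a normally distributed population with a z-score below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

poll = 46.0     # reported polling number, in percent
margin = 3.2    # reported margin of error, in percentage points
low, high = poll - margin, poll + margin

# Assumed cutoff: 1.96 standard deviations on each side of the center
middle = fraction_below(1.96) - fraction_below(-1.96)
each_tail = (1.0 - middle) / 2.0

print(f"interval: {low:.1f}% to {high:.1f}%")
print(f"middle: {middle:.1%}   each tail: {each_tail:.1%}")
```

Each tail works out to about 1 chance in 40, which is where the "1 time too high and 1 time too low" split comes from.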
I came up with a different way to look at the data in a two-way contest that I call the Confidence of Victory number, which measures a left section and a right section of the normal curve instead of the center. It works best when the top two vote-getters are pulling in 95% or more of the votes, but it is still useful as long as they are getting at least 85% of the vote. Let me do an example from a recent Obama-McCain poll in Virginia.
Obama out-polled McCain 45% to 44% in a poll of 500 likely voters, with 11% undecided or voting for some other candidate. Since the two numbers add to 89%, we can use the Confidence of Victory method. Multiplying the percents by 500, there were 225 Obama voters and 220 McCain voters, for 445 decided voters in all. In this new smaller sample of the decided, Obama has a lead of 50.6% to 49.4%, and the standard deviation is the square root of (50.6% * 49.4%/445) = 2.4%. This means Obama's z-score is (50.6% - 50%)/2.4% = 0.24. A z-score of 0.24 corresponds to a proportion of 0.5948, or about 59.5%. This says that if the election were held today, this poll result, as close as it is, still lets the pollsters say with 59.5% confidence that Obama will win Virginia's 13 electoral votes, while McCain's confidence of victory number is 40.5%.
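The steps of the Virginia example can be strung together in a short Python sketch. The function name and its return layout are hypothetical, chosen for this illustration; the arithmetic follows the paragraph above, with the normal curve again computed from `math.erf`.

```python
import math

def confidence_of_victory(leader_pct, trailer_pct, sample_size):
    """Return (leader's share of decided voters, std dev, z-score, confidence)."""
    leader_votes = leader_pct / 100.0 * sample_size
    trailer_votes = trailer_pct / 100.0 * sample_size
    decided = leader_votes + trailer_votes       # drop undecided and other
    share = leader_votes / decided               # leader's share of the decided
    std_dev = math.sqrt(share * (1.0 - share) / decided)
    z = (share - 0.5) / std_dev                  # distance above a 50-50 tie
    confidence = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return share, std_dev, z, confidence

# The Virginia poll: 45% to 44% among 500 likely voters
share, std_dev, z, conf = confidence_of_victory(45, 44, 500)
print(f"share of decided: {share:.1%}   std dev: {std_dev:.1%}")
print(f"z-score: {z:.2f}   leader: {conf:.1%}   trailer: {1 - conf:.1%}")
```

Because the code carries full precision through each step instead of rounding to 50.6%, 2.4% and 0.24 along the way, its final number can land about a tenth of a percentage point away from the hand calculation.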
After both conventions, if there isn't a serious third-party candidate pulling in a lot of votes, I'll be keeping track of the confidence of victory numbers in all 50 states and Washington, D.C. to keep my loyal readers abreast of how the electoral college battle is going, since that's what a presidential race is all about. But today, I'm just 'splainin' a particular math idea I had, because this week, It's All About Me™!