Wednesday, June 18, 2008

Wednesday Math, Vol. 28: Margin of Error and Confidence of Victory

A few weeks ago, I discussed the normal distribution and the bell-shaped curve. The idea is that you can take a raw score from a test, find the average and the standard deviation of all the scores, and create a z-score that corresponds to the raw score: subtract the average from the raw score, then divide by the standard deviation. From there you can see how that z-score relates to the rest of the population. For instance, from the picture on the left, we see the light blue area from -1 to 1 standard deviations accounts for 68.2% of the population, so about two thirds of the data of a normally distributed set will have z-scores between -1 and 1. On the right, it says that if a raw score turns into a z-score of 1.18 standard deviations, about 88.1% of the population will be below that z-score and 11.9% will have a z-score higher than that.
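For readers who like to check the pictures with a few lines of code, here is a minimal sketch using Python's standard library. The function name `z_score` and the raw score, average, and standard deviation in the example are made up for illustration; only the ranges -1 to 1 and the z-score of 1.18 come from the pictures.

```python
from statistics import NormalDist

# The standard normal curve: average 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def z_score(raw, mean, sd):
    """Convert a raw score to a z-score (hypothetical helper name)."""
    return (raw - mean) / sd

# Hypothetical test: raw score 85, average 70, standard deviation 10.
example_z = z_score(85, 70, 10)  # 1.5 standard deviations above average

# Fraction of the population with z-scores between -1 and 1.
within_one_sd = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

# Fraction of the population below a z-score of 1.18.
below_1_18 = std_normal.cdf(1.18)

print(f"z-score for the made-up test: {example_z:.2f}")
print(f"between -1 and 1: {within_one_sd:.1%}")
print(f"below 1.18:       {below_1_18:.1%}")
```

The middle figure comes out within a tenth of a percentage point of the 68.2% in the picture, and the last one matches the 88.1%.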

This is also the idea behind the margin of error in polling data. If a poll says the margin of error is +/- 3.2%, this has to do with the idea of a confidence interval. If the polling number says 46%, that's the best estimate given that particular sample, but the data says that 38 times out of 40 the true number is between (46-3.2)% = 42.8% and (46+3.2)% = 49.2%. The other 2 times out of 40 are evenly split: 1 time the estimate will be too high and 1 time too low. 38 out of 40 is usually written as 95%. This is the confidence level, though few newspapers take the time to explain this, the New York Times being a major exception. This is measuring the center, like the multi-colored picture on the left.
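Here is a rough sketch of where a number like +/- 3.2% can come from. It uses the standard textbook formula for the margin of error of a proportion, which is an assumption on my part about how a pollster would compute it. The sample size of 940 is hypothetical, chosen only because it gives a margin close to 3.2%, and the function name `margin_of_error` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Textbook normal-approximation margin of error for a proportion.

    p: polled proportion (0.46 for 46%)
    n: sample size (hypothetical here; not given in the poll above)
    """
    # z-value that leaves (1 - confidence)/2 in each tail, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)

poll = 0.46
moe = margin_of_error(poll, 940)  # 940 is an assumed sample size

# The confidence interval from the example: 46% plus or minus 3.2%.
low, high = poll - 0.032, poll + 0.032

print(f"margin of error with n = 940: {moe:.1%}")
print(f"interval: {low:.1%} to {high:.1%}")
```

Note that the margin shrinks with the square root of the sample size, so cutting it in half takes four times as many respondents.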

I came up with a different way to look at the data in a two-way contest that I call the Confidence of Victory number, which measures a left section and a right section of the normal curve. It works best when the top two vote-getters are pulling in 95% or more of the votes, but is still useful even if they are getting at least 85% of the vote. Let me do an example from a recent Obama-McCain poll in Virginia.

Obama out-polled McCain 45% to 44% in a poll of 500 likely voters, with 11% undecided or voting for some other candidate. Since the two numbers add to 89%, we can use the Confidence of Victory method. Multiplying the percentages by 500, there were 225 Obama voters and 220 McCain voters, for 445 decided voters in all. In this new smaller sample of the decided, Obama has a lead of 50.6% to 49.4%, and the standard deviation is the square root of (50.6% * 49.4%/445) = 2.4%. This means Obama's z-score is (50.6% - 50%)/2.4% = .24, working from the unrounded figures. A z-score of .24 corresponds to a percentage of .5948, or about 59.5%. This says that if the election were held today, this poll result, as close as it is, still lets the pollsters say with 59.5% confidence that Obama will win Virginia's 13 electoral votes, while McCain's confidence of victory number is 40.5%.
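The Virginia arithmetic can be sketched in a few lines of Python. The function name `confidence_of_victory` and its parameter names are invented for this illustration, and the code simply follows the steps described above: drop the undecided, recompute the leader's share among the 225 + 220 = 445 decided voters, and see how far that share sits above 50%.

```python
from math import sqrt
from statistics import NormalDist

def confidence_of_victory(leader_pct, trailer_pct, sample_size):
    """Confidence of Victory for the leader in a two-way contest.

    Percentages are given as fractions (0.45 for 45%).
    """
    leader_votes = leader_pct * sample_size
    trailer_votes = trailer_pct * sample_size
    decided = leader_votes + trailer_votes   # smaller sample of the decided

    share = leader_votes / decided           # leader's share of the decided
    sd = sqrt(share * (1 - share) / decided) # standard deviation of that share
    z = (share - 0.5) / sd                   # distance above a 50-50 tie

    return NormalDist().cdf(z)               # area of the curve below z

obama = confidence_of_victory(0.45, 0.44, 500)
mccain = 1 - obama

print(f"Obama:  {obama:.1%}")
print(f"McCain: {mccain:.1%}")
```

Because the code keeps every decimal place instead of rounding the z-score to .24, its answer can differ from the 59.5% above by about a tenth of a percentage point.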

After both conventions, if there isn't a serious third party candidate pulling in a lot of votes, I'll be keeping track of the confidence of victory numbers in all 50 states and Washington, D.C. to keep my loyal readers abreast of how the electoral college battle is going, since that's what a presidential race is all about. But today, I'm just 'splainin' a particular math idea I had, because this week, It's All About Me™!

8 comments:

Karlacita! said...

Me likes - but I still have a question about this polling data, or any polling data in this age of the Do Not Call list.

I've been in a telephone political poll exactly once in my entire life. I've never been polled in any other way, and I don't know anyone else who has.

Who is answering these polls? Is it just people who don't know about Do Not Call lists? If so, are they on the waaaaaaay left side of the curve? If so, how would one correct for that?

And is there data comparing the polls to the actual outcome?

Nonhypothetical Question Asker wants to know!

FranIAm said...

karlacita- as someone who may or may not have worked for a company that may or may not have done research over the phone let me say this about the do not call lists.

they don't apply to research or polling companies... just telemarketers.

also companies that you have done business with do retain some "rights" to be able to call you.

maybe.

Distributorcap said...

this of course assumes all the data is normal

people who will vote for mccain are not normal...hence the bell curve might be more like a ding-dong

8-)

Matty Boy said...

Take it from FranIAm, sister mine.

Or don't. No pressure.

There is the matter of cell phones. I've heard that polling companies were forbidden from calling the cellies, which means the available public would skew towards the ALL YOU KIDS GET OFF MY LAWN crowd.

Like me.

Maybe FranIAm could enlighten us. Or not. Don't want to breach any confidentiality agreements. That she may or may not have.

no_slappz said...

matty boy,

Neither Judith Miller nor Josh Wolf (if he is the person I think he is) was jailed for BLOGGING or expressing opinions.

So if your goal was to compare apples and oranges, okay, you succeeded.

Wolf, it would appear, was aiding and abetting criminals. Miller's problems go into different territory.

no_slappz said...

matty boy,

I will bet -- max one dollar -- that John McCain wins the presidency.

I'll spare you the reasoning, but say only that this will be the election where the early polls miss by a mile.

In short, Obama will become the McGovern-Carter-Dukakis candidate wrapped in one taco shell.

He's a perfectly decent fellow. But, to paraphrase Al Capone (Robert DeNiro) speaking to Eliot Ness (Kevin Costner) in The Untouchables, "He's got nuthin."

Matty Boy said...

Josh Wolf is a blogger and journalist. He was jailed for not giving up sources, much like Judith Miller was. She had the New York Times behind her; he's on his lonesome. The government's position in both cases is the aiding and abetting of criminals. The defense's position in each case is freedom of the press.

Josh Wolf DID go to jail for blogging. Had he published nothing, the cops and the grand jury wouldn't have come after him.

As for your bet, I will decline for two reasons.

1. I've made no prediction, I've only made my preference known.

2. I don't trust you to pay up, seeing the steady level of dishonesty you have shown over the few months I have had the displeasure of reading your crap.

The only bet I might take is that I get to guest blog on your blog for a week if Obama wins, and you get to guest blog for a day if McCain wins. I offer this uneven payoff because I don't think you have many readers.

no_slappz said...

matty boy,

If my house were burglarized and I learned from reading the neighborhood blog that one of my neighbors videotaped the breaking & entering, I'd expect him to give the tape to the cops.

My Brooklyn neighborhood does have a blog and one of my neighbors has a video camera aimed at the sidewalk in front of his house. A couple of years ago, a murder was committed in daylight in view of the camera. My neighbor willingly gave the video to the police.

Remarkably about a month before the murder, a vicious mugging occurred in the same spot in the late afternoon on a sunny day. As with the murder video, the homeowner was quick to help the cops who then supplied the video to the local news channels. I believe the video led to the capture of the two muggers.

Capture became irrelevant in the murder case. It involved a drug deal in which the buyer decided stabbing the dealer in the neck with a knife was a better idea than paying for his purchase. The dealer died. Apparently the killer suffered enough remorse after learning the dealer was dead that he committed a public-service suicide to close the case.

Anyway, since Josh Wolf and my neighbor were not the perpetrators, there is no reason for either to refuse to help. I wouldn't expect Wolf or any other video-taper to provide evidence of his own guilt, but this is something else.

Again, he wasn't jailed for BLOGGING despite your assertions.

Apparently he made a public statement -- undoubtedly available to every person on the planet with an Internet connection -- that came to the attention of the police. Thus, he landed in hot water because he publicly implicated himself in the case.

The only principle he's defending is one he created -- his principle that citizens have a right to protect criminals and help them evade capture.

Of course if he'd kept his fingers off the keyboard for a while, the police would not have known he had compelling evidence.

Martha Stewart got herself in hot water because she lied to a federal agent. At the federal level there are no Miranda rights. She didn't know that and exposed her imperious attitude toward the agent who questioned her, and then she lied to him. That's a prosecutable offense, which she discovered the hard way.

If she had kept her mouth shut from the start or told the agent to speak to her attorney, she would have walked away from the situation without a care in the world.

As for the bet, we can drop the one-dollar cash component. You don't have to declare a choice. You can have the option of razzing me if I am wrong, or ignoring me if I am right.

As for the payoff, I accept your blogging offer.