Thursday, September 16, 2010

Wednesday Math (one day late), Vol. 127: A nice website with a small problem

Okay, there's this website called Maps 'n' Stats, a great source of demographic info for the 50 states and D.C. I particularly like how fine a demographic split they have, so you can see in what states they are breeding like rabbits and in what states they are dying like flies.

For example, Utah. Both Rabbits and Flies. Lots of babies, not so many old people.

So far, so good.

But if we add up all the percentages for Utah's demographics we get 99.2%. In California, the numbers are different, but the sum is also 99.2%. In Texas, the sum is 99.3%.


The website is nice enough to give us the raw numbers, so I plugged the Utah numbers into my spreadsheet to find out why the totals were so far off from 100% and consistently low.

It appears that the nice folks who run the site never learned about rounding up. All the numbers are simply truncated, so 9.38% is written as 9.3% instead of 9.4%.
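Here is a minimal sketch of the difference in Python. The function names and the tiny three-group head count are made up for illustration; they are not the website's numbers or code. Each truncation throws away somewhere between 0 and 0.1 of a percentage point, so the more categories there are, the further the total drifts below 100%.

```python
import math

def truncate_to_tenth(pct):
    """Chop off everything past the first decimal place (what the site appears to do)."""
    return math.floor(pct * 10) / 10

def round_to_tenth(pct):
    """Round to the nearest tenth of a percent."""
    return round(pct, 1)

print(truncate_to_tenth(9.38))  # 9.3
print(round_to_tenth(9.38))     # 9.4

# Hypothetical head counts for three groups, NOT real data from the site.
counts = [3, 3, 1]
total = sum(counts)
percents = [100 * c / total for c in counts]

truncated_sum = sum(truncate_to_tenth(p) for p in percents)
rounded_sum = sum(round_to_tenth(p) for p in percents)

print(round(truncated_sum, 1))  # 99.8
print(round(rounded_sum, 1))    # 100.1
```

Proper rounding does not guarantee a total of exactly 100% either, but the misses go in both directions and tend to cancel, while truncation can only miss low.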


As a grateful educator, I hope the folks from that website fix this bug in the near future. It's so easy to get this right.


ken said...

So is there an explanation why some of the slots are 5 years wide and others are 10? It makes it hard to really see the distribution.

Matty Boy said...

Not really. I understand splitting the younger ages up into five year blocks, but I'm mystified why it's 55-59 and 60-64 instead of 55-64.

I'm making my stat students merge some of the five year blocks together.

Ron said...

What is a good way to combine the five year blocks for the younger ages? There are five of them, so how can you combine them two at a time?

Also some of the details will get lost when you make your stat students merge the five year blocks together. Is that a wise thing to do?

Is there any way to just leave them at five years and ten years and still really see the distribution?

Matty Boy said...

My idea is to leave 0-4 as a five year block and merge the other five year blocks to make 5-14, 15-24 and 55-64. This means the first bar and the last bar are special cases while the bars in the middle are comparable one to another.
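For anyone who wants to try the merge outside a spreadsheet, here is a rough sketch in Python. The block labels are my assumption about how the website splits the ages, and the percentages are placeholders I made up so the arithmetic is easy to check; they are not any state's real numbers.

```python
# Which five year blocks get absorbed into which ten year block.
MERGES = {
    "5-14": ["5-9", "10-14"],
    "15-24": ["15-19", "20-24"],
    "55-64": ["55-59", "60-64"],
}

def merge_blocks(percents, merges):
    """Add up the listed five year blocks; leave every other block alone."""
    absorbed = {old: new for new, olds in merges.items() for old in olds}
    merged = {}
    for label, pct in percents.items():
        target = absorbed.get(label, label)
        merged[target] = merged.get(target, 0.0) + pct
    return merged

# Placeholder percentages (hypothetical, chosen to sum to 100).
percents = {
    "0-4": 9.0, "5-9": 8.0, "10-14": 8.0, "15-19": 9.0, "20-24": 10.0,
    "25-34": 15.0, "35-44": 12.0, "45-54": 11.0, "55-59": 4.0,
    "60-64": 3.0, "65-74": 5.0, "75-84": 4.0, "85+": 2.0,
}

merged = merge_blocks(percents, MERGES)
for label, pct in merged.items():
    print(f"{label:>6}: {pct:.1f}%")
```

The merged blocks come out in the same age order as the originals, with 0-4 and 85+ left as the two special cases.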

There are those who like to make the area of a rectangle proportional to the percent and make the width of each proportional to the number of years that define a block. I tried to teach the method, but I always ran into horrible resistance. This is probably because I was taught the method by a guy whose pedagogy is weak. One obvious problem is that it makes it look like each year in the block is equal to all the others when we know that isn't true. And most importantly, how wide should the 85+ block be?
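To make the area method concrete, here is a small sketch in Python: the height of each bar is the percent divided by the number of years in the block, so that height times width gives back the percent. The blocks and percentages are made up, and the `open_ended_width` parameter is my own invention to show that the 85+ bar forces you to pick a width out of thin air.

```python
def bar_heights(blocks, open_ended_width):
    """Return percent-per-year for each block.

    blocks: list of (label, low_age, high_age, percent); high_age is None
    for an open-ended block such as 85+.
    open_ended_width: the width in years we ASSUME for the open-ended block.
    """
    heights = {}
    for label, low, high, pct in blocks:
        width = open_ended_width if high is None else high - low + 1
        heights[label] = pct / width
    return heights

# Hypothetical percentages, not real data.
blocks = [
    ("0-4", 0, 4, 9.0),
    ("5-14", 5, 14, 16.0),
    ("85+", 85, None, 2.0),
]

print(bar_heights(blocks, open_ended_width=15))
print(bar_heights(blocks, open_ended_width=5))
```

The first two bars come out the same either way, but the 85+ bar is three times as tall with an assumed width of 5 years as with 15, which is exactly the problem.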

I know I haven't seen the optimal way to do this, but I am confident that all the ways I've seen so far leave much to be desired. Something trapezoidal is more likely to be valid, but I wouldn't try to teach it to intro stat students unless there is some brilliantly easy way to put it in Excel. Maybe the last block should decrease logistically, but that's a hard idea added to what should be an easy concept.

The search continues.

Ron said...

I didn't realize it was so complicated. Thanks for 'splainin'. I like your idea of combining two five year blocks to make ten.

And hey, sorry you got stuck with somebody with weak pedagogy teaching you. Rotten luck. Couldn't you just explain this to him?

But can you help me with this? For the ten year blocks in the picture from the website, doesn't the picture make it look like each year in the block is equal to all the others in the same block, because the whole block is even? Like you say, that isn't really true. So is it really okay to draw any blocks at all, like they did in the website? Please help me out ...