Saturday, September 16, 2017

An idea for linking weather and climate

There is a difference between weather and climate, but as a mathematician I wish the demarcation point was better defined. Weather tends to deal with localities or small regions and time periods ranging from a day to several days. Climate usually discusses large regions or even the entire globe over longer periods of time. Climate scientists have decided a month is the smallest reasonable length of time when talking about climate. If I had a say, I'd prefer a season or a half year to be the smallest useful unit, where a half year would begin on the first day of spring, either in the Northern or Southern hemisphere, which means either late March or late September. This would argue that years should start on one of these dates, but that's too much to ask.

Whatever units of time are used, the numbers make a solid case that the surface temperatures of both the land and the sea are rising over time. The data does not show a constant rise, each day warmer than the last or even each year warmer than the last, but the trend over time is upward using any standard mathematical measure of a data set.

The idea I present today is currently in the hypothesis phase, as I have only worked through a little bit of data from a single weather station. I chose the Oakland Airport station because... well, I'm from Oakland. The data did what I expected more or less, but for this to be fully fleshed out, I need to get a programming language on my computer and take a rip into a very large data set.

Here is my methodology.

1. Take the daily data from a baseline set of years for a single weather station. Climate scientists right now are using 1981-2010 as the standard baseline, so I used that set as well.

2. Using that set, get an average temperature and a standard deviation for each day. If I have any quibble with this method, I would say February 29 is getting the short end of the stick, as there were only seven leap years in the set instead of thirty for every other date. In practice in the Oakland data set, the average and the standard deviation for Leap Day are not out of line with the other nearby days.

The Excel data for 2013

3. Input the daily data from any year and get 365 or 366 z-scores. The numbers on the left are the z-scores from 2013, one of the warmest of the recent years but by no means the record holder. Cells with red backgrounds and borders are the z-scores greater than 3, which makes that day very unusually hot for that data set. The z-scores in red with no border are over 2 but under 3, so they are unusually hot. Two days are marked in green; they were unusually cold, meaning z-scores under -2 but greater than -3. No days in 2013 had z-scores under -3.

4. A high z-score is not crazy hot by human standards, just crazy hot in context. For example, December 29th (ahem, my birthday) was 72° F in 2013, which right-thinking people would regard as "a nice day". The thing is, it is not normal for the weather to be that nice on December 29, as I can remember with some clarity. This example counts as a very unusually hot day.

5. Show the data from a weather station as the average temperature for the year and the number of days in each of four categories: very unusually cold, unusually cold, unusually hot and very unusually hot. That's what the graph below shows for six years in the 1960s and six years in the 2010s.
Comparing the 1960s to the 2010s in terms of unusually warm and cold days
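For anyone who wants to try steps 1 through 5 outside of Excel, here is a minimal sketch in Python. It is not the spreadsheet used for this post. The function names and the input format, a list of (year, month, day, temperature) tuples, are made up for illustration. It also assumes the sample standard deviation and treats a z-score of exactly 2 or 3 as belonging to the milder category, since the steps above don't pin down either choice.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

CATEGORIES = (
    "very unusually cold",
    "unusually cold",
    "unusually hot",
    "very unusually hot",
)


def baseline_stats(records, start=1981, end=2010):
    """Steps 1-2: mean and standard deviation for each calendar day.

    records is an iterable of (year, month, day, temp) tuples
    (a hypothetical format). Only years in start..end are used.
    """
    by_day = defaultdict(list)
    for year, month, day, temp in records:
        if start <= year <= end:
            by_day[(month, day)].append(temp)
    # stdev() is the sample standard deviation (an assumption) and
    # needs at least two readings, so sparser days are skipped.
    return {
        key: (mean(temps), stdev(temps))
        for key, temps in by_day.items()
        if len(temps) >= 2
    }


def z_scores(records, year, stats):
    """Step 3: one z-score per day of the given year."""
    scores = {}
    for y, month, day, temp in records:
        if y == year and (month, day) in stats:
            avg, sd = stats[(month, day)]
            if sd > 0:  # avoid dividing by zero on a flat baseline
                scores[(month, day)] = (temp - avg) / sd
    return scores


def categorize(z):
    """Step 4: label a z-score using the cutoffs at 2 and 3."""
    if z > 3:
        return "very unusually hot"
    if z > 2:
        return "unusually hot"
    if z < -3:
        return "very unusually cold"
    if z < -2:
        return "unusually cold"
    return "typical"


def yearly_summary(records, year, stats):
    """Step 5: yearly average plus a count for each unusual category."""
    counts = Counter(categorize(z) for z in z_scores(records, year, stats).values())
    temps = [temp for y, _, _, temp in records if y == year]
    summary = {"average": mean(temps)}
    summary.update({cat: counts.get(cat, 0) for cat in CATEGORIES})
    return summary
```

With a station's daily readings loaded into records, baseline_stats(records) would be computed once and then passed to yearly_summary for each year of interest. February 29 gets its own (month, day) key, so it is handled like any other date, just with fewer baseline readings.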

Okay, let's take a look at the data year by year.

1961: This is the warmest year in the 1960s in our Oakland Airport data, and the average is exactly the same as 2011, the coldest year in the 21st Century set. 1961 holds the record with 18 days that are very unusually hot for that particular day of the year. If we rank the years by (# of warm days) - (# of cold days), it has the highest number in the 1960s with 30, but it would still be outranked by five of the six measured years in the 2010s.

1962: 1962 turns colder than 1961 and there are only 25 unusually hot days this year, with 8 unusually cold.

1963: 1963 is colder still, and the number of cold days is greater than the number of warm days by our measuring standards.

1964: 1964 is the only year on our list with no very unusually hot or very unusually cold days.

1965: The coldest year of the twelve on the list, it is dead last on the (warm days) - (cold days) ranking system at -5. It is also dead last in total number of unusual days with 16.

1966: 1966 warms up slightly in comparison to 1965, but as the chart shows, all its entries are about the size of Trump's fingers.

2011: The year most like 2011 is 1961, but most noticeably, it starts the 21st Century trend of no very unusually cold days.

2012: 2012 is only a little warmer than 2011 and the bars are unimpressive by 21st Century standards, but it is the first year on the list with no unusually cold days whatsoever.

2013: And now it gets warm. In terms of bar heights, 2013 is most like 2011, even though the average temperature is 2.25 degrees hotter. This is the most noticeable instance of the imperfect correlation between average yearly temperature and number of unusual days, but that actually makes me happy with the data set. Perfect correlation in naturally occurring data is suspicious in such a simple measuring system.

2014: And now it gets hot. The first of three years in a row with an average temperature in Oakland over 68° F, 2014 has the highest average temperature, the most unusually warm days and zero unusually cold days.

2015: Compared to 2014, 2015 is a reversion to the mean, but it has the second highest number of unusually hot days, the second highest number of very unusually hot days, the second highest total of unusually hot and very unusually hot combined and no unusually cold days at all.

2016: Again, we see the numbers shrinking from the 2014 peak, but still higher than 2013, which had been the highest on the list when it was posted.
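The (warm days) - (cold days) ranking used in the notes above is simple enough to sketch in a few lines of Python. This is an illustration, not the spreadsheet behind the chart. It assumes "warm" means unusually hot plus very unusually hot, and "cold" means the two cold categories combined. The function names are made up, and so is the dictionary format it expects.

```python
def net_unusual_days(counts):
    """Return (# of warm days) - (# of cold days) for one year.

    counts maps category names to day counts. Treating both hot
    categories as "warm" and both cold ones as "cold" is an assumption.
    """
    warm = counts.get("unusually hot", 0) + counts.get("very unusually hot", 0)
    cold = counts.get("unusually cold", 0) + counts.get("very unusually cold", 0)
    return warm - cold


def total_unusual_days(counts):
    """Return the total number of unusual days, hot or cold."""
    return sum(counts.values())


def rank_years(summaries):
    """Sort year labels from highest to lowest net warm-minus-cold count."""
    return sorted(
        summaries,
        key=lambda year: net_unusual_days(summaries[year]),
        reverse=True,
    )


# Placeholder counts for two imaginary years, NOT the Oakland numbers.
example = {
    "year A": {"unusually hot": 10, "very unusually hot": 2, "unusually cold": 4},
    "year B": {"unusually hot": 3, "unusually cold": 6, "very unusually cold": 1},
}
ranking = rank_years(example)
```

A year with few unusual days in either direction can land in the middle of this ranking, which is why the total count is worth looking at alongside the net count.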

To repeat myself here in the conclusion, this is an interesting hypothesis, but it needs more data. I took a very large climate data set and whipped it into shape back in 2013, so this is only a matter of me applying myself once more, as well as buying a programming language package for the new computer. You may have read the book How to Be Your Own Best Friend. I must now write yet another chapter in my unpublished tome How to Be Your Own Overworked, Underpaid Grad Student. If I do put this in the pipeline of my many long-term projects, I will likely start a new blog showing the data.

Wish me luck. Or mutter to yourself that this mofo is crazy. Whatevs.


Emphyrio said...

I'm an amateur at statistics, but shouldn't the z-score be based on the mean and standard deviation of the entire population? So if you're looking at 1961 and 2016 at the extremes, the population would be all the years from 1961 to 2016, not just 1981-2010, right?

Matthew Hubbard said...

This is the standardized set used by the climate science community and they have decided it's kosher to compare years not in this set to this average. You aren't wrong. Another valid way to define the set would be every year from the earliest to the latest, but using this set of years is my nod to the community standards.

Thanks for stopping by.