I have written a lot about averaging global temperatures. Sometimes I write about it as a sampling problem, and sometimes from the point of view of integration.
A brief recap: averaging global temperature at a point in time requires estimating temperatures everywhere based on a sample (what has been measured). You have to estimate everywhere, even where data is sparse. If you try to omit a sparse region, you'll either end up with a worse estimate, or you'll have to specify the subset of the world to which your average applies.
The actual averaging is done by numerical integration, which generally divides the world into sub-regions and estimates those based on local information. The global result always amounts to a weighted average of the station readings for that period (month). It isn't always expressed so, but I find it useful to formulate it so, both conceptually and practically. The weights should represent area.
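To make the weighted-average formulation concrete, here is a minimal sketch. The function name and the sample numbers are illustrative assumptions, not taken from TempLS; the only point is that once each station has a weight representing the area it stands for, the global average is the weighted sum divided by the total weight.

```python
def weighted_mean(readings, weights):
    """Weighted average of station readings for one period (e.g. one month).

    readings: one temperature (or anomaly) per station.
    weights:  one non-negative weight per station, meant to represent area.
    """
    if len(readings) != len(weights):
        raise ValueError("readings and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total means the weights need not be normalised beforehand.
    return sum(w * t for w, t in zip(weights, readings)) / total_weight


# Hypothetical example: the first station stands for three times the area.
print(weighted_mean([10.0, 14.0], [3.0, 1.0]))  # 11.0
```

In this form the integration methods differ only in how they assign the weights.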
In TempLS I have used four different methods. In this post I'll display with WebGL, for one month, the weights that each uses. The idea is to see how well each represents area, and how well they agree with each other. I have added some capabilities to the WebGL system, which I will describe.
I should emphasise that the averaging process is statistical. Errors tend to cancel out, both within the spatial average and when combining averages over time, when calculating trends or just drawing meaningful graphs. So there is no need to focus on local errors as such; the important thing is whether a bias might accumulate. Accurate integration is the best defence against bias.
The methods I have used are:
- Grid cell averaging (e.g. 5x5 deg). This is where everyone starts. Each cell is estimated as the average of the datapoints within it, and weighted by cell area. The difficulty is cells that have no data. My TempLS grid method follows HADCRUT in simply leaving these out. The consequence is that the omitted cells are effectively infilled with the average of the points measured, which is often inappropriate. I continue to use it because it has often very closely tracked NOAA and HADCRUT. But the problem with empty cells is serious, and is what Cowtan and Way sought to repair.
- My preferred method now is based on irregular triangulation, and standard finite element integration. Each triangle is estimated by the average of its nodes. There are no empty areas.
- I have also sought to repair the grid method by estimating the empty cells based on neighboring cells. This can get a bit complicated, but works well.
- An effective and elegant method is based on spherical harmonics. The nodes are fitted with a set of harmonics, using least squares regression. When this approximation is integrated over the sphere, every harmonic except the first (the constant) integrates to zero. The integral is just the coefficient of the constant.
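
As an illustration of the first method in the list, the station weights implied by simple grid cell averaging can be sketched as follows. This is not the TempLS code: the function names, the 5 degree default and the unit-sphere area formula are my assumptions for the sketch. Each occupied cell contributes its area, shared equally among the stations in it, and empty cells contribute nothing, which is the omission described above.

```python
import math
from collections import defaultdict


def cell_area(lat_index, cell_deg=5.0):
    """Area of one lat/lon cell on a unit sphere, for latitude band lat_index
    counted from the south pole."""
    lat_south = math.radians(-90.0 + lat_index * cell_deg)
    lat_north = math.radians(-90.0 + (lat_index + 1) * cell_deg)
    # Longitude width (radians) times the difference in sin(latitude).
    return math.radians(cell_deg) * (math.sin(lat_north) - math.sin(lat_south))


def grid_weights(stations, cell_deg=5.0):
    """stations: list of (lat, lon) in degrees for stations reporting this month.

    Returns one weight per station, normalised to sum to 1.
    Cells with no stations are simply left out.
    """
    n_lat = int(round(180.0 / cell_deg))
    n_lon = int(round(360.0 / cell_deg))
    cells = defaultdict(list)
    for k, (lat, lon) in enumerate(stations):
        i = min(int((lat + 90.0) // cell_deg), n_lat - 1)  # keep lat = 90 in the top band
        j = int(((lon + 180.0) % 360.0) // cell_deg) % n_lon
        cells[(i, j)].append(k)

    weights = [0.0] * len(stations)
    for (i, _j), members in cells.items():
        share = cell_area(i, cell_deg) / len(members)  # cell area split equally
        for k in members:
            weights[k] = share

    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical stations: two share a tropical cell, one sits alone at high latitude.
w = grid_weights([(2.0, 2.0), (3.0, 3.0), (62.0, 100.0)])
```

Because the weights are normalised over occupied cells only, the area of the empty cells is implicitly redistributed over all reporting stations.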
The methods are compared numerically in this post. Here I will just display the weights for comparison in WebGL.