Over the past couple of weeks, I wrote two articles that explained how I used daily max and min air temperature data to visualize changes in air temperature over time. This is the third article of this series.
I am doing this work to improve my comprehension of the general topic of global climate change. I have no motivation for this work other than self-education and continuing to improve my analytics skills.
Here are the links to the previous two articles:
This data represents real measurements recorded at thousands of monitoring stations across the world, with some records going back to the early 1800s. This data is not theoretical, nor is it simulated data produced by computational models.
This data is what has really happened to the temperature of our air on planet Earth, so its validity is hard to dispute. This data is carefully collected, reviewed, managed, and published. All data that had any quality-control issues has been removed from this analysis.
My usage of this data is for descriptive analytics purposes. I use quantitative methods and visual analysis to comprehend the data. I am not trying to determine how or why the data looks like it does.
If you have any scientific curiosity about climate change, read this article and watch the videos. I think you will learn a thing or two about what is happening on our beautiful planet with respect to changes in air temperatures.
This data shows that global warming doesn’t mean that every place on the planet is warming uniformly across time, or that all areas are even experiencing warming. This is what I am learning from this work.
Global Warming Inundation
A few minutes before publishing this article, I once again saw a headline on CNN indicating that, for the third year in a row, a new global heat record had been set. Also, 16 of the 17 hottest years on record have occurred since 2000. This is one of the reasons I am focusing on comprehending the past 50 years of data. Here is a link to the article.
The following two figures present the findings from this article. These types of articles are one reason I am doing the work needed for me to understand this situation. The article states that we are now 1.1 deg C (1.98 deg F) above pre-industrial temperatures. I want to see for myself how the actual data collected around the world compares to this number.
A couple of hours after publishing this article, another article on this topic appeared. This article discusses the causes of warming that is being detected. Here is a link to that article.
Additional Data Processing
I recently used Alteryx to perform hundreds of millions of additional operations to expand and improve my understanding of how daily max and min air temperatures are changing over time. As usual, Alteryx did the job perfectly, without hesitation or complaint. I have also added some capabilities to my existing Tableau dashboards and built some new ones.
I expanded the monitoring network from the original 100-year network of 1788 stations by adding stations whose records start in 1960. Now there are nearly 4400 stations containing Tmax, Tmin, and precipitation data, with each data set occupying about 11 to 12 GB. In total, there is over 50 GB of daily data in this worldwide data set, with nearly 20,000 files and more than a hundred million daily readings each for the five weather variables.
To achieve a few of my objectives, I had Alteryx perform monthly aggregations of Tmax and Tmin, which created the primary data file I will be using in this article. This file is 4 million lines long and 534 MB in size. Click here to retrieve this csv file in zip format (74 MB).
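The monthly aggregation step described above can be sketched in a few lines of pandas. This is a minimal illustration, not the actual Alteryx workflow, and the column names (`station_id`, `date`, `tmax`, `tmin`) are assumptions rather than the real file layout:

```python
import pandas as pd

# Hypothetical daily readings for one station; column names are
# assumptions, not the actual layout of the source files.
daily = pd.DataFrame({
    "station_id": ["US0001"] * 4,
    "date": pd.to_datetime(
        ["1970-01-01", "1970-01-02", "1970-02-01", "1970-02-02"]),
    "tmax": [10.0, 12.0, 15.0, 17.0],
    "tmin": [1.0, 3.0, 5.0, 7.0],
})

# Roll daily Tmax/Tmin up to monthly means per station, mirroring the
# monthly aggregation step performed in Alteryx.
monthly = (
    daily.assign(year=daily["date"].dt.year, month=daily["date"].dt.month)
         .groupby(["station_id", "year", "month"], as_index=False)[["tmax", "tmin"]]
         .mean()
)
print(monthly)
```

Each output row is one station-month, with `tmax` and `tmin` holding the mean of that month's daily values.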
I built several additional workflows to process the monitoring station data, directly from the flat files, in a more comprehensive way. One day I’ll publish all of the workflows used to complete this work, but for now, some of the workflows are shown below.
For figures like the one shown below, clicking on the figure opens full-screen versions of all the figures in the collection, which you can easily scroll through left and right.
How I Use The Dashboards To Comprehend the Data
In the two videos below, I demonstrate the usage of the dashboards to show some of the findings I have been trying to understand. In the first video, I make a simple mistake when showing the results for March. I said the data shows cooling, when it actually shows warming. I didn’t want to remake the video.
I had some fun creating a series of dashboards to document the max air temperature changes that have happened over the past 50 years. These have really helped me understand how much temperatures are changing and whether or not the monthly observed trends are continuously increasing or variable. I am very interested in seeing if this data can be used to make predictions and/or to do some additional strategic computations.
I used a fixed set of monitoring stations to avoid any temperature creep that could happen by the introduction of new monitoring stations over time. I generated dashboards for the following conditions:
- The entire world (3983 stations), every month discrete (Jan, Feb, etc), and all months together
- The entire United States (2322 stations), for selected months, and all months together
- Selected locales, including Antarctica, Texas, Nebraska, and Alaska, for selected months. I also included a plot of the 1440 stations with data from 1900 forward to 2010.
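The fixed-station-set requirement mentioned above can be sketched as a coverage filter: keep only stations whose records span the full analysis window, so that stations entering the network mid-period cannot skew the trend. The station IDs and the 1966–2015 window here are illustrative assumptions, not the actual selection criteria used:

```python
import pandas as pd

# Hypothetical yearly records for three stations. Station "B" only
# begins reporting in 1990, so including it would introduce the kind
# of "temperature creep" the fixed station set is meant to avoid.
records = pd.DataFrame({
    "station_id": ["A", "A", "B", "B", "C", "C"],
    "year":       [1966, 2015, 1990, 2015, 1966, 2015],
    "tmax":       [20.0, 22.0, 25.0, 26.0, 15.0, 16.0],
})

# Keep only stations whose record covers the whole window.
coverage = records.groupby("station_id")["year"].agg(["min", "max"])
fixed_set = coverage[(coverage["min"] <= 1966) & (coverage["max"] >= 2015)].index
filtered = records[records["station_id"].isin(fixed_set)]
print(sorted(fixed_set))  # station "B" is excluded
```

A stricter version could also require a minimum number of readings per year, since a station that merely brackets the window may still have large gaps.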
If you take the time to look at the figures, you will see how interesting the trends are for Tmax over the past 50 years. Each month has unique characteristics that have helped me understand the changes that have occurred over time.
Set 1 – The Entire World
Set 2 – The Entire United States
Set 3 – Miscellaneous Locations
I found it very interesting that it has not been getting hotter in July in Texas. I guess it is hot enough already! I was also keen to see how much Nebraska has been heating up. The 6.5 degree increase over 50 years in March is very obvious, and 2012 was a scorcher for that state.
Using Tableau Level of Detail Calculations
In the third video, I discuss the usage of Level of Detail (LOD) calculations in Tableau. I chose to document this because it shows how powerful and versatile this approach can be when used on large data sets like this one.
The following two figures show the LOD calculations discussed in the video. The second calculation is used to produce the change in Tmax over time relative to the first decade selected in the dashboard.
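The idea behind the second LOD calculation, expressing each decade's mean Tmax as a change relative to the first decade, can be replicated outside Tableau. The sketch below shows the equivalent computation in pandas, with made-up decade values; it is not the Tableau formula itself:

```python
import pandas as pd

# Hypothetical monthly Tmax values tagged by decade. The goal is the
# same as the LOD calculation: mean Tmax per decade, reported as a
# delta from the first (baseline) decade.
df = pd.DataFrame({
    "decade": [1970, 1970, 1980, 1980, 1990, 1990],
    "tmax":   [20.0, 22.0, 22.0, 24.0, 23.0, 25.0],
})

decade_mean = df.groupby("decade")["tmax"].mean()
baseline = decade_mean.loc[decade_mean.index.min()]  # earliest decade
delta = decade_mean - baseline  # baseline decade maps to 0.0
print(delta)
```

In Tableau, the FIXED LOD expression serves the same role as the `groupby` here: it computes the per-decade mean independently of the view's level of detail, so the baseline can be subtracted row by row.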
I have envisioned using this data to do some more quantitative analysis in Alteryx. I'm not going to say what it is yet, but I'm excited to see whether my concept is feasible. Stay tuned for more of this work.
If you want to use any of these dashboards, they are available on my Tableau public site and I’ll be adding more as I complete them.