In Part 4 of this new Alteryx-user series, I utilize the Alteryx/Tableau techniques shown in Parts 1 – 3 in a comprehensive example using data from the National Football League (NFL). I chose the NFL because the regular season begins in about a week and Americans love this sport (myself, included). Once I crunched the numbers, there were a few surprises that emerged.
I wanted to explore NFL franchises to see how much money they operate with annually, how the total franchise valuations vary, how ticket prices range across the teams, and a few other things. I was particularly interested in seeing if any of these measures could be related to demographic information from the regions where each team plays. I also wanted to learn a little bit about the businesses around stadiums, so I chose to investigate restaurants around each stadium location. All of these questions were answered in a few hours of work using Alteryx and Tableau, as you will see in this post.
If you like this article and would like to see more of what I write, please subscribe to my blog by taking 5 seconds to enter your email address below. It is free and it motivates me to continue writing, so thanks!
I started the project with a few look-ups on the internet which took only a couple of minutes. I found databases for all 32 stadium locations, 2013 average ticket prices, and various financial measures for each team. I assembled these data into one data table in Excel. I used this table to send information to Alteryx. The 32 NFL teams that were investigated are shown with their logos in Figure 1.
Where Can You Eat Before The Game?
I used the stadium latitude and longitudes to calculate 30-minute drive time polygons around each stadium. This assumption probably works pretty well for most teams, but for teams like Green Bay, people travel much further to attend the games. However, for this demonstration, this assumption is just fine. Figures 2 and 3 show the drive time polygons for each of the 16 NFC and AFC teams. These polygons are used throughout this work to extract business and demographic data as described below.
These polygons were drawn in Tableau and a special technique was used to plot the team name at the centroid of each polygon. This technique involves using dual axis maps in Tableau and click here if you want to learn how to do this.
Using these polygons as “cookie-cutters”, I extracted family restaurant data from the Dunn and Bradstreet database using SIC code 58120500, as shown in Figure 4 for each team polygon.
Figure 5 shows how restaurants are related to the drive time polygons.
After removing duplicate data from the restaurant database, I calculated the drive-time distance from each stadium to each restaurant, as shown in Figure 6.
Once these calculations were finished, I created a Tableau dashboard to visualize and investigate the restaurants around each stadium (purple dot), as shown in Figure 7. Figure 8 is a tabular listing of restaurants that correspond to your search radius around the stadium. An interactive dashboard was created with hyperlinks to the restaurant websites, as well as dashboard to visualize the 30-minute drive time polygons. Click here to access the Tableau public workbook that contains this information for all the teams and their stadiums.
Financial and Demographic Analysis of NFL Teams
Using the 30-minute drive time polygons to extract information, I did a scan across about 2,600 demographic variables from a variety of databases that are accessible in Alteryx. Figure 9 shows the databases I chose for this analysis. Note that not all variables in these categories were selected. If that had been the case, I would have been looking easily at more than 10,000 variables.
The demographic variables are obtained from the databases as shown in Figure 10.
After Alteryx retured this data in CSV form as shown in Figure 11, I did a bit of work on the data to allow me to visualize certain things. Of course, I used Tableau to accomplish the visualizations.
To start the analysis, there are five key financial measures that I assembled from the internet search. Figures 12 and 13 show these measures for each of the 32 teams. The measures include current valuation, annual operating revenue, operating income, average ticket prices and debt/value ratios.
The objective of using Alteryx to extract demographic variables was for me to be able to identify variables that show both positive and negative correlations to the five key measures shown in Figures 12 and 13. To start, I looked across categories of information such as Occupation and Employment as shown in Figure 14. Although all of the variables shown in Figure 14 showed positive correlations, Figure 15 showed how some categories are negatively correlated to the key measures. A Tableau Public workbook has been created to allow users to look for demographic correlations to the three key measures shown in Figures 14 and 15.
Once I determined which variables warranted further investigation, I used Tableau to determine how these variables related to any of the key measures. I created a Tableau workbook that allows me to create scatter plots of any of the five key measures (selected via a parameter) compared to any of the 2600+ demographic variables. For example,ticket price data is compared to two different demographic variables as shown in Figures 16 and 17. A Tableau Public workbook has been created to allow users to examine how demographic variables relate to any of the five key measures that are available in the database.
What Does All of This Mean?
In this post, I used the techniques I previously outlined to combine Tableau with Alteryx to make it easier for me to extract and examine data and then to quantify data relationships. This is how I, as a new user of Alteryx, have learned to use the tool effectively by using Tableau to supercharge my new Alteryx-user experience.
Although I haven’t spent a lot of time reviewing the variable correlations, there are a few question related to this data that I though are interesting and surprising.
- How can the Bears charge an average ticket price of $446 per game, and why are people willing to pay this much to see a game? I have to wonder if this data is accurate considering how much higher their ticket prices are compared to all other teams.
- Why are the Cowboys worth so much? How do they generate that much revenue? What are they doing differently than the other teams to be this financially successful?
- The 49ers already had a problem on their hands with a debt/value ratio of 53! I don’t know if this includes the cost of their new $1.3B stadium or not, but they better get the sod onto the playing field before next week! Their ticket prices have just jumped 45% with the new stadium (the fans are irate), their offense is in the “tank” this pre-season, and players are unhappy with the coaches and management. Trouble is looming in SF and it isn’t just because of the potential for larger earthquakes.
- The Packers have the best debt/value ratio of 1. They seem to do so many thing right as an organization, including avoiding debt and making wise draft choices. With the fans owning stock, intelligent management and coaches, the Packers appear to me to be a great organization.
- Why are New-Yorkers willing to pay almost $100 per game more to see a Giants game than a Jets game?
- The household income correlations that I looked at were very interesting. Go investigate them for yourself using the Tableau workbooks provided.
Upcoming in Part 5 of This Series
I’m going to discuss how you can identify and use Alteryx to eliminate data quality problems.