Benchmarking Data Reshaping in #Alteryx

Introduction

I have written a few articles about the necessity of data reshaping when you do a lot of work in Tableau. To read my most comprehensive article, click here. In that article, I show several examples of data reshaping, with the examples increasing in complexity throughout the article.

Background

About the time that I wrote that article, I started learning how to use Alteryx. In the months since, I have come to appreciate the power that Alteryx gives me in my day-to-day work. Since data reshaping can be a time-consuming activity, I suspected that reshaping in Alteryx might be faster than using the Tableau reshaping tool, an Excel add-in (even though I love this tool). The upcoming Tableau 9 release also includes a built-in reshaping feature, which I included in the test.

The Reshaping Test

To test my intuition, I spent a few minutes investigating this concept. I took an existing project file that I previously reshaped in Excel, using the Tableau reshaping tool. The file contained 362 rows and 1462 variables (columns) that needed to be reshaped. This example isn’t a big file by the standards of my work. It is not uncommon for me to have to reshape files that have 5 to 10 thousand columns of data.
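For readers new to the idea, here is a minimal sketch of what "reshaping" means at this scale. It is plain Python, not the method used by any of the three tools tested, and the field names (`record_id`, `var_0`, ...) and values are made up for illustration. It assumes one identifier column plus the 1462 value columns, and turns each wide row into one narrow row per value column.

```python
def wide_to_long(rows, id_field, value_fields):
    """Turn one wide record per id into one (id, variable, value) record per cell."""
    long_rows = []
    for row in rows:
        for field in value_fields:
            long_rows.append(
                {"id": row[id_field], "variable": field, "value": row[field]}
            )
    return long_rows


# Hypothetical stand-in for the project file: 362 records, 1462 value columns.
value_fields = [f"var_{i}" for i in range(1462)]
wide = [
    {"record_id": r, **{name: r * 10000 + i for i, name in enumerate(value_fields)}}
    for r in range(362)
]

long_rows = wide_to_long(wide, "record_id", value_fields)
print(len(wide), "wide rows ->", len(long_rows), "long rows")
```

Under that assumption the output has 362 x 1462 rows, which is why the column count matters more than the row count for reshaping time.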

There were three parts to the test. First, I completed the test using the Excel reshaping tool that has been available to Tableau users for several years. Second, I used Tableau version 9.0 Beta to reshape the data using the “pivot” method that is now available. Last, I used Alteryx to reshape the data.

The Tableau Excel Add-In Reshaper Results

The Tableau reshaping tool (an Excel add-in) took just over two hours to complete the task. Afterwards, the reshaped file had to be loaded into Tableau, which took a couple of additional minutes. The total time was about 2 hours and 5 minutes (approximately 7,500 seconds). This work was completed in the 64-bit version of Excel.

The Tableau 9.0 Beta Pivot Reshaper Results

The Tableau 9.0 Beta “pivot” reshaping tool (a new Tableau 9 feature) took 4 minutes and 29 seconds to complete the task. The breakdown of this time was: 1:10 for the initial data connection (query), 1:33 for first-time load operations, about 40 seconds for selecting the 1462 columns to pivot, and 1:07 to complete the pivoting operation.
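As a quick check on the breakdown above, the short Python sketch below converts each step to seconds and shows its share of the total. The step labels are paraphrased from the text. The steps sum to one second more than the reported 4:29, which is expected because the column-selection time is only approximate.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes:seconds reading to whole seconds."""
    return minutes * 60 + seconds


# Step timings as reported in the text; the selection step is approximate.
steps = {
    "initial data connection (query)": to_seconds(1, 10),
    "first-time load operations": to_seconds(1, 33),
    "selecting the 1462 columns (approximate)": to_seconds(0, 40),
    "pivot operation": to_seconds(1, 7),
}

total = sum(steps.values())
reported = to_seconds(4, 29)

for name, secs in steps.items():
    print(f"{name}: {secs} s ({secs / total:.0%} of the summed steps)")
print(f"sum of steps: {total} s; reported total: {reported} s")
```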

The Alteryx Reshaping Workflow

The Alteryx workflow shown in Figure 1 took me about 2 minutes to create. It took 11 seconds to run and produce the Tableau data extract file. The total time was 131 seconds.

Transpose

Figure 1 – The three-step workflow for performing a data reshaping operation in Alteryx.

 

The Results

Figure 2 compares the results from the three approaches. The Alteryx workflow is far faster than the original Tableau reshaper tool: 131 seconds in total, of which only 11 seconds was run time, against about 7,500 seconds for the Excel add-in. The Tableau 9 reshaper tool, at 269 seconds, is a huge improvement over the original Excel add-in reshaping package and gives us an efficient method for reshaping most data (finally!). For pure speed and configurability, however, Alteryx is still the reshaping king because it is specifically designed to do these kinds of operations.

Figure 2 – Benchmark results for data reshaping in Alteryx, Tableau 9 Pivot Reshaper, and the Excel add-in data reshaping tool. The Tableau software engineers developed an outstanding reshaping tool for version 9, cutting the reshaping operation from about 7,500 seconds to 269 seconds. That is a very impressive and significant capability being added directly to Tableau version 9.
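To put the three totals on a common footing, the sketch below computes each approach's speedup relative to the Excel add-in. The numbers are the ones reported in this post. Keep in mind that the Alteryx figure includes about 2 minutes of workflow-building time, so it is not a pure run-time comparison.

```python
# Total times in seconds, as reported in this post.
timings = {
    "Excel add-in reshaper": 7500,   # about 2 h 5 min, including the load into Tableau
    "Tableau 9.0 Beta pivot": 269,   # 4 min 29 s
    "Alteryx (build + run)": 131,    # about 2 min to build plus 11 s to run
}

baseline = timings["Excel add-in reshaper"]
speedups = {name: baseline / secs for name, secs in timings.items()}

for name, secs in timings.items():
    print(f"{name}: {secs} s, {speedups[name]:.1f}x vs. the Excel add-in")
```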

 

It is not uncommon for me to be simply amazed by the time savings I realize when using Alteryx to perform data operations on a daily basis. Now Tableau unleashes this fantastic 9.0 enhancement for data reshaping. Tableau continues to blur the line between Extract-Transform-Load (ETL) operations and data visualization with this move. With this new capability being directly added to the data connection interface of Tableau, all Tableau users will experience a tremendous benefit when they need to perform a data reshaping operation. I am very impressed by Tableau 9.0 so far.

Final Thoughts

Alteryx is still likely to be my tool of choice for reshaping big, complex data sets. However, the new “pivot” reshaping operation added to Tableau version 9.0 beta is outstanding and will serve all Tableau users well into the future. The test conducted in this post is much bigger than most Tableau users will likely attempt.

Finally, Alteryx is the only method tested that can be configured to accommodate certain types of complexities in the data set. This control makes Alteryx an invaluable tool for data reshaping. An upcoming blog post will discuss this in more detail.

Many thanks to Joe Mako for suggesting today that I add the Tableau 9.0 “pivot” reshaping test to this blog post. As usual, Joe’s comments made the article better, and that is why I like to hear from him.

 

 

 
