Nearly 4 years of work and 240+ articles later, I now have my 3danim8’s Blog word cloud thanks to Alteryx and Tableau. That took a long time for me to accomplish because I had to write a lot of technical articles and do a lot of thinking.
The significance of this work isn’t necessarily the word cloud results, word analytics, or even the insights I learned about my writing style (all of which I really liked). The explosive potential of this method, the true innovation of the work, is encapsulated in the Alteryx workflow that generated the data used in the analysis.
The Value of Word Clouds and Word Analytics
Word clouds may not seem like much at first, but when they are done properly, they can tell insightful stories. I know because I have done studies like this a few times in my career. They have always been labor intensive, but they typically gave me keen insights into what was being studied.
In the first video shown below, I tell the story of how I made my word cloud using Alteryx. Inspired by an R script written by a very talented guy named Ben Hamner of Kaggle, I was able to achieve something that has eluded me for a very long time. I consider this accomplishment very satisfying and significant for a few reasons.
Regarding my desire to do this work, I offer the following story. Last year I asked a co-worker to perform this work for me using the mega-capable system known as IBM Watson. He routinely works on Watson, and Watson is designed to do exactly this kind of work, plus a whole lot more.
Initially he told me that it should not be a problem, so I sent him the list of articles I had assembled at that time. A few days later, he told me that the Watson system he was using could not read content from internet links like the ones I had sent him. It could only do text analytics on documents that had been ingested in PDF format.
I must admit, I was disappointed. This is why the work of Ben Hamner appealed to me when I saw it. I knew that Ben had accomplished what I had tried to do about a year earlier. However, when I looked at Ben’s approach, I knew that I didn’t want to take the time to learn all the R-language nuances necessary to successfully perform the work.
Work like this amounts to a retrospective study of yourself. By doing it, I was performing descriptive analytics on a long-term project that I have been working on for years. Gathering the data to do it successfully, however, was tricky and required the help of a master named Ned.
Compared to what Ben did in R, the Alteryx solution is elegant, more understandable and offers me some serious potential. Now I have to explore that potential to see if I’m right. In the second video shown below, I go deep into the method to explain how the magic worked for me to extract the words I wrote in the articles.
In this article, I will tell you how I had a vision, how I got a little help to fulfill the vision, and how Alteryx crushed a mountain of information to produce a very specific and meaningful word cloud.
Why is this significant? Because for me, writing 3danim8’s Blog has been a labor of love, and one for which I’ve made a lot of sacrifices. It hasn’t been easy, but it is enjoyable. If you want to explore the history of this blog to find topics of interest, simply click here.
The Word Cloud For 3danim8’s Blog
If you asked me to describe what 3danim8’s Blog represents, I couldn’t explain it any better than the following figure does. I have written each of these 27 words at least 250 times in the 300,000+ words of this blog. This means that these words form a consistent theme for the blog.
As I look at each word, I understand its significance. Obviously, Tableau and Alteryx are near the top in terms of total usage. The other words signify the intent and spirit of 3danim8’s Blog, and for this reason, I like and appreciate each one of them. One day I might add to this article a series of bullets that explain the significance of each word.
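For readers who want to try the counting step outside Alteryx, here is a minimal Python sketch of the "at least 250 times" filter. It is not the Alteryx workflow itself. The function name `frequent_words` and the `stop_words` parameter are my own illustrative choices, and the idea that common filler words get excluded before counting is an assumption, not something taken from the workflow.

```python
from collections import Counter


def frequent_words(words, min_count=250, stop_words=frozenset()):
    """Return (word, count) pairs for words used at least min_count times.

    `stop_words` is an assumed extra: a set of filler words ("the", "and", ...)
    to leave out of the count.
    """
    counts = Counter(w for w in words if w not in stop_words)
    # most_common() sorts from the most used word to the least used word.
    return [(w, n) for w, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    # Tiny made-up word list, with a lower threshold so something qualifies.
    sample = ["tableau"] * 3 + ["alteryx"] * 2 + ["the"] * 5 + ["data"]
    print(frequent_words(sample, min_count=2, stop_words={"the"}))
```

The list that comes back is already sorted by usage, which is the shape a word cloud tool typically wants: one row per word with its count.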
The Idea and How I Made It Happen
When I try to describe my passion for the combined power of Alteryx and Tableau, many people cannot fully appreciate what I now know through experience. This is why I make videos to tell stories of how work like this can be accomplished using these remarkable tools.
In the following video, an amazing example of this power is demonstrated in a relatively small workflow that produces a variety of strategic output. This work and its results will enable me to finish my analysis of the 3danim8 blogging experiment. For this, I am thankful.
If you are so inclined and want to learn and be inspired, watch the video and appreciate what this workflow does. Afterwards, ask me for a copy of the workflow so you can try it for yourself. Once you have it, try to do the same thing for your blog. I guarantee you will learn a lot in the process, much like I have.
How long do you think it takes Alteryx to run this workflow? Remember, it has to hit the WordPress website 240+ times, grab all the files, process all words, do all the operations shown, and write a series of output files.
Well, this runs in about 1 minute. Yep, one minute. I have no idea how all of this can actually happen that fast! Such is the allure of Alteryx.
Inside the Magic of Alteryx: Word Parsing Using Regex
The video shown below steps through the workflow to explain how the HTML stream, which is full of complexity and HTML tags, gets parsed into words. This is the magic of this article.
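To give a feel for what regex-based word parsing looks like in code, here is a rough Python sketch of the general idea: drop script and style blocks, strip the remaining tags, decode HTML entities, and pull out the words. The patterns and the function name `html_to_words` are assumptions of mine for illustration. The expressions used in the Alteryx workflow are shown in the video and are not reproduced here.

```python
import html
import re

# Assumed patterns, for illustration only.
SCRIPT_STYLE_RE = re.compile(r"<(script|style)\b.*?</\1>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
WORD_RE = re.compile(r"[a-z0-9]+(?:['’][a-z]+)?")


def html_to_words(page_html):
    """Turn one page of HTML into a list of lowercase words."""
    text = SCRIPT_STYLE_RE.sub(" ", page_html)  # remove script/style blocks
    text = TAG_RE.sub(" ", text)                # remove the remaining tags
    text = html.unescape(text)                  # e.g. &amp; becomes &
    return WORD_RE.findall(text.lower())


if __name__ == "__main__":
    page = "<p>Alteryx &amp; Tableau don't <b>stop</b>.</p><script>var x = 1;</script>"
    print(html_to_words(page))
```

Regular expressions are a blunt tool for HTML in general, but for pulling the words out of blog article pages, a simple strip-the-tags approach like this is often good enough.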
The following three figures are fairly self-explanatory, but here are three insights regarding me as a writer.
The histograms of word length shown below indicate that I have only one writing style, irrespective of the topic I am writing about. I am not writing articles for any audience in particular. My writing is balanced and honest. I choose the same types of words and word combinations to explain the various topics I’m covering. The remarkable consistency of this figure really surprised me.
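A word-length histogram like this is simple to compute once the words have been extracted. The Python sketch below is only an illustration of the calculation, not the Alteryx or Tableau step I used. The function name `word_length_distribution` is hypothetical, and it assumes a non-empty word list.

```python
from collections import Counter


def word_length_distribution(words):
    """Return {word length: share of all words}, sorted by length.

    Assumes `words` is not empty.
    """
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}


if __name__ == "__main__":
    # Made-up sample; in practice, pass in the words from one article or topic.
    print(word_length_distribution(["a", "to", "of", "data"]))
```

Using shares rather than raw counts makes it possible to compare a short article against a long one, or one topic against another, on the same scale.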
36% of my articles are fewer than 1,000 words long and 75% of them are fewer than 2,250 words long. The longest is about 7,200 words. My most frequent article length is between 750 and 1,000 words. You will have to wait for the final analysis article to understand how article readership is related to article length.
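Numbers like these come from two small calculations on the list of article lengths: the share of articles under a given word count, and a histogram with fixed-width bins. Here is a hedged Python sketch of both. The function names, the 250-word bin width, and the sample lengths are all made up for illustration and are not my actual data.

```python
from collections import Counter


def share_below(lengths, limit):
    """Fraction of articles with fewer than `limit` words."""
    return sum(1 for n in lengths if n < limit) / len(lengths)


def length_histogram(lengths, bin_width=250):
    """Count articles per bin, keyed by the lower edge of each bin."""
    bins = Counter((n // bin_width) * bin_width for n in lengths)
    return dict(sorted(bins.items()))


if __name__ == "__main__":
    # Hypothetical article lengths, in words.
    lengths = [400, 800, 900, 1500, 3000]
    print(share_below(lengths, 1000))
    print(length_histogram(lengths))
```

The bin with the highest count in the histogram is the "most frequent article length" described above.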
Number of Words Written
I have written over 300K words, using over 13,600 unique words, numbers, and phrases across all of the articles. It looks like I have periods in which I write more frequently or write longer articles. That is part of the big magic of blogging. Creativity is infectious and it is intermittent.
Now I’m nearing the end of the analysis for the blogging experiment. I will complete a little more modeling, a little more thinking, and then I’m going to wrap it up. I can’t believe the end of that experiment is now so close. It took me over 2.5 years to do the experiment and over 1 year to do the analysis. Thanks for reading and/or watching.