9/10/2018
Optimizing Tableau Workbooks
If you're not one of the 100 followers of Katarzyna "Kasia" Gasiewska then you're missing out on some great visualizations that are sure to pop up on her Tableau Public profile. Her very first post on Twitter was her recent Iron Viz entry Water, Water Everywhere, and when she posted a new viz last week, it caught my eye immediately. The viz was DID YOU SRSLY NAME ME THAT? and it was the Tableau Viz of the Day last Friday. When I saw the image of the viz I was immediately drawn to it, but before I even clicked the link, I noticed her Tweet about the "way-too-long initial loading time". As soon as I clicked on her viz I felt her pain.
Exploring the Viz
In addition to the long loading time (the viz took more than a full minute to load for me on Tableau Public), clicking between boy/girl names also took a very long time to filter. When I explored the viz, I didn't see anything too complex (a bar chart, a radial bar chart and a dot plot). So I decided to download the workbook and take a look under the hood. At that point the pain increased significantly. It was like going to a doctor who asks you to rate the pain on a scale of 1 to 10, and this viz was rating high on that scale. It was slow enough loading and interacting on Tableau Public, but when downloading it and actually trying to move pills around to make changes to the viz, it quickly reached the top of my pain scale.
At that point, I felt bad for Kasia, because I know the hours that can go into designing a viz like this and I could just imagine her trying to build this viz, one step at a time, and having to wait, minute by minute, for the viz to update when she adjusts the shapes, color, or the size slider, or makes any other change to her viz. We traded a few messages and she has updated her viz, but I decided to write this post to describe some things that I did to help improve the speed of her visualization and offer a few links and resources for others who might wander into this poor performance zone.
If you want to look at this for yourself then you can download her original workbook here.
Evaluation of the Workbook
There are many resources out there on "how to optimize your Tableau workbook". I would encourage you to read a bit on the subject, even if you've never encountered these types of performance issues. Understanding just a few tips and techniques can save a huge amount of time in the long run.
Below are a few things that I did to help improve this workbook. Note: this workbook can be optimized further. Kasia made these improvements plus a few others and her viz is now much faster than her original version.
1. There was an extra data source in the file that wasn't being used. This was an easy fix. Right-click on the data source and select Close. If you aren't sure whether it's being used, no worries, Tableau will warn you before it closes a data source that is being used. In this case, the size of the TWBX file dropped almost in half.
2. There were 1.8 million rows of data, but most of this data was not being used in the visualization. The most granular level of detail in this viz is the dot plot. This dot plot shows the top 10 names from 1900 to 2014 for both girls and boys. That means that we have 1.8 million rows of data to show 115 years (1900-2014) * 10 (for the top 10) * 2 (boy/girl). That's a ton of extra rows to show 2,300 data points. One solution for this is to trim the data down to what is really needed.
3. There were a ton of calculations being done along the way. First to sum the count of each name, then another calculation to rank that sum of the count. Then one set of calculations for boy/girl size and another set for shape. These four calculations have a complex if-then-elseif-then-elseif-then-else-end structure, often with an OR statement included. In addition, the output of these calculations was a string, for example, "Size1" and "Size2".
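The row arithmetic in item 2 can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the variable names are mine, and the 1.8 million figure is the rounded number quoted above.

```python
# Back-of-the-envelope check of the row counts quoted in item 2.
first_year, last_year = 1900, 2014
years = last_year - first_year + 1   # inclusive range of years
top_n = 10                           # top 10 names per year
genders = 2                          # boy / girl

rows_needed = years * top_n * genders
rows_in_file = 1_800_000             # rounded figure from the post

print(years)                         # 115
print(rows_needed)                   # 2300
print(f"{rows_needed / rows_in_file:.2%} of the rows are needed")
```

In other words, only a fraction of one percent of the rows in the file ever reach the most detailed chart.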
Resolving these Issues
The best thing I've read on Tableau performance is this whitepaper written by Alan Eldridge. This whitepaper is basically a mini-book on the subject and it's an incredible resource covering the broad range of topics necessary to improve efficiency in Tableau workbooks. Yes, it's 88 pages, but it's a must-read if you are trying to make your workbooks more efficient.
Here are the summary points from Alan's whitepaper:
There is no silver bullet for inefficient workbooks. Start by looking at the performance recorder to understand where the time goes. Long-running queries? Lots of queries? Slow calculations? Complex rendering? Use this insight to focus your efforts in the right direction.
The recommendations in this document are just that: recommendations. While they represent a level of best practice, you need to test if they will improve performance in your specific case. Many of them can be dependent on the structure of your data, and the data source you are using (e.g. flat file vs. RDBMS vs. data extract).
Extracts are a quick and easy way to make most workbooks run faster.
The cleaner your data is and the better it matches the structure of your questions (i.e. the less preparation and manipulation required), the faster your workbooks will run.
The majority of slow dashboards are caused by poor design, in particular too many charts on a single dashboard, or trying to show too much data at once. Keep it simple. Allow your users to incrementally drill down to details, rather than trying to show everything then filter.
Work with the information you want and no extra – each when it comes to the fields you reference in addition to the granularity of the data you come back. It permits Tableau to generate fewer, higher, sooner queries and reduces the quantity of information that must be moved from the information supply to Tableau’s engine. It additionally reduces the scale of your workbooks so they’re simpler to share
and open sooner.
While reducing the data, make sure you use filters efficiently.
Strings and dates are slow, numbers and booleans are fast.
Let's apply a few of these techniques to Kasia's workbook and see if we can improve the speed of this visualization.
Work with the data you need and no more
Before proceeding, let's reference Tableau's Order of Operations.
Source: https://onlinehelp.tableau.com/current/pro/desktop/en-us/order_of_operations.html
Here are Kasia's measures on the Columns and Rows and the filters she applied.
The Rank is ranking the sum of Count after the Year and Gender filters are applied. Unfortunately, these filters only reduce the number of records from 1.8 million down to 1,052,480. Then a sum of the count is performed, because that is needed for the rank of that sum. Only then can a filter be applied to that rank. In other words, the calculation for sum and rank had to be done on the one million records that remain after the dimension filters. This isn't necessary, because we only need a very small number of those rows to create the view.
Alan wrote, "Work with the data you need and no more." That's great advice. This workbook has 1.8 million rows of data, but in the most granular view it only needs 2,300 rows of data. In an ideal world, we would trim it down to just the rows that are needed and use that data instead of the full data set. Instead, I will apply a quick and easy solution to trim this data down: data source filters.
Notice in Tableau's Order of Operations that data source filters (and extract filters) are applied before other filter types and long before other calculations occur in Tableau. Leveraging this can really speed up performance in this workbook. There were two quick and easy data source filters that I applied to Kasia's viz.
1. Year: the data set begins in 1880, but Kasia is only using data from 1900 to 2014. Filtering the records to "at least 1900" removes 56,000 records that are not used in the analysis.
2. Count: this is the important one. Every name, boy and girl, has a count within each year. This count is used to determine the top 10 names in each year using the rank function. The lowest count that is being used was 1,906. By adding a data source filter at 1,906, we can remove roughly 1.8 million rows of data that are not being used in the viz.
Applying these two data source filters takes the data set from 1,825,433 rows down to 24,130. That's still about 20,000 records more than we need, but it's a very quick and easy way to filter out data that is not needed, which will speed up all of the underlying calculations. This single step will increase the speed of the workbook significantly. In fact, this alone makes the viz much more usable on Tableau Public.
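To see why this kind of filter is safe, here is a minimal Python sketch of the idea outside of Tableau. The data is synthetic and the helper name top_names is hypothetical, so the row counts will not match Kasia's workbook. It only illustrates that dropping rows below the lowest count that ever makes a top 10 leaves the ranking unchanged.

```python
import random
from collections import defaultdict

def top_names(rows, n=10):
    """Return {(year, gender): names of the n highest counts, best first}."""
    groups = defaultdict(list)
    for year, gender, name, count in rows:
        groups[(year, gender)].append((count, name))
    return {
        key: [name for _, name in sorted(vals, reverse=True)[:n]]
        for key, vals in groups.items()
    }

# Synthetic stand-in for the baby-names data (made-up values).
random.seed(0)
rows = [
    (year, gender, f"name{i}", random.randint(5, 100_000))
    for year in range(1880, 2015)
    for gender in ("F", "M")
    for i in range(200)
]

# Reference result: rank everything, then keep 1900 onward.
full = {k: v for k, v in top_names(rows).items() if k[0] >= 1900}

# Lowest count that any top-10 name actually has (1,906 in the post's data).
counts = {(y, g, nm): c for y, g, nm, c in rows}
threshold = min(
    counts[(y, g, nm)] for (y, g), names in full.items() for nm in names
)

# "Data source filter": drop rows before any ranking happens.
filtered = [r for r in rows if r[0] >= 1900 and r[3] >= threshold]
trimmed = top_names(filtered)

print(len(rows), "->", len(filtered))
print(trimmed == full)
```

The ranking step sees far fewer rows, yet the top 10 for every year and gender comes out the same, which is the same trade the two data source filters make in the workbook.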
Strings/Dates vs. Numbers/Boolean
The next really helpful tip from Alan that could help improve this workbook's performance is this one: "Strings and dates are slow, numbers and booleans are fast."
Here's an unrelated example. Instead of using an IF statement to assign a highlight color as a string, we can use a boolean output.
Highlight a Color as a String:
IF [State] = [State Parameter] THEN "Blue"
ELSE "Gray"
END
Highlight a Color as a Boolean:
[State] = [State Parameter]
Notice the boolean is a much more elegant solution in this case and it will perform better on larger data sets.
Kasia's workbook had a number of calculations with complex IF statements that output a string (and they were being calculated on a million records). We can't use a boolean solution for Kasia's calculation; however, we can make these calculations faster by converting them to numbers.
Kasia's original calculation for Size as a String:
if [Circles – Boys]=0 then "Size1"
ELSEIF [Circles – Boys]=1 OR [Circles – Boys]=2 OR [Circles – Boys]=4 then "Size2"
ELSEIF [Circles – Boys]=3 then "Size3"
else "Size4" end
Here's a revised calculation to get the same results, but instead of a string it outputs a number:
case [Circles – Boys]
when 0 then 1
when 1 then 2
when 2 then 2
when 4 then 2
when 3 then 3
else 4
END
Note: another option in this case would be to group them, for example 1, 2 and 4 grouped as 2, and use the group on size, without using a calculation.
In the end, the changes were pretty straightforward. Kasia was able to make a few minor changes, reducing the data to what she needs in the viz and updating a few calculations, and the performance of the workbook increased significantly. Using Tableau's Performance Recorder in Tableau Desktop (Help menu -> Settings and Performance -> Start Performance Recording) we can see a huge difference in the performance of this workbook with these changes.
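As a quick sanity check that the numeric version really matches the original string logic, the two calculations can be mirrored in Python. This is only a sketch: the function names are hypothetical and the code stands in for the Tableau formulas rather than reproducing them.

```python
def size_label(circles):
    """Mirror of the original IF / ELSEIF chain, returning a string."""
    if circles == 0:
        return "Size1"
    elif circles in (1, 2, 4):
        return "Size2"
    elif circles == 3:
        return "Size3"
    return "Size4"

# Mirror of the revised CASE statement: a lookup that returns a number.
SIZE_CODE = {0: 1, 1: 2, 2: 2, 4: 2, 3: 3}

def size_code(circles):
    """Return the numeric size, falling back to 4 like the ELSE branch."""
    return SIZE_CODE.get(circles, 4)

# The two versions agree for every input tried, including the ELSE branch.
for value in range(-2, 10):
    assert size_label(value) == f"Size{size_code(value)}"
```

The numeric output carries the same information as "Size1" through "Size4" without any string handling.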
Original Viz: 39.57 seconds to open the workbook and 23.15 seconds computing table calculations
Updated Viz: 2.075 seconds to open the workbook
I hope you find this information useful. If you have any questions feel free to email me at Jeff@DataPlusScience.com
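For a rough sense of scale, the two open times above can be compared directly. This sketch uses only the open times reported by the Performance Recorder and makes no assumption about how the table calculation time relates to them.

```python
original_open_s = 39.57   # seconds to open the original workbook
updated_open_s = 2.075    # seconds to open the updated workbook

speedup = original_open_s / updated_open_s
saved = original_open_s - updated_open_s

print(f"{speedup:.1f}x faster, {saved:.1f} s saved per open")
```

That works out to the updated workbook opening roughly 19 times faster.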
Jeffrey A. Shaffer
Follow on Twitter @HighVizAbility