The most common and most effective recommendations to speed up slowly running VBA code in Excel are the following four:

1. Turn off the screen updating while the code is running

2. Set Excel’s calculation mode to manual while the code is running

3. Don’t select (activate) objects (ranges, charts, shapes, etc.)

4. Optimize the way of reading values from and writing values to the worksheet. Instead of looping through a range, read or write the entire block of data in one go with one single statement.

Well, you have seen those before, right? Nothing new under the sun.

However, how much can you speed up your code with these best practices?

Well, for the screen updating and the calculation mode, it depends on how your specific workbook looks like. The more complex the display is (i.e. how much rendering Excel has to do to update the screen?) and the more and more complicated formulas you have in your workbook (i.e. how long does one recalculation take?), the higher the performance gain will be. Thus, it is hard to provide universally valid numbers on the potential performance increase.

With regards to the avoidance of Select statements and the method of transferring data between Excel and VBA, however, a more or less generic test environment is possible.

Today’s post provides a small Performance Test Tool to demonstrate how much faster your VBA code can be, if you are using the best method to read data from or write data to an Excel worksheet using VBA. As always, the article comes with the Excel workbook for free download.

We will compare 6 different approaches how to read data from and write data to a worksheet range by VBA. This section will only briefly describe the methods. If you are interested in the according VBA code, please download the Performance Test Tool (download link at the end of the post) and have a look.

**Method 1 - Cell by Cell using Select**

The first method does what you definitely should not do: it loops through the entire range row by row and column by column, selects a cell and then reads the value of this cell respectively writes a value to this cell on the worksheet. The code uses the .Select method and the .Selection object.

**Method 2 - Cell by Cell (no select and no range variables)**

The second approach gets rid of the (unnecessary) Select statement. Instead of .Select and .Selection, the code uses a Range object and its .Value property. However, the code still loops through the entire range cell by cell.

**Method 3 - Cell by Cell (range variables)**

Methods 3 is pretty much the same as method 2, but it defines and uses a range variable to address the cells on the worksheet.

**Method 4 - Array Row by Row**

The fourth variation uses a 2-dimensional VBA array to read the range row by row instead of cell by cell. The first dimension of this array has an upper bound of 1 (i.e. one row). The upper bound of the second dimension of this array is the defined number of columns. The code then reads and writes one entire row of the range instead of looping through it cell by cell.

**Method 5 - Array Column by Column**

This is the transposed version of method 4: it reads and writes the range column by column, using a 2-dimensional VBA array again, but this time with the rows count as the upper bound of the first dimension and 1 as the upper bound of the second dimension.

**Method 6 - Entire Array in One Go**

Finally, the fastest way to read/write all values from/to an Excel range is to directly assign the entire range to a 2-dimensional VBA array which dimensions equals the size of the range, i.e. the upper bound of the first dimension is the count of rows, the upper bound of the second dimension is the count of columns. Why is this approach the fastest? Because there is no looping necessary. Everything can be done in one single statement.

Let’s test the methods with different range sizes, i.e. combinations of rows and columns counts.

We start with a small range of only 10 rows and 2 columns. Here are the results:

You can already see that using a Select statement is considerably slower than the other methods. However, as long as you are only dealing with very small cell ranges, the optimization potential is next to nothing.

Now, what if we increase rows to 5,000 and columns to 5, i.e. let the code read and write 25,000 cells?

Boom! Simply deleting the totally useless Select statement already decreases the runtime of reading the range from 9.5 seconds down to 0.19 seconds and writing the values from 10.7 to 0.8 seconds. With the fastest method (#6), we even get to run times of only 3 respectively 61 **milli**seconds.

Now you see it: there is a considerable potential of performance optimization, if you use the best way to exchange data between Excel and VBA.

Finally, let’s go big: 50,000 rows and 50 columns (= 2.5 million cells):

The effect scales. As we have seen before, the main performance killer is the Select statement. However, the results also show that simply using a range variable already massively speeds up the code of reading (5.8 instead of 17.8 seconds) and writing cells (49.4 instead of 68.9 seconds).

A further huge performance gain is possible with VBA arrays. In total, you can e.g. cut down the runtime of reading 2.5 million cells from 886 seconds (= almost 15 minutes) to less than one second.

Now, this is proof that you should definitely care about how you read data from and write data to a worksheet in VBA.

Please note that these results may vary depending on hardware, software, environment, other application or background tasks running, etc.

I conducted the tests shown above with a little Performance Test Tool.

The workbook consists of three worksheets.

On the first worksheet [Performance Test Single], you can test one dedicated combination of rows and columns count:

- Define the desired count of rows and columns in cells C6 and C7
- Use the checkboxes to decide which methods shall be included in the test
- Press on the command buttons “Read Methods” and “Write Methods” to run the tests
- The results section of the dashboard shows the runtime in seconds (visualized in a one column heat map), the delta to the maximum duration in % and the difference between write and read for each method

On the second sheet [Performance Test] you can test 15 combinations of rows and columns count in one go.

- Define the rows and columns combinations in range D4:R5 (myParameters)
- Click on the Command Button “Start Calculations” to start the test
- The results (runtime in seconds) are then shown in two heat map tables

The third worksheet [Input Output] is used as the input sheet for the read methods and the output sheet for the write methods. I.e. an internal worksheet, no changes necessary.

Here is the performance test tool for free download:

Download Read and Write VBA Performance Test

(zipped Microsoft Excel 2007-2016 workbook, 136K)

Please be advised that the code uses Windows functions (QueryPerformanceCounter and QueryPerformanceFrequency) to stop the time. Therefore the code will not work with Excel for Mac.

Stay tuned.

]]>The previous posts Customizable Tooltips on Excel Charts and User-defined Excel Chart Tooltips showed how to provide insightful tooltips when hovering over a data point on an Excel chart.

The idea and its implementation can be taken one step further: instead of displaying a tooltip, hovering over or clicking on a data point of an Excel chart can send this information across your dashboard to other views (charts or cell ranges). I.e. you can select one data point out of many in one view and display detailed information on this selected data point in other views.

My other favorite data analysis and data visualization application, Tableau Software, calls this an action. This is a fantastic feature for data exploration and analysis, especially for larger data sets.

Today’s post shows how to mimic Tableau’s mouse hovering or mouse selecting actions on an Excel dashboard. As always, the Excel workbook is provided for free download.

At the risk of boring you, I stayed with the Gapminder data I already used in the Tooltip posts (Customizable Tooltips on Excel Charts and User-defined Excel Chart Tooltips) for the following reasons:

- the data suits very well to demonstrate the technique
- staying with the same data makes today’s approach directly comparable to the posts providing customizable and user-defined tooltips
- last, but not least, it was easier and more convenient for me: I had the data available and saved labor to look for and preprocess a new data source

Anyway, this and the previous posts aren’t about about the data and what it is telling us. The articles are focusing on the technique and implementation of tooltips and actions on Excel charts and dashboards. The data itself is less important. Replacing it by your own data shouldn’t be a big deal.

Before we come to the interactive features, let’s have a look at the dashboard first:

It consists of two main parts: on the left side, you see the already known XY scatter chart, visualizing fertility rate and life expectancy of more than 200 countries in the world for one selected year (which is shown directly at bottom left of the plot area). The currently selected country is highlighted in red, all countries of the same region are colored in green and the development of the current country is visualized with two trails (before and after the selected year).

On the right side of the dashboard, various detailed information on the selected country are visualized: name, region, a flag icon, the exact numbers of the measures and two band charts show the development of the measures over time (more on the power of band charts here: An Underrated Chart Type: The Band Chart).

In a nutshell: the dashboard provides the overview of all data on the left side and the details for one selected data point on the right side. A powerful way to dynamically explore a large data set plotted on an XY scatter chart.

The main interactive feature is the selection of a country by hovering over a data point with the mouse cursor. This is along the same lines as in the tooltip examples (Customizable Tooltips on Excel Charts and User-defined Excel Chart Tooltips), but instead of showing a tooltip with additional information, the details of the selected country are visualized on the right side of the dashboard.

Please be advised that the chart has to be activated / selected to make the feature work (see also the red hint above the XY scatter chart). I added a line of code to the Selection_Change event of the worksheet, i.e. the chart should be automatically activated every time the user selects a cell on the sheet. However, if – for whatever reason – the chart is not selected, the user has to click on the chart first to make the mouse action feature available.

The hovering action is the most convenient, at least from my point of view. However, some users may prefer clicking on a data point to select it, especially in complex, calculation-intense models. To provide this option, the user can choose whether he wants to select by hovering or by clicking:

At the top right of the dashboard, you can pick a year with the slider and a country with the drop down. I assume, you will mainly use the mouse driven actions to select a country. However, if you are looking for one specific country, the drop down selection will definitely be helpful:

The three checkboxes above the XY scatter chart enable the user to easily decide what shall be displayed in the view: you can turn on and off the visualization of all countries, the highlighting of the region and decide whether or not the trails of the development over time shall be displayed:

Finally at bottom right, you can define which average lines shall be plotted in the band charts: the simple average (arithmetic mean) or an average weighted by population:

Providing a detailed step-by-step tutorial how to build this dashboard would go far beyond the scope a blog post. So, unfortunately I can’t provide that, but here are at least a few hints how the model is implemented:

- The heart of the solution (the mouse actions) is done by mouse event subs of the Chart Object. Microsoft Excel MVP Jon Peltier provides the necessary VBA code for free here: Get XY on any Chart. Jon’s code handles all mouse movement and mouse click events and I simply added one line writing the index of the current data point to the worksheet (named range: myCurrentCountry). All formulas in the model then refer to this number to fetch the detailed information of the selected country from the data worksheet and visualizes them in cell ranges and charts on the dashboard.
- All other interactive features are realized with form controls. I am sure you know or can figure out how those work.
- The XY Scatter is based on 5 different data series which are consolidated on the worksheet [Calculations]. The user-defined view (all countries, regions, trails) is implemented with the usual chart data source trick: an IF clause and an NA() function. Nothing new under the sun.
- A few lines of extra VBA code and formulas make sure only appropriate parts of the chart legend are displayed.
- The charts on the right side of the dashboard are band charts, more here: An Underrated Chart Type: The Band Chart.
- Last, but not least, the flag icon is a camera object aka linked picture. It refers to a named formula called myFlag which takes a snapshot of the flag of the current country using an INDEX function.

Agreed, a very rudimentary explanation of how this is done. However, the workbook has no password protection, neither the worksheets nor the VBA code. So, if you are interested in the details, please download the workbook and dissect it for yourself. If you have any questions, please drop me a line in the comment section below.

Here is the link to the workbook described above for free download:

It was again my friend Leonid Koyfman (Leonid's LinkedIn Profile), who was kind enough to thoroughly review my workbook before publishing. Leonid found and fixed a few bugs and, even more important, provided me some great additional ideas I haven’t thought of. Many thanks for your time, Leonid. I owe you, big time.

Many thanks go also again to Microsoft Excel MVP Jon Peltier for his ingenious code. I made a ton of hay with it, so thank you very much, Jon.

Stay tuned.

]]>The previous post S-Shaped Function in Tableau discussed and provided the implementation of a customizable, i.e. user-defined S-shaped function in Tableau Software:

In the introduction to this post I wrote: *“Having a ready to use S-shaped function in Tableau is a must-have for your Advanced Analytics Toolbox.”*

This applies to Microsoft Excel, too. Maybe even more.

There are quite a few different S-shaped functions in Excel available on the Internet. The first implementation I saw was a post by Juan C. Mendez back in 2007. Unfortunately Juan’s site is down now, but an archived version is still available here. Others built upon Juan’s approach, like Mathias over at Clear Lines Consulting and this site is still up.

Juan and Mathias used a different formula than I do (in Tableau and in Excel), but the result is pretty much the same. The only new feature in my implementation is the option to highlight the period of fast growth of the S-shaped function with a reference band.

So, nothing really new under the sun in this post, but - for the sake of completeness - here is my Microsoft Excel template of an S-shaped function for free download:

Download S-shaped Function (Microsoft Excel 2007-2016 workbook, 26K)

If you are interested in how to use the parameters to shape the curve, please refer to the previous post S-Shaped Function in Tableau.

More things to come soon.

Stay tuned.

]]>The speed of adoption of an innovation usually follows an S-shaped function. Such a function shows an initial exponential growth until the inflection point, followed by an exponential decay until it converges to an upper target value. In other words, a curve looking like an S-shaped pattern.

Typical examples for business measures following an S-shaped function are adoption rates of new products, market shares of new companies, the effects of new processes or new organizational practices and the like.

Providing a customizable (i.e. user-defined) S-shaped function in a Tableau workbook can be very helpful across all stages of advanced analytics: in descriptive and diagnostic analyses, the S-shaped function can be used as a customizable reference line for the actual historical development and for analyzing deviations. In predictive and prescriptive analytics, the S-shaped function can be used to model and forecast expected future developments and to perform sensitivity and what-if analyses.

Having a ready to use S-shaped function in Tableau is a must-have for your Advanced Analytics Toolbox.

Today’s post provides a Tableau workbook template with a fully flexible user-defined S-shaped function. The workbook is published on Tableau Public and also available for free download.

The only source data you need for an S-shaped function in Tableau is one date dimension. For my minimalistic template (see below), I used a one-column table with all dates from January 2017 to December 2021. If you want to implement an S-curve in your own Tableau workbook, simply use a date dimension of your database or blend in a list of dates similar to what I used.

An S-shaped function is defined by the date dimension and seven variables defining how the curve shall look like:

- [Period Start] defines the month when the S-shaped function starts, e.g. the launch date of the product or the market entry date of a company
- [Period End] defines the end of the period under consideration
- [Percentage Start] is the start percentage value. If you are modelling the launch of a product or company, this parameter will usually be set to 0%. The parameter and the formulas, though, also allow to start at a percentage greater than 0%
- [Percentage End] sets the final percentage value, i.e. the long-term value the S-curve will tend to
- [Start of fast growth] is the number of the month (after [Period Start]) when the value will begin to increase exponentially
- [Takeover Periods] defines the number of months the value will rapidly increase before it will slowly converge to the final value
- [Curviness Parameter] is a constant defining the slope of the curve. The higher the value, the steeper the gradient of the curve and vice versa

To give the user the option to customize the entire S-shaped curve, we need one parameter in Tableau for each of these variables, i.e. two date parameters, two float (percentage) parameters and three integer parameters.

The effect of the first four parameters ([Period Start], [Period End], [Percentage Start] and [Percentage End]) on the S-curve should be mostly self-explanatory, so let’s have a look at the other three: [Start of fast growth], [Takeover Periods] and [Curviness Parameter].

A picture says a thousand words or - as Albert Einstein said – “*Example isn't another way to teach, it is the only way to teach.*” So here are a few examples which hopefully help to understand how the parameters shape the function.

In all following examples, the period under consideration is set to January 2018 until December 2020 and the percentage value increases from 10% to 80%.

Example 1:

Let’s start with an arbitrarily chosen set of parameters. If we set [Start of fast growth] to 7, [Takeover Periods] to 18 and the [Curviness Parameter] to 500, we get the following S-curve:The rapid development starts in July 2018 (i.e. the 7th month after [Period Start], because [Start of fast growth] is set to 7) and lasts until December 2019 (i.e. for [Takeover Period] = 18 months).

Example 2:

Let’s increase [Start of fast growth] to 13 and decrease [Takeover Periods] to 12 and we get an S-curve where the rapid growth starts later but only lasts for one year:

With a further decrease of [Takeover Periods] down to 6, the S-curve looks like this:

So far, we only changed [Start of fast growth] and [Takeover Periods]. What about the [Curviness Parameter]? As mentioned above, the curviness defines the gradient of the function. Let’s get back to the first example and set the [Curviness Parameter] to a few selected values. Here is how the curviness affects the function:

Now you see it: the higher the curviness, the steeper the ascent of the function.

I hope these examples will give a first impression of how the curve can be manipulated by changing the parameters. To get a better feeling how this works, please play around with the parameters on the Tableau Public dashboards (see below) and see how the shape of the function changes.

Although it takes 7 parameters, we only need three Calculated Field for a user-defined S-shaped function:

**[Date Year and Month]**

The data I am using is on a daily basis (see section The Data above). If we want to enable the user to enter the parameters [Start of fast growth] and [Takeover Period] in months, we need a Calculated field to convert the daily data to monthly data with this formula:

DATETRUNC('month',[Date])

The Calculated Field simply truncates the day from each date, i.e. only year and month are left.

**[S-shaped calculated]**

This Calculated Field is the heart of the solution:

[Percentage Start]+

([Percentage End]-[Percentage Start])/

(1+[Curviness Parameter]^

(([Start of fast growth]+[Takeover Periods]/2-

(AVG(DATEDIFF('month',[Period Start], DATEADD('month',1,[Date Year and Month])))))/

[Takeover Periods]))

I won’t go into the detail of the mathematical background here. If you are interested, check out the Wikipedia page on the Logistic Function or the Wolfram article on the Sigmoid Function as a starting point.

Please note that the Calculated Field uses the truncated date. All days of the same month thereby get the same percentage value. You can also use [Date] from the data source instead of [Date Year and Month] in the formulas, but you would then have to enter [Start of fast growth] and [Takeover Period] in days instead of months. I found that a bit cumbersome, that’s why I went with the monthly data.

**[Valid Date]**

[Valid Date] is finally a simple Boolean Calculated Field used on the filter shelf to restrict the visualized data to the defined time period.

[Date Year and Month]>=[Period Start] AND [Date Year and Month]<=[Period End]

Now, enough storyboarding. Let’s shoot something: here is the workbook with an S-shaped function on Tableau Public:

The workbook contains four variations of the S-shaped function (please note the four tabs at the top of the visualization):

- The first dashboard uses the Calculated Field described above on the Rows Shelf and YEAR ([Date]) and MONTH([Date]) on the Columns Shelf
- The second dashboard uses only MONTH([Date]) on the Columns Shelf, converted to continuous. This way, you get to a solid line of the function without the gaps between the panes (years) as in the first dashboard
- Dashboard 3 adds a Reference Band to visualize the takeover period. Two additional Calculated Fields are necessary for this, which are used as “Band From” and “Band To” in the Reference Line definition dialogue:

- [Start Fast Growth]

DATEADD('month',[Start of fast growth]-1,[Period Start]) - [End Fast Growth ]

DATEADD('month', [Takeover Periods]+[Start of fast growth]-2,[Period Start]

- [Start Fast Growth]
- Finally, dashboard 4 is a completely different approach (leading to the same results): instead of the Calculated Fields described above, two alternative calculations are used, which are based on a RANK calculation. I won’t go into the details here. If you are interested, please download the workbook and have a look for yourself

Stay tuned.

]]>In the past few months (on and off), I have been working on a blog post providing 50+ time formulas in Tableau, similar to the article String Calculations in Tableau published in 2014.

As you can imagine, developing and especially describing 50 Calculated Fields is a lot of work and that’s why this is still in progress. But I am close.

Anyway, during the development I had the idea of providing Calculated Fields for the time of sunrise and sunset in any location at a selected day. It turned out that this is way more complicated than it may sound and I couldn’t get my head around the math. Luckily, I finally stumbled across a brilliant Excel workbook provided by the Earth System Research Laboratory with all necessary calculations and formulas. Transferring the Excel formulas to Tableau Calculated Fields was only a diligent, but routine piece of work.

Today’s short post provides a Tableau Public workbook visualizing sunrise, sunset, solar noon and daylight hours for the ~ 8,800 largest cities in the world at any given day. If you need the Calculated Fields for your own workbook, you can download the workbook from the Tableau Public dashboard published below.

The underlying data source is a simple table of all cities in the world with more than 50,000 inhabitants (8,819 cities). The database includes the name of the city, the country, latitude, longitude, population and finally the time zone respectively the time difference to GMT in hours:

Data source: geonames.org

I won’t go into the details here how the math is working. Mainly because I do not understand it myself. I can see that a lot of trigonometric functions are used in combination with some cryptic constant numbers, but - truth be told - I do not have a clue how all these calculations work in detail. As mentioned in the introduction, my tiny contribution to this is simply the transfer from Excel to Tableau. If you are interested in the details, please refer to the Earth System Research Laboratory website.

The calculation flow uses 15 (!) intermediate formulas to come to the main results. For your and my convenience, I named these intermediate formulas [SC_01_Julian_Day] to [SC_15_HA_Sunrise_(deg)] and the main results (the ones used in the visualizations) [Sunrise], [Sunset], [Solar Noon] and [Daylight Hours]. This makes it easier to separate the intermediate calculations from the more important results.

According to the Earth System Research Laboratory, the calculations are not 100% accurate:

*“The calculations in the NOAA Sunrise/Sunset and Solar Position Calculators are based on equations from Astronomical Algorithms, by Jean Meeus. The sunrise and sunset results are theoretically accurate to within a minute for locations between +/- 72° latitude, and within 10 minutes outside of those latitudes. However, due to variations in atmospheric composition, temperature, pressure and conditions, observed values may vary from calculations.”*

Furthermore, there is a limitation regarding the selected day:

*“Please note that calculations […] are only valid for dates between 1901 and 2099, due to an approximation used in the Julian Day calculation.”*

So, the results are not 100% accurate (error range 1 to 10 minutes) and only valid for the 20th and 21st century, but this is good enough in my book.

Without further ado, here is the small Tableau workbook providing sunrise, solar noon, sunset and daylight hours on a user-selected day for all cities in the world with more than 50,000 inhabitants:

The interactive features of this dashboard:

- pick a date at the top
- if desired, filter the list of cities with the at-least-population slider
- hover over the map to see the details in the tooltips
- click on the map to select a city and filter the table
- click on a data row in the table to highlight the city on the map

As already mentioned in the introduction: if you are interested in the Calculated Fields, please download the workbook from the Tableau Public visualization above and have a look.

Please be advised that the solar calculations do not reflect potential daylight saving times (DST). It goes without saying that it is possible to calculate start and end date of a certain DST rule in a given year with Calculated Fields in Tableau. I will demonstrate how this works in the Time Calculations Collection I mentioned in the introduction.

However, for the data set up used in this article, considering DST is a real challenge for two reasons:

- DST are not everywhere consistent across a country. E.g. in the US, in Australia and in Brazil, DST are only observed in some states/regions. Thus, you can’t deduce the DST rule from the country, but rather need the information if a DST is observed in the specific city for the entire list of cities used above. I did not have this data available.
- Even more complex is the time dimension: since the workbook allows to pick any given date between 1901 and 2099, you would need the information about which country observed DST when. For instance, Netherlands observed DST from 1916 to 1945 and since 1977, whereas Germany observed DST 1916-1918, 1940-1949, and since 1980.

In a nutshell: you would need much more data (and a couple of additional Calculated Fields) to correctly reflect DST. That’s why potential DST are not included in the results of the Tableau Public workbook above.

Finally, I am not sure if there is a use case for solar calculations in Tableau, especially in a corporate environment, but you never know…

Stay tuned.

]]>