*869 words, ~4 minutes read*

Terms like Artificial Intelligence, Machine Learning, Deep Learning and (Artificial) Neural Networks are all over the place nowadays.

If you are reading Tech News, Data Science blogs or your LinkedIn feed, it will be little short of a miracle, if you don’t see one of those expressions at least once.

This is just the revival of those techniques, though. Neural Networks, for one, have been around for many years. In the mid 1990s (!), I did some research and wrote my thesis about Artificial Neural Networks. We even had a blog post here on this topic 10 years ago: Where the rubber meets the road. For whatever reason, that article didn’t make many friends. I was always wondering why. Probably too academic and not visually appealing enough.

Now, with the recent revival of Artificial Intelligence and Neural Networks, I decided to give it another shot. Today’s post provides an updated, improved version of a Neural Network solving Travelling Salesman Problems in Microsoft Excel.

You always wanted to watch a Neural Network solving an optimization problem? If so, this article is for you. Either watch one of three videos provided in the post or download the Excel workbook and play around with it at your own speed. No add-in or third party software necessary. All you have to do is to enable macros.

The Neural Network provided today is exactly the same as in the original post (Where the rubber meets the road). It is based on the article “an analogue approach to the travelling salesman problem using an elastic net method”, by Richard Durbin and David Willshaw published in Nature back in 1987.

Without going into the details of the mathematics, here is the basic idea of this approach:

Many other algorithms for solving TSPs start with one feasible solution and try to switch to better ones from iteration to iteration (i.e. shorter total tour length). The approach of Durbin and Willshaw is different: they don’t start with a feasible solution. They start with an elastic net (or adaptive ring), i.e. a circle of nodes where the number of nodes is a multiple of the number of cities of the TSP. This circle is gradually deformed until it finally passes through all cities and describes a feasible and short tour.

You can think of this ring as if it would be a rubber band. In each iteration, the Neural Network randomly selects a city, finds the nearest node on the ring and pulls the node and its neighbours towards this city. If you visualize this behaviour of the algorithm, it looks as if this rubber band is deformed and elongated until a feasible solution is found. The tension of the rubber band guarantees that dragging the rubber band complies with the objective function of the TSP, i.e. to minimize the total tour length.

As already stated above, the used algorithm is exactly the same as in the original post. However, I made quite a few changes in the workbook to make the handling of the model easier and the views more interesting:

The most obvious change is the map. Instead of showing generic points in the Euclidean space, today’s workbook solves Travelling Salesman Problems through cities of the United States and provides the geospatial context with a US map in the background.

Furthermore, I simplified the dashboard:

Only the most important user-selections are still available on the dashboard and only a selection of results is shown.

With that said, I did not only simplify, I also added a few helpful features:

- With the play buttons, you can start, pause, resume and stop the algorithm at any time
- With the two checkboxes, you can decide whether or not all cities of the US shall be shown on the map (i.e. not only the ones of the TSP) and whether or not the names of the TSP cities shall be displayed

At the bottom right of the dashboard, four charts visualize some key metrics of the algorithm during execution:

- a column chart shows the distribution of the cities, i.e. how often the cities are randomly selected and processed during the algorithm
- the three line charts visualize the development of the length of the adaptive ring, the decreasing learn rate and the size of the radius, i.e. the decreasing number of nodes being moved on the ring

Now, let’s see this in action:

Here is a 10 cities TSP. The algorithm is slowed down intentionally by waiting 200 milliseconds after each step to better understand how the Neural Network works:

The next video shows how a 20 cities TSP is solved. No waiting time, but still updating the screen after each iteration:

Finally, a 100 cities TSP. The views on the dashboard are updated every 50^{th} iteration:

Not only interested in watching the videos, but also in playing around with the model or even dissecting the code? Here you go:

Download Neural Network Solving TSP (zipped Excel workbook, 663K)

Stay tuned.

]]>*959 words, ~5 minutes read*

*Update on Monday, August 3, 2020: The described issue is solved, thanks to Andy Pope. Please refer to the update at the end of the article or to Andy's message in the comment section. Many thanks, Andy.*

The recent articles Dynamic Storyboards in Excel and A practical Example for Dynamic Storyboards described a way how to plot several data series in one XY Scatter Chart by offsetting the data points.

During the implementation of the workbooks coming with these articles, I stumbled across a weird Excel oddity:

If you are trying to transfer the offset calculations into Named Formulas and use these Named Formulas as the source of you chart series, Excel does not always plot the data correctly.

Please be advised that today’s post is only a description of the issue. I have no clue why this is happening and therefore I do not have an explanation of what is going on, let alone a solution.

I have a question for you, though: have you ever encountered this problem? And if so, do you know what the root cause might be and how to overcome the problem? If so, please let me know in the comment section of the post.

In order to facilitate the explanations in this post, I will use a pointless, but more illustrative and easy-to-follow data set: I created coordinates to show the digits 1 to 9 when plotted in an XY Scatter Chart:

Each digit has coordinates between 1 and 45 on the horizontal and the vertical axis.

If you plot the data as it is, all 9 digits will be displayed in the same range sitting on top of each other. But this is not what we want. What we want to see is a display showing 1 at top left and 9 at bottom right, like this:

The challenge is exactly the same as in Dynamic Storyboards in Excel.

To get there, we need to add appropriate offsets to the coordinates according to the image number. It is a pretty simple calculation: for image 1, for instance, there is no offset added to the horizontal coordinate [X], but an offset of 90 is added to the vertical coordinate [Y]. Image 3 gets a horizontal offset of 90 and a vertical offset of 90. image 7 gets no offset, neither horizontally nor vertically. You get the idea.

First approach is to add calculated columns to the data table like this:

Let’s start with the usual approach: directly link the data series to the calculated columns in the table:

This works as expected and we get this chart:

The offset calculations work and the digits are displayed in correct order from top left to bottom right.

Now, since we need the calculations for this chart only, we might want to get rid of the calculated columns in the data table by replacing them with Named Formulas.

Let’s start with simple Named Formulas, myXPlotSimple and myYPlotSimple, to replace the columns [XPlot] and [YPlot] in the data table.

Here are the definitions in the Name Manager:

Just like in the calculated columns of the table, the values in column [XOffset] are added to the values in column [X] and [YOffset] is added to [Y].

Next step is to replace the data ranges in the data source of the chart series:

Next, we also transfer the calculated columns [X Offset] and [Y Offset] into Named Formulas called myXPlotFull and myYPlotFull as shown here:

The formula refers to the value in the column [X] of the data and adds the result of exactly the same INDEX-function we used in the calculated column [X Offset] of the table. The results of the formula should be equal to the results in column [X Plot]. We do the same for myYPlotFull.

Once again, we use these Named Ranges as the source of the chart data series:

Surprisingly enough, this does not work anymore:

All data points are plotted, but somehow all calculated offsets seem to be not working correctly and all 9 digits are shown at top left of the chart stacked on top of each other.

The interesting thing about this oddity is the fact that all images get the offset of the first image (XOffset = 0, YOffset = 90). If you sort the data table by [Image] descending (i.e. image 9 is the first image in the data table), all points are plotted at bottom right of the table.

So, what is the problem here? To find out, I inserted a check sheet:

On this simple worksheet, the results calculated in the data table ([XPlot] and [YPlot]) are compared to the multi-cell array-entered Named Formulas myXPlot Full and my myYPlotFull.

As you can see in the check columns right, there is no difference, i.e. the Named Formulas return exactly the same results as the calculations in the table.

If we use the ranges where we just array-entered the Named Formulas as the source of the chart series, the chart works perfectly fine. But if you use the Named Formulas, it doesn't.

Excel calculates the values accurately, but it does not plot them correctly.

I am not sure if anyone will be interested, but just in case, here is the workbook for free download:

Download Named Formula Data Source Oddity (zipped Excel workbook, 731K)

Any ideas why this is happening? Have you encountered the same or a similar problem before? Or did I miss something and there is an error in my approach?

Let me know what you think in the comment section below.

Stay tuned.

I published a link to this post on LinkedIn and reached out to the real experts on Excel charts. Wouldn’t you know it. The brilliant Andy Pope provided a perfect explanation and solution within a couple of minutes (see also the comment section):

*"The INDEX function does not return an array, so only the single values for XOFFSET (Calculation!E13) and YOFFSET (Calculation!F13) are used. If you change those values all the plots will have the relevant XY start point. To get the INDEX to return an array use this trick:*

*myXPlotFull: =tab_data[X]+INDEX(Calculation!$E$13:$E$21,N(IF({1},tab_data[Image])))*

*myYPlotFull: =tab_data[Y]+INDEX(Calculation!$F$13:$F$21,N(IF({1},tab_data[Image])))"*

Many thanks Andy, this is very impressive.

]]>*297 words, ~1.5 minutes read*

The previous article showed an an Alternative to the Bar Chart Race in Microsoft Excel.

As an example, the workbook visualized the results of the English Premier League of this season (2019/2020) in an animated chart.

A few people apparently liked the visualization. I received a couple of emails and LinkedIn messages asking, if I could also provide the same workbook for other football leagues.

No sweat.

Here are the according Excel workbooks for Italy’s Serie A, Spain’s La Liga and the German Bundesliga:

Download Animated Serie A 2019/2020 (zipped Excel workbook, 1.4MB)

Download Animated La Liga 2019/2020 (zipped Excel workbook, 2.6MB)

Download Animated Bundesliga 2019/2020 (zipped Excel workbook, 1.6MB)

And for the sake of completeness, here is the link to the Premier League version again:

Download Animated Premier League 2019/2020 (zipped Excel workbook, 2.8MB)

Please note that

- Except for the Bundesliga, all leagues are still ongoing. The data in the workbook contains only the results up to July 9, 2020. If you want to have a full view of the season, you will have to complete the results on the worksheet [Data] during the upcoming weeks. The fixtures of the remaining matchdays are there already, but the results are missing
- The workbook for Italy’s Serie A contains an inaccuracy. Just like in the other three leagues, teams tied in points are ranked by Goal Difference first and Goals scored second. This is not correct for Italy: in Serie A, the tie breaker are the matches of these two teams (head-to-head records). Thus, the final table shown in the workbook might not be the official / correct one

Stay tuned.

]]>*1,276 words, ~ 6.5 minutes read*

The other day, I watched an as interesting as entertaining dispute between Andy Kirk and Andy Cotgreave about the popular Bar Chart Races: Data Viz Debate: Is the bar chart race more than just a gimmick?

Although I truly admire Andy Kirk’s work, I am with Andy Cotgreave, here. I wouldn’t go so far to call them the “fidget spinners of data visualization”, but I find them hard to follow, especially when the speed of the animation is too fast.

Last week, Philip Treacy of MyOnlineTrainingHub published a post and a workbook, how to do a Bar Chart Race in Microsoft Excel. Philip visualized the 2019/2020 Premier League results in an animated Bar Chart. I am a big fan of Mynda and Philip’s MyOnlineTrainingHub, which is definitely one of the best Excel websites worldwide.

This made me think. Although I love Philip’s work, I am not a fan of the Bar Chart Race. So, how would I approach this? Today’s post provides my solution of an alternative for animated Bar Charts. As always, the post comes with the Excel workbook for free download.

Here is the initial spark for today’s article: Philip Treacy’s classic example of a Bar Chart Race in Microsoft Excel:

Philip’s approach is very smart: a Pivot Table to aggregate the data of the current week, Excel 365’s SORT function to sort the table, re-colouring the bars, assigning the team logos to the data labels and managing the animation by VBA.

Great work. Kudos, Philip.

That being said, it is a Bar Chart Race and, as mentioned in the introduction, I am not the biggest fan of Bar Chart Races. So, here is my suggestion, how to visualize this data with an animation.

Have a look at the result first:

More views and probably not as easy to understand and digest as Philip’s straight forward solution.

Let me explain the changes and enhancements I made and why I made them.

Unlike Philip, I am not using a bar chart, but an XY Scatter Chart to animate the race. The main difference, however, is the fact that the teams are not sorted descending by points. Instead, they are shown in alphabetical order. This avoids the “jumping up and down” of the teams during the animation and – in my humble opinion - makes the chart less busy and the race itself easier to follow:

It is actually not a new idea. I have used this already in Tableau visualizations of football (for my readers in the US: soccer) data:

Putting data into context is extremely important. It helps the audience to better understand the data and to gain more insights.

Firstly, I added the information about which teams would be qualified for the UEFA Champions League (the top four) and which would be relegated (the bottom three). In the race chart, this is done in two ways: the background color of the axis labels (the team names) and two vertical lines in the plot area. Red highlights the teams currently in relegation, green shows the ones qualified for the Champions League (see the screenshot above).

Next, additional views are added right to the race chart. Nothing to write home about, just the usual suspects: the table and the fixtures with the results.

During the animation, you will probably not gain much insights from the table and definitely nothing from the fixtures and results.

That being said, these views are extremely helpful when you stop/pause the animation (and you can do this whenever you want). After stopping the animation, these views provide all the details you may be interested in for the currently selected match week. Also, with the table, you do not need the descending sort which makes the Bar Chart Race so busy.

If you are looking at sports results, it is kind of likely that you are interested in one team (your favourite team). My approach takes this idea into account. You can select one team and it will be highlighted across all views with a light blue background color.

Selecting a team is possible in different ways: you can use the drop down at top right, click into the axis labels of the race chart, click on any given row of the table or on either a home or an away team in the results section.

Please be advised that you can even do this while the animation is running.

Using the play buttons at top left of the dashboard, you can start and stop the animation, go forward and backwards (delayed or fast) or jump to the start or the end of the season.

With the loop checkbox, you can decide whether the animation shall restart at week 1 after reaching the last match week and with “Delay in milliseconds” you have control how fast (or slow) the animation shall be.

Finally, you can select any given match week by clicking on one of the rectangles above the race chart. This will stop the animation, but you can restart it from there in both directions using the play buttons.

I have to admit: nothing really new under the sun. Just a new combination of existing tricks and techniques. I won’t got into the details of the implementation, because everything has already been used and explained in previous posts here.

If you want to understand how this is done, please download the workbook and have a look for yourself. If you have any questions, please drop me a line in the comment section.

I see two main weaknesses of this solution:

- The workbook uses a lot of different techniques and tricks and some VBA code. Quite some manual work is necessary to set it up, e.g. the assignment of the team logos to the data points, the rectangles to select a match week, the linked images in the table, etc. Agreed, it is a lot of work, but I think the result is worth the effort
- Excel needs some time to update a linked image and the screen. You would usually not notice this, but if you do the animation in fast forward or fast backward, the VBA code is faster than Excel. Thus, there is a delay in the updates of the team logos in the table, i.e. during the animation some team logos seem to be wrong. This is not a bug, it is just due to the delayed update of the linked images. Not perfect, but I think it is a minor issue. If you don’t agree, you can easily delete the team logos in the table, since they are only a nifty little extra feature which is not really necessary

If you are interested in having a look, here is the Excel workbook for free download:

Download Animated Premier League 2019/2020 (zipped Excel workbook, 2.8MB)

Please note that this post is published on July 7, 2020, i.e. the data contains only the results up to the 33rd match week. The remaining fixtures are already included, but the results are missing. If you want to bring this home for the entire season, you would have to update the results of the upcoming next weeks.

Please let me know what you think of this alternative to a Bar Chart Race.

And also, please stay tuned.

]]>*717 words, ~3.5 minutes read*

The previous post Dynamic Storyboards in Excel showed how to create small multiples (or panel or trellis charts) for 2-dimensional data in a sequence and let the user decide dynamically, how many small multiples shall be displayed. To describe the technique, the article used a kind of funny, but pointless data set: a part of an episode of the Italian cartoon series La Linea.

Well, creating a dynamically sized comic strip in Excel is a funny little example. But what’s the point here?

Here's the point: a Dynamic Storyboard may well be worth considering for a more serious dashboard, too. For instance, a Dynamic Storyboard may support a chart which displays the data for the periods one by one in an animation. The Storyboard can be very useful by showing the data in the context of a user-defined number of years before and after the current period. This can help to better understand the development of the data over time or to discover patterns.

Today’s post will show how a Dynamic Storyboard may support an animated chart using the example of the Gapminder Replica in Microsoft Excel, published here back in 2014. As usual, the article comes with the workbook for free download.

Back in 2014, I published a Gapminder Replica in Microsoft Excel. The data and the dashboard are a great example to show how a Dynamic Storyboard may support the animated chart.

The enhanced version of the workbook uses the same technique for creating the dynamic storyboard as described in the previous post.

There are only 2 differences:

- The start year (top left of the storyboard) is not manually selected by the user, but controlled by the animation
- The currently selected year, i.e. the one shown in the large Bubble Chart at top left, is highlighted with a red fill color of the bubbles in the storyboard.

There are only a few minor changes necessary. The basic technique and formulas for creating the storyboard are exactly as described and used in the previous post.

Here is what you get:

If you haven't seen the Gapminder Replica in Microsoft Excel before, I recommend to check out the other functionalities of this dashboard:

- With the play buttons at top right, you can start and stop the animation, move forward or backwards (delayed or fast), go to the first or last year. You can also choose whether you want the animation to loop with the checkbox, i.e. start at the beginning after reaching the last year in the data
- The rectangles above the chart visualize the time line and the current year (dark rectangle) and let you select one specific year by simply clicking on any given rectangle. This will stop the animation, but you can then start it again from this selected year
- The highlight section beneath the play buttons lets you define 3 countries to be highlighted in different colors and you can use the “Show highlighted countries only” checkbox to only display these 3 countries in the chart and the storyboard
- You can also decide with the checkboxes right to the countries whether or not a trail of the development over the years shall be visualized (in the main chart, not in the storyboard, of course)
- The table right to the drop down lists shows the measures of the highlighted countries and visualizes the maximum value of fertility rate and life expectancy with a blue fill color
- In the “Filter by region” section you can select which continents shall be displayed (in the main chart and the storyboard)
- Like in the highlight section, the table shows the metrics (weighted averages of fertility rate and life expectancy, sum of population). Again, the maximum values are highlighted by a blue fill color
- Finally, you can choose the size of the storyboard (count of rows and columns) using the spinner right above the view

If you are interested, here is the workbook for free download:

Download Gapminder Replica w/ Storyboard (zipped Microsoft Excel Workbook, 1.8MB)

Stay tuned.

]]>