Peak time for sports and leisure #dataviz. About time for a joyplot; might do a write-up on them. #rstats code at https://t.co/Q2AgW068Wa pic.twitter.com/SVT6pkB2hB

— Henrik Lindberg (@hnrklndbrg) July 8, 2017

First let me share the final outcome.

Here is the final overlapped area chart with a bit of formatting thrown in. It is a pretty close imitation of Henrik’s original chart. Click on it to enlarge.

As you can guess, the chart is a ** just an overlapped area chart** (ie each area sits behind another, unlike

Let’s start with a look at data. Henrik’s original data has 10,656 rows, each row containing activity name, time and p value – how much survey respondents enjoyed [@activity] at that time.

Here is a snapshot of first few rows.

As you can see, while this format is excellent for storing, it is very tedious if we want to make one chart with all series. So let’s scrub.

- We need to figure out if an activity should be included or not. I am using the same criteria as Henrik’s. Exclude activities with p value less than 0.003 or activity title “Playing sports n.e.c. *” (not elsewhere classified)
- To do this, we first pivot the data on activity and max(p). Then filter this pivot two ways – max(p) >=0.003 and label not equal
*Playing sports n.e.c. **

Tip: You may need to enable multiple filters per field in the field settings of row labels. - We will end up with
*28 activities*. - Then add a helper column to original table that looks up the pivot and tells if an activity should be included or not

- To do this, we first pivot the data on activity and max(p). Then filter this pivot two ways – max(p) >=0.003 and label not equal
- Add two more columns to original table to tell peak time and modified time. This will help us in rearranging and sorting the data. Modified time just moves time by 3 hours (Henrik’s chart is plotted from 3AM to 3AM). At this stage our data looks like this:

- Now, pivot the data once again. This time,
- exclude activities by using report filter on include? column.
- Set up peak and activity in row labels area, modified time in column labels area and p in values area.
- Arrange the report in tabular format, turn off sub-totals.
- We get this:

- Calculate normalized values by dividing each p value with maximum p value for that activity. We can use another range of 28×288 cells to do this. We get this:

It will become clear once you look at the charts.*The next 2 steps may seem confusing.*- Define an offset value. Start with 0.5. You can change this later. In a separate 28×288 cell range, calculate gaps by multiplying offset with position of an activity. Something like this:

- Now, finally calculate activity + gap values by adding up respective cells in each of the 28×288 ranges. We get this:

**At this stage, our data is a shape ready for visualizing.**

The chart creation process has 5 steps.

- Select the 28×288 range of cells created in step 7 and insert an overlapped area chart.
- Now, copy the gaps range (created in step 6 above) and paste them on to area chart as new series (just ctrl+c your data and select the chart, press ctrl+v)
- Adjust the order of series so that each activity is sandwiched by appropriately named gap series
- Tip: adjusting 56 series is painful with the chart select data > move series up/down buttons. Instead, just select the series, look at formula bar. The SERIES formula has last parameter as order. Change this number. It is easy to figure out the number once you try a few.

- Change all gap series fill color to white. This instantly creates the floating area chart effect.
- Change the colors of activity series. Apply white / off-white border to these series. Your joyplot is ready.

**Quick overview of the chart creation process:**

Let’s examine the result of each those 5 steps with a smaller dataset so you can see how everything fits together. Here is the data for this example:

- Create an overlapped area chart with activity+gaps data. We get this:

- Add gaps as new series to chart. You get this:

- Move the gap series so that they sandwich activity series. Use Chart Data > Move series up/down buttons or SERIES formula

- Apply white color fill formatting for gap series. This creates floating area chart effect as below:

- Finally, format the chart by apply some colors and border formatting etc.

So there you go. The final outcome does look joyful.

While joyplot is awesome, it is not easy to make. Fortunately, there are a few simpler alternatives that we can whip up in Excel as soon as you have either the pivot or normalized values. Below I have shown two such examples. Read about sparklines or conditional formatting heatmaps for more.

**Joyplot alternative – using sparklines:**

Tip: to get axis on your sparkline, just type the times separated by a single space. Then go to format cell (ctrl+1) and set horizontal alignment to distributed. Viola, Excel will fill the cell by adjusting spaces.

**Joyplot alternative – Conditional Formatting Heatmap **

**Click here to download Joyplot Excel workbook**. Examine the data scrubbing formulas, pivot and chart settings to learn how this is created.

If you are familiar with R, then go thru Henrik’s R code. It is much shorter than the Excel gymnastics we did with circular pivot table referencing. That said, some of the data re-arrangement could be done with same ease in Power Query too.

The only step we missed in Excel implementation is moving average smoothing of the area charts. It can be easily added as a step between 3 and 4 in data stage.

How do you like Joyplot? Would you create something like this for your business / personal data? Share your stories and thoughts in the comments section.

**More joy for you…**

If you love this, you are going to enjoy these charts too.

]]>MR06 was using Goal Seek to change the equity input so that the IRR achieved 25% value. But he wanted to do that for a number of Year 3 price changes in the model.

He could obviously change the Year 3 value manually, then run a Goal Seek and write down the Solution and repeat for the range of Year 3 prices changes.

Or he could have setup a macro to do the same and tabulate the results.

Eventually I provided a solution which uses a Data Table and some formulas to determine these values

This post will examine the technique used to solve the problem for MR06

This is MR06’s model. It is a simple financial model of an investment in some units over a 6 1/2 year period.

The problem is that MR06 wants to know what Equity value **B10** is required to achieve an Internal Rate of Return (IRR) of 25% subject to varying the Year 3 Price, cell **G6**.

This problem can be solved using Goal Seek

Goto the **Data, What-If-Analysis, Goal Seek** Tab

Set the dialogs in the Goal Seek to the values shown above and press Ok

Excel sets the value in cell B12 to the value of $441,910 and shows an IRR of 25.1%. This small discrepancy will be discussed later.

MR06 can now manually repeat that changing the values of G6 between each iteration. Except that MR06 has many many models and wants to do that for variations in the Year 3 Price from $8 to $15.

An alternative method to solve this is to use a Data Table to feed in the starting values for both the Year 3 Price **G6**, as well as a list of Values for the Equity Contribution **B12.**

The Data Table will then retrieve a list of the IRR’s for the various combinations.

Then we can then setup a formula to calculate the Equity contribution required to achieve the target IRR value of 25%

You can follow along with a copy of the worked example: Download Example File Here

Setup a list of required Prices for Year 3 in a Row D15:K15 and a list of possible Equities in a Column C16:C25

I have colored the cells to simplify which cells are linked to which parts of the Data Table.

In the Top Left cell C15 link that to the Target IRR cell using =B12

Now select the whole area **C15:K25** and goto the **Data, What-If-Analysis, Data Table** Tab

Complete the Data Table dialog

Row Input cell: $G$6

Row Input cell: $B$10

Press **Ok**

Excel will now fill in the Data Table area with the IRR’s from the combination of Each Row (Equity) and Column (Year 3 Price)

At this point we now have a table of IRR’s from the combination of Each Row (Equity) and Column (Year 3 Price), but we can see that we don’t have an actual solution, only approximate solutions for each Year 3 Price.

Our goal was to find the Equity value which returned a 25% IRR

We can use the Excel **Forecast()** function to interpolate the exact answers based on the data in the Data Table

Forecast allows us to forecast a Y value based on an X value from a Table of Known X and Y values

You can read the excel help about forecast

We can see from the help that Forecast uses the syntax:

=FORECAST(x, known_y’s, known_x’s)

In our example we are seeking a known value of the IRR of 25%

We have a List of Known X’s which is the IRR’s in the Data Table and the list of known Y’s is the corresponding list of Equities in Column C

To test this in cell **D27** type the following: =FORECAST(25%,$C16:$C25,D16:D25) and press **Enter**

Excel will return **$444,481.25**

This is the equity required to return 25% IRR based on a Year 3 Price of $8.00

You can copy Cell D27 across and you will now have a Table of all the equities required for all Year 3 Prices.

A neater solution is to also allow the IRR to be set as a variable. This way it can be changed and the new equities automatically shown for each Year 3 Price change

In Cell **D29** type 25%

Then setup a Row of Year 3 Prices in cell **D31** type: =D15

Copy D31 across to K1

Then in **D32** type: =FORECAST($D$29,$C16:$C25,D16:D25)

Copy it across to K32

You now have a table of Year 3 Prices and the required Equities to achieve a 25% IRR

You can simply change the Required IRR in Cell **D29** eg 20%

And you will have a Table of the required Equities to achieve a 20% IRR

The good thing about having this as a separate table is that it can easily be graphed

Select the Table and got the **Insert, Chart** Tab

Select an appropriate chart type

In this example the model appears to be very linear, this means there is a direct or straight line relationship between the Equity and the IRR for a given Year 3 Price

If we choose a value of equity of say 30% we can see that we need an equity of $725,020 for a Year 3 Price of 8.00

It is important to note that the value of $725,020 is outside the range of the Equity values we set in Column C

In this case we should check that this is still a valid Forecast

To do this reset the Equity values in Column C to that shown below, ensuring that the equity values extend both below and above the expected equity value we saw of $725,020

We can now see that the Equity value required is $649,577.

Those who are observant will notice that the Equity value of 649,577 should have an IRR of slightly less than 29.61%, not 30%

This is because although the model appears linear, it is in fact not exactly linear.

The model’s non-linearity occurs for a number of reasons but primarily is that XIRR is not an exact function. It is an Iterative Function and to quote from Microsoft Help “*Excel uses an iterative technique for calculating XIRR. Using a changing rate (starting with guess), XIRR cycles through the calculation until the result is accurate within 0.000001 percent.*”

For all intention purposes it is close enough to linear that we don’t notice the differences.

If it was perfectly linear the extrapolation to 30% which was outside the original Equity range would have given us the same result of $649,577 not the $725,020 that it returned. Non-linearity is magnified once the extrapolation used in the Forecast Function is outside the range of the known Y values and we need to always check for this.

This non-linearity causes small discrepancies in the Excel functions like IRR and Forecast, of which we are using both.

Forecast uses a least squares approximation to best estimate a line of best fit, but it is just that, an approximation.

If you are wondering how close the model is to being linear, it is 99.77% fit to being linear

This is calculated using the r value or Pearson() function

D50: =PEARSON(C16:C25,D16:D25)

=0.997747518

That is the closer the value is to 1 the more linear it is.

In Excel versions 2016+ Microsoft has added a new set of functions including: FORECAST.LINEAR()

Forecast.Linear and the other new Statistical Functions “*use advanced machine learning algorithms, such as Exponential Triple Smoothing (ETS)*”

You can read all about them here: Advanced Forecast Functions in Excel 2016+

It is recommended that they are used in model from using Excel 2016 onwards.

I hope you have enjoyed this discussion on the use of Data Tables and the Forecast() Function to replicate multiple Goal Seek commands.

For more information on Data Tables I refer you to my post: Data Tables & Monte Carlo Simulations in Excel – A Comprehensive Guide

Many thanx to MR02 for permission to use his model for this post.

]]>

Let’s say you have a time value (either user input or calculated) in cell A1.

**Nearest second**: =TIME(HOUR(A1), MINUTE(A1), SECOND(A1)).

- SECOND formula rounds up any fractions and returns full seconds.

**Nearest 15 seconds: **=TIME(HOUR(A1), MINUTE(A1), MROUND(SECOND(A1),15))

- Use MROUND() to round up seconds values to nearest multiple of 15 (or whatever else)

**Nearest Minute: =**TIME(HOUR(A1), MINUTE(A1)+(SECOND(A1)>30),0)

- The seconds value will always be zero. We just look at fractional minutes portion to see if they are more then 30 to round up to next minute. The trick is to add up Boolean check (SECOND(A1)>30) to minutes value.

**Nearest 15 minutes:** =TIME(HOUR(A1), MROUND(MINUTE(A1)+SECOND(A1)/60,15),0)

- This one uses MROUND to round total mins (including fraction) to nearest multiple of 15.

**Nearest 37th minute: **=TIME(HOUR(A1), MROUND(MINUTE(A1)+SECOND(A1)/60,37),0)

- Same logic. Just to show you how to round to an arbitrary minute.

**Nearest hour:** =TIME(HOUR(A1) +((MINUTE(A1)+SECOND(A1)/60)>30),0,0)

- Check if total minutes is greater than 30 and add the result to hours.

Let’s test your timing skills. Assuming A1 has date & time value (like 26-Jun-2017 7:21:32 AM), round it **up **to nearest working hour.

- The working hours are 9AM to 6PM on weekdays (Monday – Friday)

Post your answers in the comments section. Tick tock, tick tock… time is ticking, post your answers.

*Always having a hard time working with times in Excel?* Its high time you took some time to learn about Excel time.

- Working with date & time values in Excel – a quick intro
- Convert fractional time to hours & minutes
- Highlighting over due items
- 42 tips for Excel time travelers – calculating past, present and future time values using formulas
- Sorting by birthday
- More date & time tips

- Come up with ingenious solutions
- See if a simpler
*cheat*solution is possible - Sit back and ignore

For most problems, I choose 3rd reaction. Occasionally, I rely on 2nd option and very rarely the first one.

When faced with a tricky time sheet summary problem (as outlined below), after initial lethargy I wanted to solve it.

Imagine some time sheet data as shown below. What is the total time each employee worked?

Of course, we can whip up a SUMPRODUCT concoction strong enough to knock off Mike Girvin unconscious* for a few milliseconds to answer the question.

As shown on this forum thread, there are several fun, creative and smart ways to answer each employee’s total duration using such complex formulas.

* Stop kidding yourself. Mike’s tougher than SUMPRODUCT.

We can cheat by first reshaping the data to something like this.

But how? Simple, we take a saw and chisel to the data. I meant power query of course.

Last Friday, I took a wee walk up to Mt. KauKau to enjoy the views and workout my calf muscles. I couldn’t help but share that beautiful moment with you all. So I recorded a video showcasing beautiful views and introducing the problem. Then, I recorded rest of it from the comfort and warmth of my home. So here we go:

Watch it on our YouTube channel.

**Click here to download the example workbook**. Edit any of the queries to see all the steps. Try replicating them yourself to learn more.

Let me confess. My first solution to this problem is based on SUMPRODUCT. But that is because I didn’t have Power Query on that computer. But I later tried solving it with PQ and I just love the simplicity and power of that solution. Anytime I am wrangling poorly shaped data, I use Power Query instead of a mashup of formulas.

**What about you? **Which option do you prefer and why? Please share your thoughts in comments.

Guess what it is?

**F4**.

That is right. The mighty F4 key. You can use it to repeat any action.

Jo was using it to insert rows in her workbook. After inserting first row (using CTRL+ of course), she would press F4 to add more rows as needed.

So next time you find yourself doing the same thing several times, use F4 and save yourself a few keystrokes or mouse moves.

**Your mission: What is your partner’s shortcut of the moment?**

Take a short break from work. Call / message / ask your partner (or friend, mate, parent or kid) what their shortcut of the moment is? Post it here in the comments section.

What… they don’t use Excel much? May be they should.

More reading: Complete list of Excel shortcuts

]]>

**Start your bar (or column) charts from zero.**

To illustrate why you should do this, let me share a personal example.

Over the weekend, *the *Jon Peltier visited Wellington. He is staying with Jeff (who occasionally guest blogs on Chandoo.org). On Sunday, we all decided to hike up a small mountain near my house for a leisurely family picnic.

While on the top of the mountain, Jo (my wife) took a few pics of us three Excel geeks. As we were standing on a sloping mountain face this is how the pictures look:

Looking at the picture on left, you would confidently say that I am way shorter than other two. But picture on right tells a different story.

Of course, the reality is somewhere between two pictures. It is difficult to conclude who is tall, who is short just looking at the pics simply because the *baseline is sloping. *

**But we can’t have sloping baselines in Excel charts**

You are right we can’t. But we can still confuse people with an arbitrary axis start. Like below:

**The fix? always start your axis at zero for bar (or column) charts:**

Simple. Set the axis start point to zero (Select axis, press Ctrl+1, and from Axis options set minimum to 0).

So there you go. The shortest Excel charting tip ever, but still stands tall when it comes to telling great stories.

**More charting principles:**

If you are in the mood for some more charting theory and elegant methods, check out below links.

]]>**How to show selective sub-totals in Pivot Tables**

First instinct suggests that using Design ribbon > Sub-totals we can tell pivot table how we want the sub-totals.But this is like a master switch. It will turn off or on all sub-totals in the report. What we need is sub-totals only at one level of the report (in this case, sub-totals by department, but not by month).

The answer is simple.

- Click on any month label. This will select all months in the row label area.
- Right click > Field settings (you can also access this from Analyze Ribbon > Field Settings)
- In the “Subtotals & Filters” tab, choose “None”
- You are done.

Happy Pivoting. I am off to a short spin on my cycle before heading to work.