Thursday, June 5, 2014

When Correlations Go To One

Frequent readers know that I am always reading something.

Recently a co-worker lent me Nate Silver's The Signal and the Noise. In the first chapter the author discussed the financial crisis and how mortgage pools change based on the correlation of their risks.

In prior posts (see Mei's Monte Carlo Adventure for example) we have modeled investment portfolios assuming that the correlation between securities was constant. We have always cautioned, however, that this assumption may not be correct.

In his book, Mr. Silver provided some data on how default rates in mortgage pools change under both an uncorrelated assumption and a perfect correlation assumption, and I thought it would be fun to replicate this and to take a deeper dive into what happens when correlations change.

The Uncorrelated Pool

Say we have bought 5 mortgages, each with a 5% risk of default. If this rate is too risky for some investors, we can use the mortgages to create a different type of security that they can invest in.

We place the 5 mortgages together, creating a "pool". We then sell 5 securities to the market, A through E, whose payout will be based on order of default. Securities A through E are called "tranches" of the pool.

If all mortgages pay out, all 5 of our created securities pay investors. If one defaults, then investors in A through D get paid but E does not. If two default, investors in A through C get paid but D and E do not, and so on.

Figure A
Mortgage Pools
Five mortgages are pooled and five new securities are created. Payout for each security is based on the number of defaults.

Figure A depicts this structure.

One way to think about this is to imagine we have a large soup pot on the stove. The payments from the five mortgages are the ingredients that go into the soup.

After we stir a little bit, we ladle out the soup, only we do so in a certain order. Person A always gets the first ladle full, Person B the second, and so on. If one of the mortgages defaults, so we only have 4 ladles full of ingredients going into the soup, then there will not be enough for person E when they show up holding out their bowl.

Figure B
Uncorrelated Default Rates
The first column is the occurrence of the specific event, answering the question 'what percent of the time were there exactly x defaults?'. The second column is cumulative occurrence, answering the question 'what percent of the time were there at least x defaults?'

If we assume that the mortgage defaults are uncorrelated - meaning that if one defaults it in no way indicates that any of the other four will default - then the probability of default for each of our newly created securities will be close to the results in Figure B, which shows R output for a simulation of 10 million runs. Sometimes we will hear the phrase 'the variables are independent', which is stat-speak for uncorrelated.

These results match (with rounding) those produced in Mr. Silver's book.
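For readers who want to replicate Figure B themselves, a minimal R sketch along these lines will do it. The run count is dialed down to one million to keep it quick; it still gets very close to the figures above.

    n_runs <- 1e6                         # Figure B used 10 million runs; fewer still get close
    defaults <- rbinom(n_runs, size = 5, prob = 0.05)   # number of defaults in each simulated pool

    table(defaults) / n_runs                             # 'exactly x defaults'
    sapply(1:5, function(x) mean(defaults >= x))         # 'at least x defaults'
    # ~22.6% chance of at least one default; ~0.12% chance of three or more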

Back to our mortgage situation: we had the case where an investor would not purchase the individual mortgage securities because of their 5% default risk.

By creating our pool structure, we have created mortgage securities where three of them (A, B and C) have less than 0.12% probability of default, a much less risky proposition for these investors.

Of course, these results only hold if the securities are in fact uncorrelated.

The Correlated Pool

In the prior example the mortgages going into the pot were uncorrelated, meaning that the behavior of one does not impact the other.

In our simulation, every now and then a mortgage would default and person E would not get paid. In a few instances 2 mortgages defaulted at the same time, but the others did not. In well over 99% of the simulated cases no more than 2 defaulted at any one time, so persons A through C would almost always get paid.

However, if the mortgages are correlated, meaning they move more "together" rather than "independently", this dynamic changes.

At the extreme, if the correlation is 1 (the highest a correlation can go), then when one defaults all five of them default. This means either there is soup for everyone or soup for no one - nothing in between.

In this case, carving out different tranches of the pool provides no benefit whatsoever. The investor may as well have invested in one mortgage with a 5% chance of default, because that is the same probability their perfectly correlated tranche security offers.

Figure C
Bond Prices By Tranche
Bond prices are calculated based on (loss given default * probability of default)/risk-free rate

This brings us to an interesting point. When we create a structure based on one assumption, a change in that assumption creates both winners and losers. Figure C shows theoretical bond prices based on the probability of default and 100% loss under that scenario.

Under the uncorrelated scenario, tranche E is worth about 73.69 cents on the dollar, since it is the tranche that gets hit with the vast majority of defaults.

However, under the perfect correlation assumption it is worth a little over 90 cents on the dollar. All the other tranches experience decreases in value from the uncorrelated to perfectly correlated scenarios.

In total, however, the entire pool of bonds is worth the same amount. This has to be so, as we have not changed the inherent risk of the mortgages (i.e. 5% default probability) in either case, only the correlation.
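As a rough check on Figure C, the prices quoted above can be reproduced with a short calculation. The post does not state the risk-free rate explicitly, so the 5% used below is an assumption (it is the rate that reproduces the 73.69 and ~90 cent figures), along with the stated 100% loss given default.

    p <- 0.05; rf <- 0.05                        # default probability; assumed risk-free rate
    # tranche k (E=1, D=2, C=3, B=4, A=5) defaults when at least k of the 5 mortgages default
    pd_uncorrelated <- sapply(1:5, function(k) sum(dbinom(k:5, size = 5, prob = p)))
    pd_correlated   <- rep(p, 5)                 # perfect correlation: all-or-nothing

    price_uncorrelated <- (1 - pd_uncorrelated) / (1 + rf)   # ~0.7369 for E, ~0.9524 for A
    price_correlated   <- (1 - pd_correlated) / (1 + rf)     # ~0.9048 for every tranche

    sum(price_uncorrelated); sum(price_correlated)           # the pool totals are identical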

The Spaces In Between

Between the perfectly correlated and the perfectly uncorrelated, we have the partially correlated.

In Cholesky to the Rescue, we discussed how to create partial correlation between different simulated events according to a target correlation matrix.

But in that situation, we were using the normal distribution. In the mortgage default case, we are using the uniform distribution, one that ranges between 0% and 100%. There is no normally distributed mean or standard deviation to work with like there is in the former case.

How do we go about creating a simulation that creates correlation between variables while staying true to the original variables' uniform distribution parameters?

For one answer to this, we can turn to the work done by Enrico Schumann in his post Generating Correlated Uniform Variates.

In this approach, we perform 4 steps.

    ::Create our correlation matrix
    ::Convert the matrix to Spearman correlation
    ::Simulate normally distributed variables using the Cholesky decomposition of this matrix
    ::Convert the normal variables to the uniform distribution by using their p-values (probability of occurrence)

Under this method, we can use the well-developed normal distribution processes and simply convert the results to the uniform distribution in a manner that maintains the correlation.
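A minimal R sketch of this four-step recipe might look like the following. The specific Spearman-to-Pearson adjustment, 2·sin(πρ/6), is one standard choice and is our assumption about what step 2 refers to; mapping through pnorm in step 4 is monotonic, so the rank correlation survives the trip to the uniform scale.

    rho_target <- 0.5                           # desired correlation between the two uniforms
    rho_normal <- 2 * sin(pi * rho_target / 6)  # step 2: adjust for the normal intermediate step
    C <- matrix(c(1, rho_normal, rho_normal, 1), nrow = 2)   # step 1: correlation matrix

    Z <- matrix(rnorm(2 * 100000), ncol = 2) %*% chol(C)     # step 3: correlated normals via Cholesky
    U <- pnorm(Z)                                            # step 4: p-values give uniform variates

    cor(U, method = "spearman")                 # close to rho_target
    hist(U[, 1])                                # still flat, i.e. uniform between 0 and 1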

Figure D
Default Probability at Different Correlation Levels
As correlation trends towards 1, default probabilities converge to .05

Figure D shows the default rates for each tranche of our pool resulting from one thousand simulations performed at each level of correlation (to the second decimal point, or for each .01). During each simulation the variables were run 100 times, so the total for each correlation level is 10,000.
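A sketch of that experiment is below. It assumes a single common correlation among all five mortgages (an equicorrelation structure, since the post does not show its actual correlation matrix) and reuses the correlated-uniform generator from the previous section.

    tranche_pd <- function(rho, n_runs = 10000) {
      rho_n <- 2 * sin(pi * rho / 6)               # Spearman adjustment as above
      C <- matrix(rho_n, 5, 5); diag(C) <- 1       # assumed equicorrelation among the 5 mortgages
      U <- pnorm(matrix(rnorm(n_runs * 5), ncol = 5) %*% chol(C))
      n_defaults <- rowSums(U < 0.05)              # a uniform below 5% marks that mortgage as defaulted
      sapply(1:5, function(k) mean(n_defaults >= k))  # default rates for tranches E, D, C, B, A
    }

    correlations <- seq(0, 0.99, by = 0.01)        # stop just short of 1, where the Cholesky factor breaks down
    pd_curves <- sapply(correlations, tranche_pd)  # 5 x 100 matrix of default rates
    matplot(correlations, t(pd_curves), type = "l")

The resulting curves should behave like Figure D: every tranche's default rate drifts toward 5% as the correlation rises.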

As the correlation approaches 1, each security's default rate converges to 5%. The prices of these securities, shown in Figure C, will change as the lines in Figure D change (with the opposite sign, so increases in Figure D mean decreases in bond prices).

The impact on each tranche varies. Tranche D's probability of default starts to increase almost immediately (and thus its value decreases) as things become more correlated, while Tranche A's holds out almost to the very end, but then increases rapidly.

Tranche D also has the interesting dynamic that its probability of default increases and then decreases - the only security to do so. All the others only increase or only decrease.

What You Can Do

Correlation is a condition often stipulated for a model but not always completely thought out. Yet its impact in certain situations can be quite significant. Some questions to ask when analyzing our models or the results from them are:

What happens if our correlation assumption changes? - using Figure C as our example, asking this question allows us to consider the fact that our correlation assumption will drive a 5% change in our investment's value.

Figure E
Average and 100-Day Rolling Correlation
Average correlation for entire time period is shown by the colored dashed line, and the rolling 100 day correlation by the solid lines

How strong is our correlation assumption? - For whatever period of time we calculate a correlation, we are inherently assuming that it will apply to the time period going forward. This is almost never the case. Figure E shows the correlation between Short and Long Term Fixed Income funds and US and Global Equity Funds (original data from Should You Rebalance Your Investment Portfolio?) from 1996 to 2012. The entire period correlation is almost never equal to the 100 day correlations, and the 100 day correlations can change dramatically in relatively short periods of time. In addition, there are distinct correlation differences in different time periods, such as equity correlations being slightly higher than .5 (and volatile) for much of the 90's compared to being close to 1 during the past few years (and not so volatile).
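One way to see how strong (or weak) our own correlation assumption is, in the spirit of Figure E, is to compute a rolling correlation and set it against the full-period number. A base-R sketch follows; ret_a and ret_b are hypothetical names for two daily return series of equal length.

    rolling_cor <- function(a, b, width = 100) {
      out <- rep(NA_real_, length(a))
      for (i in width:length(a)) {
        window <- (i - width + 1):i
        out[i] <- cor(a[window], b[window])   # correlation over the trailing 100 observations
      }
      out
    }

    # full_period <- cor(ret_a, ret_b)                         # the single dashed-line number
    # plot(rolling_cor(ret_a, ret_b), type = "l"); abline(h = full_period, lty = 2)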

Insist on Scenario Analysis - one way to achieve the objectives in our first two "What You Can Do" items is to insist on reviewing the results of scenarios where the correlation conditions have been altered. For example, looking at the Equity correlations in Figure E, some possible scenarios are "Maintain recent high correlation", "Regress to the average", "Intermittent periods between high and low", "Return to the 90's", and "Go to Zero". Comparing the results of these scenarios will help us to understand the risk potential, though not necessarily the probability of each.

Learn From the Traders - a book I read once (though the name of it escapes me) made the claim that in security trading "every trader's position will eventually get wiped out", implying that trading is fundamentally a race to cash out before that eventuality occurs. Long Term Capital Management did great...until it didn't. Mortgage bond traders and banks did great...until they didn't. So it will go with correlation: at some point the uncorrelated will go to 1, or the strongly correlated will go to 0. We need to make sure we are clear about what we will do when that occurs.

Key Takeaways

The assumptions that go into a model are essential to the results it generates. Investigating what can occur if a) the assumptions change, or b) are simply incorrect, is an essential component of building an actionable analytical framework.

Questions
    ::What assumptions have you challenged that significantly impacted the results of a model's output?
    ::What is the best way to proceed if correlation is uncertain?
    ::Do you agree that eventually "all correlations will go to 1" or "all correlations will go to 0"? Why or why not?

Add to the discussion with your thoughts, comments, questions and feedback! Please share Treasury Café with others. Thank you!


Tuesday, April 8, 2014

The Madness Isn't Over Yet!


We have come to the end of the time period US sports fans refer to as "March Madness" - 68 collegiate basketball teams have competed in a single elimination tournament - and UConn has been crowned "national champion".

In the past we have used this event as a reason to perform a statistical project (see Who Will Be in the Final Four - Lessons From an Analytical Journey).

This year we'll do things a little differently.

Be Strong!

George Gallup founded the firm that bears his name in the 1930's. Their initial claim to fame was public opinion polling. Results of their inquiries were often cited in news media (tv, radio, print, etc.), especially around election season. They have to be one of the original data analytics firms.

At the turn of this century a group of Gallup consultants authored a pair of best-seller books - First, Break All the Rules and Now, Discover Your Strengths - based on Gallup insights developed over the years from their work with companies and industries.

One of the main themes coming out of this research was, as this Forbes article states, a "ceaseless belief that people should focus on making the most of their talents, rather than struggling to mend their flaws".

One of the ways this was described in the books was that the neurons in our brains form millions of connections with each other, which we can think of like a transportation network. Some connections, reinforced through time, are like super-highways, while others, rarely used, are like back country dirt roads.

The super-highways become strengths, while the others do not.

Ultimately, Gallup distilled the number of these strengths to a total of 34 (you can see a listing of these at this Wikipedia entry).

Are Some Strengths Better Than Others?

The question that has always come up for me is this: while all the strengths are positive, it would make sense that for the finance practitioner some are better than others.

Put in a slightly different way, if we were given a listing of 20 people, and the only information we had about these folks was their top 5 strengths, which one should we hire?

So to have some fun with NCAA "bracketology" while seeking insight into the issues above, this year we will focus on the StrengthsFinder tournament!

Setting Up The Tournament

Unfortunately, 34 strengths do not fit neatly into a bracket structure, since in each round we need a number of teams that is a power of 2 (2, 4, 8, 16, 32, etc.).

In order to achieve this, we have a 'play-in' round, where 4 of the strengths compete for 2 spots. Once this round is complete we will have a field of 32, which can then be arrayed in the normal bracket structure.

The 34 strengths were taken and randomly assigned to two groups: A or B. In each of these groups, a further random assignment of 'seeds' was performed, from 1 to 17. In the NCAA structure, a 1 plays a 16 in the first round, a 2 plays a 15, and so on.

The play-in games were between the strengths identified as the 16th and 17th seeds.
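For the curious, the random assignment described above can be reproduced in a few lines of R. The strength names below are placeholders rather than the actual 34 themes, and the seed value is arbitrary since the original draw is unknown.

    strengths <- paste("Strength", 1:34)                 # placeholder labels for the 34 themes
    set.seed(2014)                                       # any seed; the original draw is unknown
    shuffled <- sample(strengths)
    bracket <- data.frame(group = rep(c("A", "B"), each = 17),
                          seed  = rep(1:17, times = 2),
                          strength = shuffled)
    subset(bracket, seed >= 16)                          # the four play-in participants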

For the play-in games, I solicited votes within my Twitter and Linked In communities, randomly selecting individuals to answer the question: "Which of the following strengths is it more important for a Finance Person to possess?".

Play-In Teams

The first play-in is between:

Activator - one who acts to start things in motion. People strong in the Activator theme can make things happen by turning thoughts into action. They are often impatient

Analytical - one who requires data and/or proof to make sense of their circumstances. People strong in the Analytical theme search for reasons and causes. They have the ability to think about all the factors that might affect a situation

And the second play-in is between:

Empathy - one who is especially in tune with the emotions of others. People strong in the Empathy theme can sense the feelings of other people by imagining themselves in others’ lives or others’ situations.

Responsibility - one who, inexplicably, must follow through on commitments. People strong in the Responsibility theme take psychological ownership of what they say they will do. They are committed to stable values such as honesty and loyalty.

Results

Figure A
Play-In Game Results

Figure A shows that both of these games were blow-outs, with Responsibility and Analytical sprinting easily to the win.

Given the number of job roles the finance profession has with the word analyst in it - financial analyst, budget analyst, business analyst, cost analyst, etc. - it only makes sense that this strength would prevail over Activator, which seems better suited to business entrepreneurs.

Yet, as @fundingprofiles, one of the survey respondents, noted "Activators are rare in finance. Analytical minds are too, but not as rare", suggesting that while the talent is valuable not many actually have it!

Perhaps those job titles are simply a matter of wishful thinking?

In the second matchup, Responsibility came out well ahead of Empathy. As a fan of Design Thinking (see Should CFO's Be Design Thinkers), I found this a disappointing result, since Empathy plays a major role in that process.

Yet Responsibility involves "following through on commitments" and "honesty", qualities that are valuable when the books need to get closed and accurate numbers need to get reported. So while it might be nice to have a Design Thinking mindset, the ongoing day-in, day-out requirements of the finance role make the one strength more valuable than the other.

As one anonymous respondent stated "Both are good but Empathy without Responsibility would not work". Amen to that brother or sister!

The Bracket

With the play-ins complete, the Finance Strengths bracket now looks like that shown in Figure B. There are lots of potential items to discuss as we move through the "games", so stay tuned, either by adding this blog to your reader, following me on Twitter, Linked In or Google + (buttons for all those are located along the right side of this blog).

Figure B
Finance Strengths Brackets

What You Can Do

Vote in one of the next round matchups!

Having first-hand knowledge of the abuse that is the half-hour-long survey, I designed each of the following to take less than a minute of your time - each one has only two match-ups. You don't have to do them all, but completing one would certainly help out for the next round.

Thank you in advance for participating!





These are half of the first round games. Others to come once these results are in!

Key Takeaways

Finance people need to have Analytic strengths and a strong sense of Responsibility.

Questions
    ::Do you agree that Empathy without Responsibility would not work for a finance role?
    ::Are Analytic strengths rare in those who hold finance positions? Can you share a story about this?

Add to the discussion with your thoughts, comments, questions and feedback! Please share Treasury Café with others. Thank you!



Tuesday, March 11, 2014

Is My Weighted Average Cost of Capital WACC - y?

There are a lot of places where we can get the basic formula for a firm's Weighted Average Cost of Capital (often abbreviated as WACC).

Once we have calculated this figure, it can be employed in a number of settings - investment valuation, performance evaluation, industry comparison, etc.

Yet what is often not easy to find are the issues that arise during this employment - either nuances present in its calculation or the traps that can occur if we rotely employ it thereafter.

What Is Capital Structure?

We use the term "Capital Structure" to refer to the variety of instruments the firm has used to finance itself - short-term debt, long-term debt, preferred equity, common stock, options, warrants, etc.

These items are on the right side of the balance sheet. The left side of the balance sheet has assets. If we net out the liabilities that are not investments (i.e. where the providers of those funds do not expect a return, such as Accounts Payable), then our balance sheet will show how these assets have been financed by the firm (see "How to Calculate ROIC" for an example of this netting).

Figure A
Capital Structure

Our balance sheet will provide information like that shown in Figure A - it tells us what classes of assets the firm has and what instruments have been used to finance them.

The money the firm has received for each of these financial instruments comes from investors. In this context we define investors as those seeking a return for the funds they have provided. Banks and bond holders require interest, preferred equity holders require dividends, and common stock holders may require dividends and want to sell their shares at a higher price than what they paid for them.

Each of these investors has a different expectation about what they will receive. For example, a short-term debt investor will generally expect a lower interest rate than a long-term debt investor.

These "return expectations" are in large part driven by the market. We'd all love to see the stock we own quadruple in value over the next year, but it would be unrealistic to expect this result.

Opportunity Cost

The fact that an investor's expected return is driven by the market is due to the concept of opportunity cost.

Opportunity cost reflects the fact that we cannot have everything, and in order to get something we must give up something else. For a personal example, if we decide to go to the movies tonight, then we are not going to play soccer, have a long leisurely dinner, practice guitar, etc. during that time instead.

Financially, if we invest ♢ 10 (for new readers, the symbol ♢ stands for Treasury Cafe Monetary Units, or TCMU's, freely exchangeable into any currency at any rate of your choosing) in X, then we cannot invest it in Y.

Thus Y "sets the bar" for X. If we would expect to make 10% per year investing in Y, then we need to do slightly better to choose X instead - perhaps 10.000001%.

The caveat on this is that the cost of capital is evaluated against investments with the same level of risk. We do not evaluate X against Z if Z is a "risk free" instrument and X is a risky one.

Weighted Average Cost of Capital I

So far we have established that a firm has a capital structure, made up of a number of financial instruments, and that each of these instruments has a cost of capital associated with it.

Figure B
Weighted Average Cost of Capital Formula

The Weighted Average Cost of Capital is the proportional sum of the costs of the different instruments in our firm's capital structure.

Generically, absent other factors, it can be computed using the formula in Figure B.

Note that the formula in Figure B is hardly ever used in practice for the simple reason that, absent other factors, the weighted average cost of capital for the different financial instruments must equal the cost of capital of the firm as if it were funded with only equity. This is the Nobel prize winning Modigliani-Miller theorem.

For example, let's say that Joe's Agricultural Empire, LLC is in a business where the level of risk is priced at 10% in the market. In other words, assets with the same level of risk are going for 10%, and therefore investors require a similar rate of return for Joe's.

Suppose that Joe's produces ♢ 10 of cash flow for investors per year, and is expected to do so forever. We can use the dividend discount formula to calculate the equity value of Joe's at ♢ 100 (10/0.1).

Now imagine that we issue a financial instrument that's worth 50% of the value of Joe's, and the market prices the risk of this security at 8%. Why does the market price this differently? Because the risk characteristics of this security are different than those of the underlying assets.

It could be that it has a priority claim on the cash flow, meaning these investors get paid first, before the equity investors, so in the years where the expected cash flow of ♢ 10 is not met, they get a bigger slice of the pie. Or it could be that the payout to these investors is protected from some of the risks Joe's faces, such as market prices (perhaps for a cost input or revenue generating factor).

Figure C
Weighted Average Cost of Capital Formula

Again using the dividend discount formula, after using algebra to rearrange it, the cash flow to these investors is ♢ 4 (4/.08 = 50, which is 50% of the value as stipulated).

The key point for the pricing is that, for the market to price this instrument differently, there needs to be a reason for it related to risk. If the opportunity cost for this instrument is 8%, it must have a risk characteristic different from Joe's as a whole to make it that way.

The equity investors will get the remaining ♢ 6 of the expected cash flow, and the value of their investment is ♢ 50 (half the firm's value as stipulated), so their opportunity cost is now 12%. Figure C shows the calculations for this example.

Again, the reason the equity is priced differently is related to risk. Recall that some of the risks for the new instrument's investors were 'taken off the table', resulting in their pricing the new instrument at 8%. Yet, the remaining risk does not go away. It is therefore now disproportionately borne by the equity holder. Thus, higher level of risk, higher level of return required.

We now have a capital structure of 2 instruments, one at 8% and one at 12%, each representing 50% of the firm value. The weighted average cost of capital is 10% (50% * 8% + 50% * 12%), the same as before!
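A few lines of arithmetic (sketched here in R) confirm the example:

    r_assets <- 0.10
    firm_cf  <- 10
    firm_value <- firm_cf / r_assets            # perpetuity value: 100

    new_cf <- 4;  new_rate <- 0.08
    new_value <- new_cf / new_rate              # 50, half the firm as stipulated

    equity_cf <- firm_cf - new_cf               # the remaining 6 of cash flow
    equity_value <- firm_value - new_value      # 50
    equity_rate <- equity_cf / equity_value     # 0.12

    0.5 * new_rate + 0.5 * equity_rate          # weighted average: 0.10, back where we started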

This leads to the Modigliani-Miller conclusion: how the firm is financed does not matter! We can think of it like squishing a balloon - if one part is less the other part becomes more. The volume of air in the balloon is the same, it is merely distributed differently. The same for the risks in the firm and how they are shared by investors.

The key takeaway from this is one of simple common sense: there is no such thing as a free lunch.

Weighted Average Cost of Capital II

The natural question to ask after this exercise is "why bother"? If the parts equal the whole, why bother breaking it down into parts when all that happens is that we get back to the same place? If financing doesn't really matter, let's keep it simple.

The key phrase from the last section is "absent other factors". There are other factors.

The biggest of these is taxes. Because interest is a deductible business expense, financing all of a sudden can drive value through the creation of the so-called "Tax Shield".

Figure D
Weighted Average Cost of Capital Formula

Figure D shows the value of the two components shown in Figure C with the addition of the interest expense deductibility. The example uses 40% as the assumed tax rate. Ultimately, financing with 50% debt adds about ♢ 13 of value to the firm through this deductibility.

Notice that the final capital structure is no longer 50% for both of the securities - the one is worth more than the other.

We can get back to 50% if we iterate through the valuation a few times. Set debt at 50% of ♢ 113, see how close we get to 50% with the new tax shield, adjust, and repeat again and again.

Figure E
Weighted Average Cost of Capital Formula

The other way this problem is solved is to modify our Weighted Average Cost of Capital equation to that shown in Figure E.

In this figure we have added a second term to the formula to deal with instruments that generate tax shields. If an instrument does not generate tax shields, it is evaluated by the first term in the equation. If it does, it is evaluated by the second term.

Figure F
WACC Example Calculation

We can now go back to our Joe's Agricultural Empire LLC example. Figure F shows that by using our new formula, we calculate a weighted average cost of capital of 8.4%. Using this rate in our dividend valuation model, we arrive at a value of a little more than ♢ 119 for Joe's.
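In R, the Figure F numbers work out as follows, using the 40% tax rate, the 50/50 weights, and the 8% and 12% costs from the earlier example:

    tax <- 0.40
    wacc <- 0.5 * 0.08 * (1 - tax) + 0.5 * 0.12   # 0.084, i.e. 8.4%
    10 / wacc                                      # unlevered cash flow valued at the WACC: ~119.05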

The beauty of the WACC approach is that we have captured the tax shield generation in the discount rate through the (1-t) term, so in the numerator we use the cash flows from the example before (called "unlevered" because they do not account for debt). We do not have to go through the effort of modeling the cash flows of all the various instruments in order to value the company, the WACC has done this all for us!

Simplicity has its benefits!

And it also has its drawbacks.

Problems with WACC

Using the Weighted Average Cost of Capital makes a number of assumptions.

Figure G
Term Structure Trap
Given the term structure in the upper portion, the payments in the first column of the lower portion are discounted in two ways, the first using the term structure and the second using the Year 3 debt rate. This results in an error of slightly more than 4% in valuation.

First, it assumes that rates are constant through the period we are considering. This is hardly ever the case in reality. Interest rates generally exhibit a "term structure", they vary depending on how long the debt will be outstanding. If we are going to pay off debt in 1 year, this rate is almost always lower than if we are going to wait 30 years to pay it off.

The general approach most take is to use the debt rate that reflects the length of time we are considering. If we need to use our WACC to value a 10-year payment stream, we will use the 10-year debt rate. If it's 20 years, we will use a 20-year rate, and so on.

Figure G shows an example of this shortcut and how it results in a mis-valuation of the payments. We use the 3-year debt rate to value 3 years' worth of cash flows. Because the cash flows are "lumpy" (not the same in every year), their valuation is driven more by the shorter-term rates than by the 3-year rate.
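Figure G's actual rates and cash flows are not reproduced here, so the numbers below are purely illustrative, but they show the mechanics of the trap: an upward-sloping term structure plus front-loaded ("lumpy") cash flows means the flat 3-year rate misprices the stream.

    term_rates <- c(0.02, 0.04, 0.06)          # assumed 1-, 2- and 3-year rates
    cash_flows <- c(60, 30, 10)                # assumed front-loaded payments
    years <- 1:3

    pv_term_structure <- sum(cash_flows / (1 + term_rates)^years)
    pv_flat_3yr_rate  <- sum(cash_flows / (1 + term_rates[3])^years)

    pv_flat_3yr_rate / pv_term_structure - 1   # negative: the flat-rate shortcut undervalues the stream by a few percent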

The solution to the above would be to use the actual term structure, weighted in accordance with the cash flows being evaluated by the WACC. Yet this can be difficult to do. Most companies do not have enough debt outstanding to establish a term structure on their own. In addition, there are points in the "term structure" spectrum which are more liquid than others - generally maturity related, such as 5 years, 10 years, and 20 years - which can impact pricing. There is less activity off these cycles, so if we have a 13-year cash flow stream it can be difficult to establish a term structure.

Awareness of this dynamic can be critical. If a company has used the WACC calculation to establish a "hurdle rate", which is the rate at which all investments will be evaluated, and has used long term debt rates to do so, then the business unit manager who has a 1-year project may be at a disadvantage, as their hurdle rate would be much lower had the WACC been calculated according to their shorter time horizon. This places the firm in a position where they will reject short term projects that would have in fact added value to undertake.

A second issue is the underlying assumption that the capital structure is constant throughout the term. Because we have calculated the value of tax shields and placed this cost of capital into the WACC, when this rate is used to discount cash flows the capital structure making up the WACC is assumed to be in place at that point in time.

So if we have a 13-year project we are valuing using our WACC, we are assuming that the capital structure (say the 50-50 in our Joe's example earlier) is in place the entire time.

This again diverges from reality for a number of reasons.

First, capital structure is fluid, yet not generally thought of in that way. When we want to compare our capital structure to others, or to earlier periods in our history, we generally use the balance sheet to obtain these numbers. The balance sheet is a "snapshot in time" portrayal of the assets and liabilities of the firm on a certain date. Yet, the assets and liabilities at a quarter end (the time for which balance sheets are usually prepared) may look quite different from the assets and liabilities in mid-month.

Figure H
Debt Capitalization Comparison

Second, a lot of the debt companies issue has "bullet" maturities, meaning all the principal is due at the end of the debt term in one lump sum payment. This will lead to the capital structure being uneven.

Figure H shows the debt percentage of a project's capital structure for an amortizing debt instrument and a bullet debt instrument. The bullet debt instrument is consistently higher than 50% of the capital structure for all years except for the first.

Figures I and J show the detail behind this example. Note that for the amortizing debt case in Figure I, the respective internal rates of return (IRR) for both the debt and equity match their costs of capital as used in the WACC formula. In Figure J, they do not.

Figure I
Financed With Amortizing Debt
Cost of capital assumptions in the top left. Valuation of the cash flows using this are in the top right. The distribution of the cash flows for return on capital and return of capital are in the middle left portion. The distribution of cash flows for the debt and equity portions are on the right. An internal rate of return comparison is on the bottom left.
Figure J
Financed With Bullet Debt
Cost of capital assumptions in the top left. Valuation of the cash flows using this are in the top right. The distribution of the cash flows for return on capital and return of capital are in the middle left portion. The distribution of cash flows for the debt and equity portions are on the right. An internal rate of return comparison is on the bottom left.
Figure K
Perpetuity Financing

The Figure I and Figure J comparison highlights the fact that risk is impacted by how the debt payments are structured. In Figure I, the debt is amortized in proportion to the value 'consumed' by the project during that period of time. In Figure J, it all comes due at the end of the project, which means that equity holders have received their returns earlier, and actually have to contribute amounts in the final year to pay off the debt. This places the debt holders in a much more risky position than the ones in Figure I.

Figure K shows the case when the cash flows to be valued will go on forever (known as a perpetuity). In this instance, we can value the cash flows using the WACC or separately as debt and equity and arrive at the same value. This occurs because we have not violated the constant capital structure condition.

Another common problem occurs when we need to analyze a new project or investment. Since the WACC is driven by opportunity cost, our new project or investment will require a different WACC if the risk level is different than our other assets.

Going back to our Joe's example, we have a WACC of 8.4%. Let's say the company is evaluating whether it should buy some trucks to transport all their products around. Using the WACC calculated earlier will not be appropriate, because that was based on the opportunity cost of the agriculture business, not the trucking business. Owning and operating a fleet of trucks has an entirely different risk profile, and therefore requires a different WACC.

A shortcut often used is to take the face value of debt as a proxy for the market value. This is due to the fact that market values of debt can be difficult to determine. Debt securities are not traded every day, and there is no end-of-day ticker to establish value.

Sometimes face value is a 'good enough' approximation, but sometimes it is not. In Figure K, our WACC was established on 8% debt. If interest rates have gone down to 6% since that debt was issued, it will now be worth ♢ 79.37 rather than ♢ 59.52, a 33% difference!
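The arithmetic behind that revaluation, assuming the Figure K debt is a perpetuity (as the figure's title suggests):

    face_value <- 59.52              # the debt piece of Joe's ~119 value, priced at 8%
    coupon <- 0.08 * face_value      # ~4.76 per year, forever
    market_value <- coupon / 0.06    # ~79.37 once the market rate is 6%
    market_value / face_value - 1    # roughly a 33% difference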

What You Can Do

While the WACC calculation, as shown through these examples, has its fair share of warts, it nevertheless still has benefits.

It's a useful data point if taken in moderation - estimating our Weighted Average Cost of Capital provides a metric for us to refer to when valuing projects and opportunities, or for evaluating performance. This is better than nothing. The thing we need to remember is that it is not a 'be all, end all' calculation, but one piece of information among many.

Triangulate the WACC - calculating our WACC using different assumptions (changing the debt terms for instance) gives us a sense of the range of the measure. So while we may not be entirely confident that our 'true' cost of capital is 8.46731%, we may establish that it is likely to be somewhere between 8% and 10%. Another way we might establish a range is to calculate it for other firms that are similar to ours.


Key Takeaways

The Weighted Average Cost of Capital is commonly used to value a series of cash flows. However, it is prone to error because the underlying assumptions often do not match reality and/or the shortcuts applied in practice distort its value. Because of this, one should consider this calculation to be an approximation or indication rather than an absolutely correct value.

Questions
    ::When was the last time you applied the WACC in an analysis?
    ::How many of the assumptions and shortcuts mentioned have you seen practiced?
    ::What other factors cause imprecision in the WACC calculation?

Add to the discussion with your thoughts, comments, questions and feedback! Please share Treasury Café with others. Thank you!

Monday, February 17, 2014

Why Projects and Budgets Have Fluff - a Lognormal Explanation

Tuck Wallace, the company's CFO, was calling her. She could see it on the Caller ID.

"This is Mei" she answered. Protocol dictated that you never acknowledged that you knew who was calling, even if you did.

"Hi Mei, it's Tuck Wallace" he said, also following protocol (he knew everyone had Caller ID too!).

"Hi Tuck, sure hope you weren't involved in that massive freeway tie up due to the bad weather"

"No, thank goodness. I come in from south of the city and the problems today were mostly north of town. Did you?" he asked.

"I'm usually in before there's too much traffic, so it was a little slow today but manageable. What can I do for you?"

"How soon do you think before you get a complete set of numbers for the budget?"

Mei laughed. "Tuck, I thought you knew better than to ask a question like that!"

The lognormal curve, one of a class of asymmetric statistical distributions, has many uses, but only rarely do we get to use it to understand our company's communication and behavior.

The Twin Tension

Eliyahu Goldratt, a pioneer in the management system known as the Theory of Constraints, says that a manager needs to guard their reputation within the organization against two things - being considered unreliable and being considered uncredible.

The consequences of this are dire for the person. If we are unreliable, then when the big, skill-developing, challenging, and high-impact projects come along, we will not get picked for them.

If we are not credible, then we are in a situation similar to the "Boy Who Cried Wolf" - we are not believed even when we should be believed, because our reputation has preceded us. Thus, our insights, advice, and analysis will be ignored during the company's decision making process, eliminating our ability to impact the organization and/or our industry in any meaningful way.

Once these labels have been placed it is difficult to get them removed.

Which leads us to the conclusion that the best state of affairs is to avoid getting them stuck to us in the first place.

But why are reliability and credibility in tension?

The Project Manager's Dilemma

The situation Mei finds herself in at the opening of this post places us in the right context - how long do we tell someone it will take us to complete a task?

Simple question.

Simple answer? Not so much.

Estimates of time to completion are critical factors during the project management process. Many of the project manager's tools - Gantt charts, critical paths, completion timetables, schedules, bottleneck flags, milestones, etc. - rely on these estimates for the various tasks that make up the project.

Consequently, project managers (such as Tuck Wallace, if we view the budget as a project) are obsessed with the status of task completion and the timing of it.

According to Goldratt, if we say it will take 10 days and we get it done in 2, we will lose credibility. If it happens a couple of times, people receiving our estimates will begin to shade them back in their minds - "the last few times they said 10 and it took them 2, so when they say 5 this time it will probably take them 1"

In this situation, even if the person does it in 3, they will be considered late since the project manager was expecting 2, even though it was within the estimate of 5 that was given!

Similarly, if we say that it will take us 5 hours to complete the task, and it takes us 10, then we will be considered unreliable. "Those guys are always late" "We can never believe their estimates"

One of the wrinkles in this process is that the estimate range is not normal, but skewed. Time cannot be negative ("it will take us -2 hours"), so rather than being able to think about things from an evenly balanced, plus or minus, viewpoint we are faced with an unbalanced situation.

The problems will all occur on only one end of the range. If things go well, the time doesn't change that much. But if one thing goes bad, and then another, and then another, then the time to completion is dramatically increased.

Lognormally Speaking...

This skewness in results means we can picture this estimation process using the lognormal curve.

Figure A
Lognormal Time Estimates

Figure A shows a lognormal curve based on a normal distribution with a mean of 1 and a standard deviation of 1. There are several items of interest in this. Let's say that this distribution is our estimate of time to complete a task.

The median value, 2.72 days, is the "pure coin flip" outcome. There is a 50% chance of being higher and 50% of being lower.

Do we provide this estimate?

Under Goldratt's paradigm, we essentially seal our fate one way or the other. We have a 50% chance of not being credible, and a 50% chance of being unreliable - neither a very good outcome!

The average of this distribution is 4.48 days. The fact that the average is so much higher than the median reflects the fact that there is a long tail on the right side of the distribution but not on the left. Things can only be so good, we can get the job done this instant but not any sooner, but if things go bad it can take a really, really, really long time.

Do we use the average as our estimate? In 69% of the cases, we will be below this estimate (and hence not credible in the future).

The shaded regions of Figure A show the values of the bottom 20% and top 20% of the outcomes. There is a 20% chance we will take less than 1.17 days to complete this task. We could say it will take at least 1.17 days and be right 80% of the time, though this merely provides "a floor" for our estimate, and does not give the project manager anything meaningful to put into their software for task completion.

We could also say it will probably take less than 6.31 days, and likely come inside that mark. With the wide range of possible outcomes significantly below that, our credibility factor will again be at issue if we use this approach.

Finally, we could use the bottom and top to establish a range - "In the majority of cases, we will get this done somewhere between 1.17 and 6.31 days". We would be within this range over half the time, so the odds are slightly in our favor of being credible.
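All of the figures quoted above come straight from the lognormal distribution with the Figure A parameters, and can be checked in a line or two of R each:

    m <- 1; s <- 1                          # meanlog and sdlog from Figure A
    qlnorm(0.5, m, s)                       # median: ~2.72 days
    exp(m + s^2 / 2)                        # mean: ~4.48 days
    plnorm(exp(m + s^2 / 2), m, s)          # ~0.69 chance of finishing before the mean
    qlnorm(c(0.2, 0.8), m, s)               # ~1.17 and ~6.31 days, the shaded 20% tails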

The big problem with the range approach is that the people hearing it do not like it. A single number needs to go into the software, or a significant event or task needs to be scheduled for a certain date.

An example of this would be booking a conference room at a hotel for the project team after a certain milestone has been completed. They simply can't call up the hotel and say "We'd like to reserve a conference room, can we come sometime between this Monday and next?"

So instead, upon hearing a range, the person gets frustrated and exasperated and says "Just tell me the answer!" and, as they are saying that, the 'thought bubble' running through their mind goes "they are trying to avoid being pinned down to something, they do not want to be held accountable, therefore they must be a slacker. I must continuously watch this slippery character."

Again, not good for our "brand" no matter if our estimate is correct or not.

The "Solution"?

There is a common way to avoid this dilemma.

Step 1: We determine an acceptable "risk" of not meeting the estimate and provide that amount. In our Figure A example, let's say we want to have 80% confidence, so we say it will take us 6.31 days.

Step 2: we do the work.

Step 3: if we get our work done before our estimate (which will happen 80% of the time), we wait to deliver it at some point close to the estimate we provided.

In essence, we establish enough "cushion" in our estimate in order to protect ourselves most of the time. In this way we avoid being labeled unreliable and uncredible!

Changing Perspective

How does our ideal solution look from the organization's point of view?

If we assume our project task is one of ten that make up the project, and all the project steps have the same distribution as Figure A, then there is a lot of waiting time that has been "baked in".

For our task, which we will deliver in 6.31 days, half of the time we will have gotten it done within 2.72 days, so there are just about 3 1/2 days of waiting. Multiply that figure by 10 for the number of steps, and our project could be done a month sooner!

For those of you who have read Cholesky to the Rescue, you might remember that when we combine statistical distributions the standard deviation of the combined outcome is lower than the sum of the individual standard deviations, so perhaps it is not as bad as all that.

In order to compare outcomes between a "deliver when finished" strategy and the "cushion" strategy, we simulated this ten-step process in R, assuming in the first that tasks were delivered as completed and in the second that they were delivered in 6.31 days unless they took longer. We further assume that all steps are uncorrelated (thus no Cholesky required).
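A compact version of that simulation is sketched below; the exact numbers will differ from Figures B and C under a different random seed, but the roughly 30-day gap between the two strategies should persist.

    set.seed(1)
    n_projects <- 10000; n_tasks <- 10
    cushion <- qlnorm(0.8, 1, 1)                   # ~6.31 days promised per task

    task_days <- matrix(rlnorm(n_projects * n_tasks, 1, 1), ncol = n_tasks)
    deliver_when_finished <- rowSums(task_days)                 # hand over each task as soon as it is done
    cushion_strategy      <- rowSums(pmax(task_days, cushion))  # hold finished work until the promised date

    summary(deliver_when_finished); sd(deliver_when_finished)
    summary(cushion_strategy);      sd(cushion_strategy)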

Figure B
Days to Completion
Probability of completion of the project. Days to complete the project are consistently higher under the Cushion Strategy except at the very extreme upper end.

Figure B shows the outcome of this simulation by plotting completion times in sorted order. Convergence of the two lines (i.e. when the completion times are similar) occurs less than 1% of the time, and in fact the Cushion Strategy is never less than the Deliver When Finished approach.

Figure C shows the R output of the statistics for project completion times. The difference between the two is about 30 days whether we use the median or the mean as the average.

Figure C
Days to Completion Comparison
The "Cushion Strategy" is about 30 days more than the "Deliver When Finished" approach when comparing on both median average and mean average basis.

The standard deviation of the series is slightly lower for the Cushion approach, which is not at all surprising since the vast majority of the time each task has a constant 6.31-day completion time. The standard deviation of the Deliver When Finished approach is 18 days, which is much less than the 5-day standard deviation for each individual task multiplied by 10.

Thus, from the organization's point of view, this cushioning adds waste and inefficiency into its performance.

Budget, Shmudget

We have seen how the act of adding cushion, essential to the individual's political survival, adds to organizational inefficiencies.

It is financially inefficient as well.

Why?

We can 'transpose' the dynamics of the time to project completion estimates just discussed to budget estimates. When using the 'transpose' term we are thinking about it in a musical context - if we first sing "Do Re Mi" in the key of C, starting at that note on a piano, we can then sing it in a different key, such as G, starting at that point on the piano, and the melody is still recognizable as "Do Re Mi" (just a little higher or lower). This occurs because the steps between the notes are exactly the same no matter where it is played - we have merely started the sequence at a different point on the piano.

Likewise, the steps in the budgeting process are similar to the project time estimation process in a number of ways:

    ::From a cost perspective the numbers cannot go below 0, meaning our distribution will be lognormal or otherwise skewed.
    ::Since there are generally consequences to not hitting budget - such as low bonuses, loss of jobs, etc. - people are inclined to 'cushion' in order to prevent these downsides.
    ::There are the same negative reputational consequences - loss of credibility and reliability. Once we deliver 10% under our budget for the year, our budget estimates for the next year and every year thereafter get shaved back from our original proposals.
    ::Life is a lot harder once we go over budget. After we break through that threshold, every additional expense is intensely analyzed, reviewed and questioned - "Hey, you don't really need to read the Wall Street Journal, do you?"

Figure D
Budget Estimates

Figure D shows a lognormal curve for a potential budget scenario. This curve was created with a lognormal average of ♢ 10 million (for new readers the ♢ symbol stands for Treasury Cafe Monetary Units, or TCMU's, a currency freely exchangeable with any other currency at any exchange rate of your choosing) and a standard deviation of ♢ 2 million.

This curve is less skewed than the labor estimate one, with one of the results being that the mean and median are much closer together (in a normal curve mean and median are the same), but the skewness is still there. From the median to the lower 20% is a difference of about ♢ 1.5 million, while to the top 20% it is ♢ 1.8 million.
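Those gaps can be checked directly. The only assumption in the sketch below is the parametrization: the ♢ 10 million and ♢ 2 million are treated as the arithmetic mean and standard deviation of the lognormal curve and converted to meanlog/sdlog accordingly, which reproduces the quoted figures.

    mean_b <- 10; sd_b <- 2                               # in millions of TCMU's (assumed parametrization)
    sdlog   <- sqrt(log(1 + (sd_b / mean_b)^2))
    meanlog <- log(mean_b) - sdlog^2 / 2

    med <- qlnorm(0.5, meanlog, sdlog)                    # median: ~9.8
    med - qlnorm(0.2, meanlog, sdlog)                     # ~1.5 from the median down to the bottom 20%
    qlnorm(0.8, meanlog, sdlog) - med                     # ~1.8 from the median up to the top 20%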

And this is part of the lognormal lesson - the more cushion we need, the further out the curve we need to go to get it, because of the asymmetry on that side.

Rather than simulate this as 10 equal steps, we can think about this in terms of proportion. If we are cushioning at the 80% level, we have about ♢ 1.8 million. This is about 18% of the median. We can then apply this to the total budget - if it's ♢ 100 million we have ♢ 18 million of cushion, if it's ♢ 500 million we are likely to have about ♢ 90 million.

That is a lot of cushion!

Furthering the problem is that cushion is unlikely to be spent to serve major organizational objectives. Since the point of cushion is to buffer our margin of error, we can't let others know it's there, otherwise we are back to the original unreliable / uncredible dilemma. So if we are spending cushion it cannot be anything too noticeable. Objectives such as share buybacks, new growth investments, etc. do not benefit from this approach. Things such as Wall Street Journal subscriptions do.

From the CFO's point of view, this dynamic translates to the external communication function as well. The market's reaction to earnings surprises also reflects an asymmetric tendency, where negative ones are reacted to much more strongly than positive ones (for further info see this research paper by Skinner and Sloan at the University of Michigan). We can interpret this using Figure D - "if they cushioned, things must be so bad they wound up in the shaded regions".

What Can We Do?

It is tempting to say "let's get rid of the budget" as a response to the above. However, since a lot of organizations rely on budgets as part of their governance process, this is not always a realistic alternative.

Part of the difficulty here is that by adding these cushions we are in some respects lying. We believe we will deliver things inside these numbers the vast majority of the time. And, should we be successful at that, we are going to pretend we didn't so far as anyone else knows.

So what other things might we be able to do?

Blame it on the model - if we use a model to calculate our estimates, and we let everyone know it, sometimes we are able to attribute estimate errors to the model usage. "The Black Scholes formula really did not do a great job of forecasting office copier expense this year". This is similar to the division President attributing the Net Income shortfalls to "Allocated Overhead", which is based on a complex methodology nobody understands.

Tolerate ambiguity and uncertainty - underlying the topic of this post is the fact that we need to provide estimates that people can rely on. This is fundamentally impossible, since none of us can predict the future. The more uncertainty that we can tolerate, the more we can accept ranges of possible outcomes without needing to know the answer. After all, the only certainties in life are death and taxes, right? So in the case of budgets and projects, just suck it up and deal with the fact that there is uncertainty.

Don't tie negative events to the estimation process - if going over budget means we do not get a bonus, that budget will contain more cushion. The more we separate the budget process from the organization's carrots and sticks, the less need there will be to add cushion. We can have more open, frank and candid dialogues about the drivers of the numbers, which ultimately are the items that need to be managed.

While the actions above will work more often than not, by far the best thing we can do is:

Establish a great working relationship with others - budget and project estimates are few and far between in the grand scheme of things. If we demonstrate reliability and credibility in our ongoing relationships, day in and day out - with our bosses, those upstream and downstream in our workflows, and everywhere else that we can - then an occasional blip here and there, for a good reason, will be forgiven because of the strong relationships we have established. However, a relationship must be reciprocated. We can have a great relationship with someone, but if that someone will 'throw us under the bus' when it serves their interests, then the relationship can only be developed so far.

Take the time to explain - if we sit down with the other person and 'step them through' the thought process and modeling that have led us to the conclusion we've arrived at, we provide them a number of opportunities: appreciation of the seriousness with which we have responded to their request; the ability to ask questions about the process, the information generated and the results; and a tangible experience of the train of thought that we ourselves have experienced. Because this has occurred, they are in a much better position to understand why things might come in better or worse than originally estimated. They no longer interpret the event in terms of our credibility or reliability, because they had originally come to the same conclusion themselves.

Key Takeaways

In order to be impactful, we need to maintain a reputation of credibility and reliability within our organization. Tasks that require estimates, such as projects and budgets, are ones that can potentially undermine this reputation. Understanding the skewed nature of the estimation process helps us manage these potential impacts.

Questions
    ::What are actions you have been able to take that made your estimates easier to deliver?
    ::What suggestions would you give to Mei to help her answer Tuck's question?

Add to the discussion with your thoughts, comments, questions and feedback! Please share Treasury Café with others. Thank you!

Friday, January 31, 2014

Cholesky To The Rescue!

Gadil Nazari hit the speaker button and dialed. He looked back at his computer screen while the phone's ring played through the speaker.

"This is Mei"

"Ah Mei, I'm so happy you're in. This is Gadil in Engineering. How are you today?"

"Well, all the kids have a touch of the flu, but with the amount of soup we have, they'll be better in a jiffy! Besides that things are going really well. How are you?"

"Good, good"

"And Jana? Was she able to finish her business plan?"

Gadil's wife had recently put together a proposal which she was going to vet with some local venture capital firms. A real entrepreneur's entrepreneur. He liked that Mei had remembered.

"Yes! She got a lot of great feedback and decided to pivot a bit and is now prototyping some of the new concepts." Gadil decided to shift the conversation back to the task at hand. "Listen, the reason I'm calling is that we got a consultant's proposal for a project which has some high level simulations. Would you be able to review it? I think your input would help us make sure we are understanding what we are getting."

"Yes, absolutely! I love to see what others are doing in that field. Do you have something you can send to me or would you like to meet later today or tomorrow?" Mei asked.

Gadil moved his hand to the computer's keyboard and hit 'send'. "I just emailed you their presentation. Once you've looked at it can you give me a call and we can talk about how to proceed?"

"Yes, sounds good Gadil. I'll look at it later this morning and get back to you. Talk with you soon!" she said cheerfully as she rung off.

When we perform a Monte Carlo simulation using more than one variable, we need to account for the interplay of these factors during the simulation process.

One means to do this, which we have utilized in prior posts (see Mei's Monte Carlo Adventure or Should You Rebalance Your Investment Portfolio?), is to use the Cholesky process.

Who the heck is this Cholesky guy and what process did he develop?

The Multi-Variable Problem

One of the most common statistical distributions we simulate is the standard normal distribution. Random draws from this will have an average of 0 and a standard deviation of 1 - nice, easy numbers to work with.

Figure A
The Standard Normal Distribution
The standard normal curve has a distinctive bell shape. The average of 0 is the most likely occurrence, and the likelihood decreases as we travel further from it.

The shape of the normal distribution's probability density function (stat speak for "what are the odds a certain number shows up?") is the bell curve, which is shown in Figure A.
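
As a quick illustration (a minimal sketch of my own, not code from the post's download files), we can draw from the standard normal in R and confirm that the draws behave as described:

    # one million standard normal draws
    set.seed(42)                # for reproducibility
    draws <- rnorm(1e6)
    mean(draws)                 # close to 0
    sd(draws)                   # close to 1
    hist(draws, breaks = 100)   # the bell shape of Figure A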

My sons and I are in a program called Indian Guides (a program promoting father-son relationships), and our 'tribe' recently participated in a volunteer activity for the Feed My Starving Children organization.

Our task that night entailed filling bags with a concoction of vitamins, vegetables, protein, and rice. These bags were then sealed, boxed, and palletized, ready to be shipped the following day to any of a number of locations around the world the charity serves (Haiti, the Philippines, Somalia, etc.).

Figure B
Helping to Feed Starving Children
Indian Guides filling 'Manna Packs' to be shipped around the world to feed those in need.

Let's say that at each step of this production process there was a 1 gram standard deviation around the target amount of the ingredient. In other words, a 1 gram standard deviation each for vitamins, vegetables, protein, and rice.

What is the standard deviation of the package?

Under purely additive conditions, this would be 4 grams. However, the combination of the four ingredients produces something much less than that. Figure C shows the statistics for this process. While all the ingredients have means close to 0, as does the total bag, and while the standard deviations of the ingredients are each approximately 1, the standard deviation of the total bag is only about 2, not 4! The reason is that for independent variables it is the variances that add, not the standard deviations, so the total's standard deviation is the square root of 1 + 1 + 1 + 1, which is 2.

Figure C
Bag Fill Statistics
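
We can reproduce this result with a quick R sketch (my own approximation of the setup behind Figure C, not the post's original code):

    set.seed(1)
    n <- 1e6
    # independent 1 gram deviations for each ingredient
    vitamins   <- rnorm(n)
    vegetables <- rnorm(n)
    protein    <- rnorm(n)
    rice       <- rnorm(n)
    bag <- vitamins + vegetables + protein + rice   # total deviation per bag
    sd(bag)   # about 2: variances add (1 + 1 + 1 + 1 = 4), so the sd is sqrt(4) = 2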

In order to understand why this is the case, we can think of what happens with dice. If we roll one die, there is an equal 1/6 probability of each number coming up. This is what is called the uniform distribution. If we roll two dice, while each one of them has a 1/6 chance of turning up a certain number, the sum of the two together is no longer uniformly distributed. There is a much greater probability of rolling a 7 than a 2 or a 12, because there are many more ways to make a 7 (3+4, 4+3, 2+5, etc.) than a 2 (1+1 only).

Figure D
Dice Probabilities
While each individual die has an equal probability for each of its outcomes (the green and red), the combination (gold) is no longer uniform.

Figure D shows a representation of these probabilities.
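
To make the dice example concrete (a small sketch of my own, not from the post), we can enumerate all 36 equally likely outcomes of two dice in R and count how often each sum appears:

    sums <- outer(1:6, 1:6, FUN = "+")   # all 36 combinations of two dice
    table(sums) / 36                     # P(7) = 6/36, while P(2) = P(12) = 1/36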

The same phenomenon occurs with our simulated normal distributions. If we imagine bell-shaped curves side by side, the combined curve will be like the dice, where there is a greater probability of middle values and less of extreme ones; thus the combined standard deviation of our 4 standard normal curves is only 2 instead of 4.


Enter Correlation

Mei sat across from Gadil in his office.

"The consultant analysis is in some ways inconsistent with our experience" Gadil explained. "And we are not sure why. The are convinced that they have modeled the correct parameters and therefore the results are the results"

"Gadil, is our experience that things fluctuate more widely or less?" Mei asked.

"Oh, definitely more widely" he replied.

"I see. I wonder if we could talk a little about these different variables and what they mean"

Up to this point we have considered the fluctuation of each of our variables to be independent, which means each one varies of its own accord without any consideration of the other, just as when one of our dice shows a 2, the other is still equally likely to be any number between 1 and 6. The second die does not care that the first one came up with a 2 - it's thinking on its own!

What happens when our variables are no longer "independent", but the one impacts the other?

We can think of common situations where this occurs. The chance that we will get in a car accident is influenced by how good a driver we are. Under normal conditions, the 'how good a driver we are' factor will dominate. But when the weather is bad - snow, ice, rain - our chances of getting in an accident increase. Our overall chance of an accident is correlated with the weather conditions.

Correlation is the statistician's term for 'one thing moves in relation to another'. However, we must be careful with correlation because some people confuse it with causation. Two or more things may vary in relation to each other, but it is not necessarily the case that one is the cause of the other. There are five reasons why two factors may be correlated, only one of them being that A caused B (see this Wikipedia entry for more on this).
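
As a small illustration (my own sketch, with made-up data), R's cor() function measures this co-movement directly:

    set.seed(3)
    x <- rnorm(1000)
    y <- 0.6 * x + rnorm(1000, sd = 0.8)   # y is built to co-move with x
    cor(x, y)                              # roughly 0.6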

For our simulation purposes, we want to ensure that we create the correct correlation without modeling causation. We are able to accomplish this through the Cholesky decomposition.

The Cholesky Decomposition

Figure E
Correlation Matrix and Notation
The correlation matrix is shown with the numbers and the symbols

Andre Cholesky was a French mathematician who developed the matrix decomposition for which he is known as part of his surveying work in the military.

The 'decomposition' he created comes from the insight that a matrix, such as C, can be broken down into two separate matrices: T (a Lower Triangular matrix) and T transposed (transposing a matrix means swapping its rows and columns, which in this case results in an Upper Triangular matrix). Let's unpack this very dense definition.

Let's say we have a correlation matrix with 4 variables from our Feed My Starving Children process. We can identify the components of the matrix by using row and column notation in the subscripts. Figure E shows our correlation matrix in numerical and symbolic form.

Figure F
Triangular Matrices
The Upper Triangular Matrix (top) and Lower Triangular Matrix (bottom) in symbolic form

Triangular matrices have values in one part of the matrix and 0's in the other, thus creating a triangular pattern. Figure F shows these symbolically.

In the Cholesky decomposition, we can break down our correlation matrix into a Lower Triangular Matrix and an Upper Triangular Matrix with transposed values. In other words, the value in row 2, column 1 in the Lower Triangle becomes the value in row 1, column 2 in the Upper Triangle. You can think about these matrices as being similar to square roots of numbers.

To show the entire decomposition then, we have the matrix equation shown in Figure G.

Figure G
Cholesky Matrix Equation
The Correlation matrix is the product of a Lower Triangular Matrix multiplied by the same values transposed into an Upper Triangular Matrix

Figure H
Cholesky Factors' Formulas
On-diagonal factors (where the row equals the column) use one equation, while the other factors use a second. Since it is a Triangular matrix, the remaining part is simply 0. Inputs to the equations are either the Correlation matrix, C, or the Cholesky Triangular Matrix, T.

The elements for each part of the Lower Triangular Matrix can be calculated using the formulas in Figure H. The equations vary depending on whether the element is "on the diagonal" or not.
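
For readers who prefer the formulas written out, the standard Cholesky element formulas that Figure H depicts can be stated as follows (my own rendering, using the post's C for the correlation matrix and T for the Lower Triangular factor; Figure H may arrange the terms slightly differently):

    T_{jj} = \sqrt{ C_{jj} - \sum_{k=1}^{j-1} T_{jk}^{2} }

    T_{ij} = \frac{1}{T_{jj}} \left( C_{ij} - \sum_{k=1}^{j-1} T_{ik} T_{jk} \right) \quad \text{for } i > j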

The website Rosetta Code has code for the calculation of these factors in a number of languages (Python, Perl, VBA, etc.). I made a spreadsheet that lays out both the covariance and Cholesky matrices based on the inputs of weights, standard deviations and correlations, which you can get here.

In R (an open-source statistical software package) the decomposition can be calculated using the chol() function. However, in order to ensure I could calculate the equations without assistance, and to practice my R skills, I also programmed the formulas "by hand". If you would like that code, along with the rest of the code used in this post, you can get it here. As always, good analytic practice requires that you check your work, and I verified that the "by hand" formula did indeed match the chol() function's results.
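
As a hedged sketch of that kind of check (my own code, not the downloadable file, and using an illustrative correlation matrix rather than the one in Figure E), we can compute the factors by hand and compare them to R's built-in chol():

    # illustrative 4 x 4 correlation matrix for the four ingredients
    C <- matrix(c(1.0, 0.3, 0.2, 0.1,
                  0.3, 1.0, 0.4, 0.2,
                  0.2, 0.4, 1.0, 0.3,
                  0.1, 0.2, 0.3, 1.0), nrow = 4, byrow = TRUE)

    # "by hand" Lower Triangular factor using the Figure H style formulas
    chol_by_hand <- function(C) {
      n  <- nrow(C)
      Tm <- matrix(0, n, n)
      for (j in 1:n) {
        for (i in j:n) {
          s <- sum(Tm[i, seq_len(j - 1)] * Tm[j, seq_len(j - 1)])
          if (i == j) {
            Tm[i, j] <- sqrt(C[i, i] - s)      # on-diagonal formula
          } else {
            Tm[i, j] <- (C[i, j] - s) / Tm[j, j]  # off-diagonal formula
          }
        }
      }
      Tm
    }

    T_lower <- chol_by_hand(C)
    all.equal(T_lower, t(chol(C)))        # chol() returns the Upper Triangle, so transpose it
    all.equal(T_lower %*% t(T_lower), C)  # reassembles the correlation matrix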

Now What?

Mei was seated in a conference room overlooking the city below.

"How do you control for the fact that mixing distributions lowers the standard deviation?" she asked the consultants in the room.

"We don't have to do that because the factors are independent." one of the consultants, George, replied. "Each distribution stands on its own."

"Perhaps, but then why is it that the results do not match up with our data?" she continued.

Now that we have a Cholesky matrix, we can continue with our simulation process.

Matrix multiplication requires that the first matrix has the same number of columns as the second matrix has rows. The resulting matrix will be one with the same number of rows as the first and the same number of columns as the second. Figure I shows this pictorially.

Figure I
Basic Matrix Multiplication
A matrix with m rows and n columns multiplying a matrix with n rows and c columns results in an m row, c column matrix.

With a row of random numbers (4 in our Feed My Starving Children example), we will have a 1 x 4 matrix for the variables, a 4 x 4 Cholesky matrix, and an output matrix of 1 x 4. Figure J is an example of one calculation using this method. Notice that the Lower Triangular Cholesky matrix we created has been transposed so that it is Upper Triangular.

Figure J
Simulation Multiplication Example
A 1x4 vector of random numbers is multiplied by one of the Cholesky columns (a 4x1 vector), resulting in a single value (i.e. a 1x1 matrix) for the new variable.

If we calculated the Cholesky values using the correlation matrix, the resulting values (we can call them "Adjusted Random Variables") are then multiplied by each variable's standard deviation and added to the mean for that variable, which completes the result for one simulation. The spreadsheet I mentioned earlier has an example of this calculation for 1000 random variables.

If we calculated the Cholesky values using the covariance matrix, then the standard deviations have already been "scaled in", so we merely need to add the mean to the Adjusted Random Variable.
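
Putting these steps together in R (a minimal sketch of my own, with illustrative means, standard deviations and correlations rather than the post's Figure K inputs):

    set.seed(7)
    n_sims <- 1e5

    # illustrative target correlation matrix, means and standard deviations
    C   <- matrix(c(1.0, 0.3, 0.2, 0.1,
                    0.3, 1.0, 0.4, 0.2,
                    0.2, 0.4, 1.0, 0.3,
                    0.1, 0.2, 0.3, 1.0), nrow = 4, byrow = TRUE)
    mu  <- c(10, 20, 30, 40)
    sds <- c(2, 1, 3, 1)

    Z <- matrix(rnorm(n_sims * 4), ncol = 4)  # uncorrelated standard normals
    U <- chol(C)                              # Upper Triangular Cholesky factor
    A <- Z %*% U                              # the "Adjusted Random Variables"

    X <- sweep(A, 2, sds, "*")                # scale by each variable's sd ...
    X <- sweep(X, 2, mu, "+")                 # ... and add each variable's mean

    round(cor(X), 2)   # close to the target correlation matrix C
    sd(rowSums(A))     # higher than the roughly 2 we saw in the uncorrelated case

Had we built the Cholesky factor from a covariance matrix instead, the scaling by the standard deviations would be unnecessary, as noted above.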

Figure K shows the results for each variable using R for the simulation, along with the correlation matrix. Note that these values are similar (as they should be, as this is the whole point!) to the correlation matrix in Figure E.

Figure K
Simulation Comparison
First summary is the Standard Normal Variables, whose means are close to 0 and standard deviations are close to 1. The second summary is the Adjusted Random Variables. These means are also close to 0 and standard deviations close to 1. Because of the correlation impact, however, the Totals differ between the two: the standard deviation of the second group is higher (2.58... vs. 1.93...), even though the individual elements have essentially the same means and standard deviations! The first correlation matrix shows the Standard Normal Variables to be uncorrelated, since off-diagonal elements are near 0. The second correlation matrix shows the simulated results for the Adjusted Random Variables, which are close to the values of the third matrix, the correlation matrix we used to construct the Cholesky factors.

What Can We Do?

The Cholesky process allows us to model correlation patterns without disrupting the statistical characteristics of each of the individual elements. How can we use this in the 'real world'?

Improve Modeling Accuracy of Processes with Multiple Variables - rather than accept as fact that the variables are uncorrelated, as the consultants did in the vignette in this post, we can use our data to ensure that any correlations that are present are factored into the model as it is developed.

Establish Non-Conventional Probability Patterns - given the ability to create correlated variables, we can use multiple variables to create probability patterns that are unique. If we want 3 "humps" or 5 in our pattern, we can create these by building up several variables and tying them together via correlations. The techniques to do this will need to be discussed in another post.

Solve Data and Mathematical Problems - the Cholesky decomposition is quite similar to taking the square root of a matrix. If we are presented with a symmetric, positive-definite matrix - a covariance matrix, for example - and need to solve an equation involving it, the decomposition lets us do so efficiently with two triangular solves, as sketched below.
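
Here is a hedged illustration of that last point (my own sketch, not from the post): solving C x = b using the Cholesky factor and two triangular solves in R:

    # illustrative symmetric, positive-definite matrix and right-hand side
    C <- matrix(c(1.0, 0.3, 0.2, 0.1,
                  0.3, 1.0, 0.4, 0.2,
                  0.2, 0.4, 1.0, 0.3,
                  0.1, 0.2, 0.3, 1.0), nrow = 4, byrow = TRUE)
    b <- c(1, 2, 3, 4)

    U <- chol(C)                       # C = t(U) %*% U
    y <- forwardsolve(t(U), b)         # solve the lower triangular system t(U) y = b
    x <- backsolve(U, y)               # solve the upper triangular system U x = y
    all.equal(as.vector(C %*% x), b)   # TRUE: x solves C x = b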

Key Takeaways

The Cholesky process is used to ensure that our simulation of multiple variables exhibits our desired pattern of correlation. It is also a tool to create model parameters and to solve data/mathematical problems.

You May Also Like

Usage of the Cholesky matrix in an investment portfolio setting

Mei's Monte Carlo Adventure

Should You Rebalance Your Investment Portfolio?

Downloads of tools used in this post

Excel Spreadsheet

R Code

Questions
    ::Do you have a story about using the Cholesky decomposition in a model creating situation?
    ::What other methods have you seen to account for correlation in developing models?
    ::Can the Cholesky process be used for situations where the distributions are not normal?

Add to the discussion with your thoughts, comments, questions and feedback! Please share Treasury Café with others. Thank you!