Monday, February 17, 2014

Why Projects and Budgets Have Fluff - a Lognormal Explanation

Tuck Wallace, the company's CFO, was calling her. She could see it on the Caller ID.

"This is Mei," she answered. Protocol dictated that you never acknowledged that you knew who was calling, even if you did.

"Hi Mei, it's Tuck Wallace," he said, also following protocol (he knew everyone had Caller ID too!).

"Hi Tuck, sure hope you weren't caught in that massive freeway tie-up due to the bad weather."

"No, thank goodness. I come in from south of the city and the problems today were mostly north of town. Were you?" he asked.

"I'm usually in before there's too much traffic, so it was a little slow today but manageable. What can I do for you?"

"How soon do you think you'll have a complete set of numbers for the budget?"

Mei laughed. "Tuck, I thought you knew better than to ask a question like that!"

The lognormal curve, one of a class of asymmetric statistical distributions, has many uses, but only rarely do we get to use it to understand our company's communication and behavior.

The Twin Tension

Eliyahu Goldratt, a pioneer of the management system known as the Theory of Constraints, says that managers need to guard their reputation within the organization against two things - being considered unreliable and being considered uncredible.

The consequences of either label are dire. If we are unreliable, then when the big, skill-developing, challenging, and high-impact projects come along, we will not get picked for them.

If we are not credible, then we are in a situation similar to the "Boy Who Cried Wolf" - we are not believed even when we should be believed, because our reputation has preceded us. Thus, our insights, advice, and analysis will be ignored during the company's decision making process, eliminating our ability to impact the organization and/or our industry in any meaningful way.

Once these labels have been placed it is difficult to get them removed.

Which leads us to the conclusion that the best state of affairs is to avoid getting them stuck to us in the first place.

But why are reliability and credibility in tension?

The Project Manager's Dilemma

The situation Mei finds herself in at the opening of this post places us in the right context - how long do we tell someone it will take us to complete a task?

Simple question.

Simple answer? Not so much.

Estimates of time to completion are critical factors during the project management process. Many of the project manager's tools - Gantt charts, critical paths, completion timetables, schedules, bottleneck flags, milestones, etc. - rely on these estimates for the various tasks that make up the project.

Consequently, project managers (which is what Tuck Wallace is, if we view the budget as a project) are obsessed with the status and timing of task completion.

According to Goldratt, if we say it will take 10 hours and we get it done in 2, we will lose credibility. If it happens a couple of times, people receiving our estimates will begin to shade them back in their minds - "the last few times they said 10 hours and it took them 2, so since they said 5 this time it will probably take them 1."

In this situation, even if the person does it in 3, they will be considered late since the project manager was expecting 1, even though it was within the estimate of 5 that was given!
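The mental shading described above can be sketched in a few lines. This is only an illustration: the function name `shaded_estimate` and the rule of scaling by the average past ratio are our own assumptions about how a listener might discount, not something Goldratt specifies.

```python
def shaded_estimate(history, new_estimate):
    """Scale a new estimate by the average actual/estimated ratio seen so far.

    history is a list of (estimated, actual) pairs. The averaging rule is a
    hypothetical model of how a project manager might discount our numbers.
    """
    if not history:
        return new_estimate  # no track record yet, so take the number at face value
    ratios = [actual / estimated for estimated, actual in history]
    return new_estimate * sum(ratios) / len(ratios)

# "The last few times they said 10 hours and it took them 2."
history = [(10, 2), (10, 2)]
expected_by_listener = shaded_estimate(history, 5)

print(f"We say 5 hours, the listener hears {expected_by_listener:.0f}")
```

Finishing in 3 hours is then inside our stated 5 but beyond what the listener has come to expect.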

Similarly, if we say that it will take us 5 hours to complete the task, and it takes us 10, then we will be considered unreliable. "Those guys are always late." "We can never believe their estimates."

One of the wrinkles in this process is that the estimate range is not normal, but skewed. Time cannot be negative ("it will take us -2 hours"), so rather than being able to think about things from an evenly balanced, plus or minus, viewpoint we are faced with an unbalanced situation.

The problems will all occur on only one end of the range. If things go well, the time doesn't change that much. But if one thing goes bad, and then another, and then another, then the time to completion is dramatically increased.

Lognormally Speaking...

This skewness in results means we can picture this estimation process using the lognormal curve.

Figure A
Lognormal Time Estimates

Figure A shows a lognormal curve based on a normal distribution with a mean of 1 and a standard deviation of 1. There are several items of interest in this. Let's say that this distribution is our estimate of time to complete a task.

The median value, 2.72 days, is the "pure coin flip" outcome. There is a 50% chance of being higher and 50% of being lower.

Do we provide this estimate?

Under Goldratt's paradigm, we essentially seal our fate one way or the other. We have a 50% chance of not being credible, and a 50% chance of being unreliable - neither a very good outcome!

The average of this distribution is 4.48 days. The average is so much higher than the median because there is a long tail on the right side of the distribution but not on the left. Things can only go so well: we can get the job done this instant but not any sooner. If things go badly, though, it can take a really, really, really long time.

Do we use the average as our estimate? In 69% of the cases, we will be below this estimate (and hence not credible in the future).
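For readers who want to check the median, the average, and the 69% figure, here is a minimal sketch using only Python's standard library. It assumes the Figure A parameters (underlying normal with mean 1 and standard deviation 1); the function names are ours.

```python
import math
from statistics import NormalDist

# Assumed parameters: the underlying normal has mean 1 and standard deviation 1.
MU, SIGMA = 1.0, 1.0

def lognormal_median(mu, sigma):
    """Median of a lognormal: exp(mu). Sigma does not affect it."""
    return math.exp(mu)

def lognormal_mean(mu, sigma):
    """Mean of a lognormal: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_cdf(x, mu, sigma):
    """P(X <= x) for a lognormal, via the normal CDF of log(x)."""
    return NormalDist(mu, sigma).cdf(math.log(x))

median = lognormal_median(MU, SIGMA)
mean = lognormal_mean(MU, SIGMA)
p_below_mean = lognormal_cdf(mean, MU, SIGMA)

print(f"median = {median:.2f} days")
print(f"mean   = {mean:.2f} days")
print(f"P(finish before the mean) = {p_below_mean:.0%}")
```

The gap between the two comes entirely from the `sigma ** 2 / 2` term, which is how the long right tail pulls the average up.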

The shaded regions of Figure A show the values of the bottom 20% and top 20% of the outcomes. There is a 20% chance we will take less than 1.17 days to complete this task. We could say it will take at least 1.17 days and be right 80% of the time, though this merely provides "a floor" for our estimate, and does not give the project manager anything meaningful to put into their software for task completion.

We could also say it will probably take less than 6.31 days, and likely come inside that mark. With the wide range of possible outcomes significantly below that, our credibility factor will again be at issue if we use this approach.

Finally, we could use the bottom and top to establish a range - "In the majority of cases, we will get this done somewhere between 1.17 and 6.31 days". We would be within this range 60% of the time, so the odds are slightly in our favor of being credible.
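The 20% and 80% cut-offs can be recovered the same way. A small sketch, again assuming the Figure A parameters and using a helper name (`lognormal_quantile`) of our own choosing:

```python
import math
from statistics import NormalDist

MU, SIGMA = 1.0, 1.0  # assumed parameters of the underlying normal

def lognormal_quantile(p, mu, sigma):
    """Value below which a fraction p of lognormal outcomes fall."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(p))

low = lognormal_quantile(0.20, MU, SIGMA)   # the "floor"
high = lognormal_quantile(0.80, MU, SIGMA)  # the 80% confidence figure
coverage = 0.80 - 0.20                      # share of outcomes inside the range

print(f"20th percentile = {low:.2f} days")
print(f"80th percentile = {high:.2f} days")
print(f"share of outcomes between them = {coverage:.0%}")
```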

The big problem with the range approach is that the people hearing it do not like it. A single number needs to go into the software, or a significant event or task needs to be scheduled for a certain date.

An example of this would be booking a conference room at a hotel for the project team after a certain milestone has been completed. They simply can't call up the hotel and say "We'd like to reserve a conference room, can we come sometime between this Monday and next?"

So instead, upon hearing a range, the person gets frustrated and exasperated and says "Just tell me the answer!" and, as they are saying that, the 'thought bubble' running through their mind goes "they are trying to avoid being pinned down to something, they do not want to be held accountable, therefore they must be a slacker. I must continuously watch this slippery character."

Again, not good for our "brand" no matter if our estimate is correct or not.

The "Solution"?

There is a common way to avoid this dilemma.

Step 1: We determine an acceptable "risk" of not meeting the estimate and provide that amount. In our Figure A example, let's say we want to have 80% confidence, so we say it will take us 6.31 days.

Step 2: We do the work.

Step 3: If we get our work done before our estimate (which will happen 80% of the time), we wait to deliver it at some point close to the estimate we provided.

In essence, we establish enough "cushion" in our estimate in order to protect ourselves most of the time. In this way we avoid being labeled unreliable and uncredible!
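Put as a rule, the three steps amount to "deliver at the later of the actual finish and the quoted figure". A minimal sketch, where the function name and the 9 day example are hypothetical and the 6.31 and 2.72 figures come from Figure A:

```python
def cushioned_delivery(actual_days, quoted_days):
    """Return (delivery day, idle days) under the cushion strategy.

    Work finished early is held back until the quoted figure;
    work that runs long is delivered as soon as it is done.
    """
    delivered = max(actual_days, quoted_days)
    return delivered, delivered - actual_days

QUOTE = 6.31  # the 80% confidence figure from Figure A

early = cushioned_delivery(2.72, QUOTE)  # a median outcome
late = cushioned_delivery(9.0, QUOTE)    # hypothetical bad outcome

print(f"finished in 2.72 days: delivered day {early[0]:.2f}, idle {early[1]:.2f}")
print(f"finished in 9.00 days: delivered day {late[0]:.2f}, idle {late[1]:.2f}")
```

The idle days never show up in anyone's status report, which is exactly the point of the strategy.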

Changing Perspective

How does our ideal solution look from the organization's point of view?

If we assume our project task is one of ten that make up the project, and all the project steps have the same distribution as Figure A, then there is a lot of waiting time that has been "baked in".

For our task, which we will deliver in 6.31 days, half of the time we will have gotten it done in 2.72 days or less, so there are roughly 3 and 1/2 days of waiting. Multiply that figure by 10 for the number of steps, and our project could be done about a month sooner!

For those of you who have read Cholesky to the Rescue, you might remember that when we combine statistical distributions that are not perfectly correlated, the standard deviation of the combined outcome is lower than the sum of the individual standard deviations, so perhaps it is not as bad as all that.
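To put a number on that effect: for independent tasks the variances add, so the standard deviation of the total grows with the square root of the number of tasks rather than with the number itself. A small sketch under the Figure A parameters, assuming ten independent tasks (names are ours):

```python
import math

MU, SIGMA = 1.0, 1.0  # assumed parameters of the underlying normal
N_TASKS = 10

def lognormal_sd(mu, sigma):
    """Standard deviation of a lognormal with the given underlying parameters."""
    variance = (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)
    return math.sqrt(variance)

task_sd = lognormal_sd(MU, SIGMA)
project_sd = math.sqrt(N_TASKS) * task_sd  # variances add when tasks are independent
naive_sd = N_TASKS * task_sd               # what we would get with perfect correlation

print(f"single task sd       = {task_sd:.2f} days")
print(f"ten independent tasks = {project_sd:.2f} days")
print(f"ten times one task    = {naive_sd:.2f} days")
```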

In order to compare outcomes between a "deliver when finished" strategy and the "cushion" strategy, we simulated this ten step process in R. In the first case we assumed tasks were delivered as soon as they were completed, and in the second that they were delivered at 6.31 days unless they took longer. We further assumed that the steps are uncorrelated (thus no Cholesky required).
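The original simulation was run in R and is not reproduced here. The following is a rough Python equivalent of the setup as described, offered as a sketch only: the trial count, the seed, and all names are our own choices, so the exact figures will differ somewhat from those in the R output.

```python
import math
import random
import statistics
from statistics import NormalDist

MU, SIGMA = 1.0, 1.0   # assumed parameters of the underlying normal
N_TASKS = 10           # ten sequential, uncorrelated tasks
N_TRIALS = 20_000      # number of simulated projects (our choice)
QUOTE = math.exp(NormalDist(MU, SIGMA).inv_cdf(0.80))  # about 6.31 days

def simulate(n_trials=N_TRIALS, seed=2014):
    """Return total project days under both strategies for each trial."""
    rng = random.Random(seed)
    finished, cushioned = [], []
    for _ in range(n_trials):
        tasks = [rng.lognormvariate(MU, SIGMA) for _ in range(N_TASKS)]
        finished.append(sum(tasks))                          # deliver when finished
        cushioned.append(sum(max(t, QUOTE) for t in tasks))  # hold until the quote
    return finished, cushioned

finished, cushioned = simulate()
gap = statistics.mean(cushioned) - statistics.mean(finished)

for name, totals in (("Deliver When Finished", finished),
                     ("Cushion Strategy", cushioned)):
    print(f"{name:22s} mean={statistics.mean(totals):6.1f} "
          f"median={statistics.median(totals):6.1f} "
          f"sd={statistics.stdev(totals):5.1f}")
print(f"mean days added by cushioning: {gap:.1f}")
```

Because each task is drawn once and then scored under both strategies, the cushioned total can never come out below the deliver-when-finished total for the same trial.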

Figure B
Days to Completion
Probability of completion of project. Days to complete the project are consistently higher under the Cushion Strategy, except at the very extreme upper end.

Figure B shows the outcome of this simulation by plotting completion times in sorted order. The two lines converge (i.e. the completion times are similar) less than 1% of the time, and in fact the Cushion Strategy is never faster than the Deliver When Finished approach.

Figure C shows the R output of the statistics for project completion times. The difference between the two is about 30 days whether we compare medians or means.

Figure C
Days to Completion Comparison
The "Cushion Strategy" takes about 30 days longer than the "Deliver When Finished" approach on both a median and a mean basis.

The standard deviation of the series is slightly lower for the Cushion approach, which is not at all surprising since the vast majority of the time each task has a constant 6.31 day completion time. The standard deviation of the Deliver When Finished approach is 18 days, which is much less than the 5 day standard deviation for each individual task multiplied by 10.
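The roughly 30 day gap in the means can also be checked without simulation. The sketch below uses the standard partial expectation of a lognormal to get the expected delivery time of one cushioned task, max(actual, quote), and scales it to ten tasks. The parameters are those of Figure A; the function name is ours.

```python
import math
from statistics import NormalDist

MU, SIGMA, N_TASKS = 1.0, 1.0, 10  # assumed Figure A parameters, ten tasks
STD_NORMAL = NormalDist()

def expected_cushioned_task(mu, sigma, confidence):
    """Expected value of max(X, quote), where quote is the given quantile of X."""
    z = STD_NORMAL.inv_cdf(confidence)
    quote = math.exp(mu + sigma * z)
    mean = math.exp(mu + sigma ** 2 / 2)
    # E[X; X > quote] = mean * Phi((mu + sigma^2 - ln quote) / sigma),
    # and (mu + sigma^2 - ln quote) / sigma simplifies to sigma - z.
    late_part = mean * STD_NORMAL.cdf(sigma - z)
    on_quote_part = confidence * quote  # early finishes are all held to the quote
    return on_quote_part + late_part

plain_project = N_TASKS * math.exp(MU + SIGMA ** 2 / 2)
cushioned_project = N_TASKS * expected_cushioned_task(MU, SIGMA, 0.80)
expected_gap = cushioned_project - plain_project

print(f"Deliver When Finished, expected days: {plain_project:.1f}")
print(f"Cushion Strategy, expected days:      {cushioned_project:.1f}")
print(f"expected days added by cushioning:    {expected_gap:.1f}")
```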

Thus, from the organization's point of view, this cushioning adds waste and inefficiency into its performance.

Budget, Shmudget

We have seen how the act of adding cushion, essential to the individual's political survival, adds to organizational inefficiencies.

It is financially inefficient as well.

Why?

We can 'transpose' the dynamics of the time to project completion estimates just discussed to budget estimates. When using the 'transpose' term we are thinking about it in a musical context - if we first sing "Do Re Mi" in the key of C, starting at that note on a piano, we can then sing it in a different key, such as G, starting at that point on the piano, and the melody is still recognizable as "Do Re Mi" (just a little higher or lower). This occurs because the steps between the notes are exactly the same no matter where it is played - we have merely started the sequence at a different point on the piano.

Likewise, the steps in the budgeting process are similar to the project time estimation process in a number of ways:

    ::From a cost perspective the numbers cannot go below 0, meaning our distribution will be lognormal or otherwise skewed.
    ::Since there are generally consequences to not hitting budget - such as low bonuses, loss of jobs, etc. - people are inclined to 'cushion' in order to prevent these downsides.
    ::There are the same negative reputational consequences - loss of credibility and reliability. Once we deliver 10% under our budget for the year, our budget estimates for the next year and every year thereafter get shaved back from our original proposals.
    ::Life is a lot harder once we go over budget. After we break through that threshold, every additional expense is intensely analyzed, reviewed and questioned - "Hey, you don't really need to read the Wall Street Journal, do you?"

Figure D
Budget Estimates

Figure D shows a lognormal curve for a potential budget scenario. This curve was created with a lognormal average of ♢ 10 million (for new readers, the ♢ symbol stands for Treasury Cafe Monetary Units, or TCMUs, a currency freely exchangeable with any other currency at any exchange rate of your choosing) and a standard deviation of ♢ 2 million.

This curve is less skewed than the time estimate one, with one of the results being that the mean and median are much closer together (in a normal curve mean and median are the same), but the skewness is still there. From the median to the lower 20% is a difference of about ♢ 1.5 million, while to the top 20% it is ♢ 1.8 million.
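These budget figures can be reproduced with a short sketch. It assumes that the ♢ 10 million and ♢ 2 million are the mean and standard deviation of the lognormal itself, which we then convert to the parameters of the underlying normal; the function name is ours, and amounts are in ♢ millions.

```python
import math
from statistics import NormalDist

def lognormal_params(mean, sd):
    """Underlying normal (mu, sigma) for a lognormal with the given mean and sd."""
    sigma_sq = math.log(1 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2
    return mu, math.sqrt(sigma_sq)

# Assumption: 10 and 2 describe the lognormal itself, in millions.
mu, sigma = lognormal_params(10.0, 2.0)
underlying = NormalDist(mu, sigma)

median = math.exp(mu)
p20 = math.exp(underlying.inv_cdf(0.20))
p80 = math.exp(underlying.inv_cdf(0.80))

print(f"median            = {median:.2f}")
print(f"median to low 20% = {median - p20:.2f}")
print(f"median to top 20% = {p80 - median:.2f}")
```

The same probability distance costs more on the upper side than on the lower side, even for this fairly mild curve.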

And this is part of the lognormal lesson - the more cushion we need, the further out the curve we need to go to get it, because of the asymmetry on that side.

Rather than simulate this as 10 equal steps, we can think about this in terms of proportion. If we are cushioning at the 80% level, we have about ♢ 1.8 million of cushion. This is about 18% of the median. We can then apply this to the total budget - if it's ♢ 100 million we have ♢ 18 million of cushion, if it's ♢ 500 million we are likely to have about ♢ 90 million.

That is a lot of cushion!

Furthering the problem is that cushion is unlikely to be spent to serve major organizational objectives. Since the point of cushion is to buffer our margin of error, we can't let others know it's there, otherwise we are back to the original unreliable / uncredible dilemma. If we are spending cushion, then, it cannot be on anything too noticeable. Objectives such as share buybacks, new growth investments, etc. do not benefit from this approach. Things such as Wall Street Journal subscriptions do.

From the CFO's point of view, this dynamic carries over to the external communication function as well. The market's reaction to earnings surprises also reflects an asymmetric tendency, where negative ones are reacted to much more than positive ones (for further info see this research paper by Skinner and Sloan at the University of Michigan). We can interpret this using Figure D - "if they cushioned, things must be so bad they wound up in the shaded regions".

What Can We Do?

It is tempting to say "let's get rid of the budget" as a response to the above. However, since a lot of organizations rely on budgets as part of their governance process, this is not always a realistic alternative.

Part of the difficulty here is that by adding these cushions we are in some respects lying. We believe we will deliver things inside these numbers the vast majority of the time. And, should we be successful at that, we are going to pretend we didn't so far as anyone else knows.

So what other things might we be able to do?

Blame it on the model - if we use a model to calculate our estimates, and we let everyone know it, sometimes we are able to attribute estimate errors to the model usage. "The Black-Scholes formula really did not do a great job of forecasting office copier expense this year". This is similar to the division President attributing the Net Income shortfalls to "Allocated Overhead", which is based on a complex methodology nobody understands.

Tolerate ambiguity and uncertainty - underlying the topic of this post is the fact that we need to provide estimates that people can rely on. This is fundamentally impossible, since none of us can predict the future. The more uncertainty that we can tolerate, the more we can accept ranges of possible outcomes without needing to know the answer. After all, the only certainties in life are death and taxes, right? So in the case of budgets and projects, just suck it up and deal with the fact that there is uncertainty.

Don't tie negative events to the estimation process - the more our bonus depends on staying within budget, the more cushion that budget will contain. The more we separate the budget process from the organization's carrots and sticks, the less need there will be to add cushion. We can then have more open, frank and candid dialogues about the drivers of the numbers, which ultimately are the items that need to be managed.

While the actions above will work more often than not, by far the best things we can do are:

Establish a great working relationship with others - budget and project estimates are few and far between in the grand scheme of things. If we demonstrate reliability and credibility in our ongoing relationships, day in and day out - with our bosses, those upstream and downstream in our workflows, and everywhere else that we can - then an occasional blip here and there, for a good reason, will be forgiven because of the strong relationships we have established. However, a relationship has to be reciprocated. We can have a great relationship with someone, but if that someone will 'throw us under the bus' if it serves their interests, then the relationship can only be developed so far.

Take the time to explain - if we sit down with the other person and 'step them through' the thought process and modeling that have led us to the conclusion we've arrived at, we provide them a number of opportunities: appreciation of the seriousness with which we have responded to their request; ability to ask questions about the process, information generated and the results; and a tangible experience of the train of thought that we ourselves have experienced. Because this has occurred, they are in a much better position to understand why things might come in better or worse than originally estimated. They no longer interpret the event in terms of our credibility or reliability, because they had originally come to the same conclusion themselves.

Key Takeaways

In order to be impactful, we need to maintain a reputation of credibility and reliability within our organization. Tasks that require estimates, such as projects and budgets, are ones that can potentially undermine this reputation. Understanding the skewed nature of the estimation process helps us manage these potential impacts.

Questions
    ::What are actions you have been able to take that made your estimates easier to deliver?
    ::What suggestions would you give to Mei to help her answer Tuck's question?

Add to the discussion with your thoughts, comments, questions and feedback! Please share Treasury Café with others. Thank you!

1 comment:

  1. I usually follow schedules to a tee. So I wonder why it includes fluff.
