One of your risk professionals comes up to you and says “We should purchase this analytic system because it successfully predicted the downfall of the financial institutions that failed during 2008. It’s great!” How should we respond?
To answer this question, we need to take a small detour into the wonderful world of statistics.
Statistical comparisons involve two sets of data, the “control” group and the “treatment” group. There are two possible relationships between these two groups: either there is nothing conclusive to show that they really differ (the Null Hypothesis), or there is (the Alternative Hypothesis).
The person doing the investigating is usually looking to provide evidence for the Alternative Hypothesis. Alternative Hypotheses are based on questions such as: Does this drug work? Are guilty people sent to jail? Does this model adequately predict firms’ financial ruin?
The answer is determined using statistical techniques. Two outcomes are possible: 1) we “fail to reject the Null Hypothesis,” meaning there is no evidence that the two groups differ, or 2) we “reject the Null Hypothesis,” meaning the two groups are significantly different.

Poor Mr. or Ms. Null!
Imagine proposing to someone that way – nice dinner, get down on your knee, bring out the ring box, and say “I fail to reject you as a spouse”! Pretty romantic, isn’t it?
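To make those two outcomes concrete, here is a minimal sketch of a two-sample comparison in Python. The sample numbers are made up purely for illustration, and the normal approximation is a shortcut; real work would use a proper t-test from a statistics library.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical sample data (illustrative numbers only):
control   = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7]
treatment = [10.9, 11.2, 10.8, 11.0, 11.3, 10.7, 11.1, 10.9]

# Standard error of the difference in means (Welch-style),
# using a normal approximation instead of a t-distribution.
se = math.sqrt(stdev(control) ** 2 / len(control) +
               stdev(treatment) ** 2 / len(treatment))
z = (mean(treatment) - mean(control)) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # conventional significance level
if p_value < alpha:
    print("Reject the Null Hypothesis: the groups differ.")
else:
    print("Fail to reject the Null Hypothesis.")
```

With these made-up samples the difference in means is large relative to the noise, so the sketch lands on “reject”; shrink the gap between the two lists and it flips to “fail to reject.”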
I am sure some readers know the technical reason for this terminology (if so, please leave a comment!), but the thing I think of (maybe because it is easier to remember) is Nassim Taleb’s discussion in “The Black Swan”, which went something like the following:
We can count 10,000 swans, or every swan we have ever seen, and they can all be white, but that does not prove that “all swans are white”. It just means that the ones we have observed are.

However, we can count 10 swans, or as few as 2, and if one of them is black, that does prove that “not all swans are white”.

One thing is provable, and one is not. Hence the funky terminology about “rejecting” or “failing to reject” the Null Hypothesis.
So we have two statistical outcomes of the data based on the Null Hypothesis, and two sets of data in real life that may or may not be different.
Any consultant knows that this should become a 2x2 matrix! This matrix will have one axis cover the statistical conclusions regarding the data and the other axis describe the actual reality of the data.
In two of the four boxes, reality matches the conclusion, and in the other two it does not.

The Romance Continues - Adding Error to Rejection
Statisticians have come up with some great terms for the other two boxes in this matrix. It is with great pride that I present to you these two inspired terms:
· Type I error
· Type II error

We can all see why they award PhDs for this kind of stuff!
Our matrix is now complete, as follows:
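The four cells can also be sketched in Python, pairing each statistical conclusion with each possible reality; the labels here are the standard ones, spelled out in the article’s own terms:

```python
# The 2x2 matrix: our statistical conclusion vs. the actual reality.
# Keys are (conclusion, reality); values name the resulting cell.
matrix = {
    ("reject null",         "groups differ"):       "Correct",
    ("reject null",         "groups are the same"): "Type I error (false alarm)",
    ("fail to reject null", "groups differ"):       "Type II error (missed difference)",
    ("fail to reject null", "groups are the same"): "Correct",
}

for (conclusion, reality), cell in matrix.items():
    print(f"{conclusion:<20} | {reality:<20} | {cell}")
```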
We Now Conclude Our Detour and Return to Our Scenario

Going back to our original scenario, we have been presented with this fantastic but costly software that is able to predict a firm’s financial failure with 100% accuracy.
The question we need to ask is: “What is the full set of prediction data this model generated, and how does it compare to the alternative?”
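One way to answer that question is to score the model’s complete prediction list against what actually happened. The counts below are purely hypothetical (the article does not give exact figures), but the arithmetic shows how a perfect record on the actual failures can coexist with a pile of false alarms:

```python
# Hypothetical counts, for illustration only (not from the article):
predicted_to_fail = 30   # institutions the model flagged as future failures
actually_failed = 5      # flagged institutions that really did fail

type_i_errors = predicted_to_fail - actually_failed   # false alarms
precision = actually_failed / predicted_to_fail       # share of flags that were right

print(f"False alarms (Type I errors): {type_i_errors}")
print(f"Share of failure predictions that came true: {precision:.0%}")
```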
In the actual outcome of the above, this financial model predicted that over 30 financial institutions were going to fail! In other words, it threw out the baby with the bathwater!

Key Takeaways
While we do not want to “throw out the baby with the bathwater,” neither do we want to “keep the bathwater with the baby!”

Questions
· What has been your experience with Type I and Type II errors?
· What has been your experience regarding the omission of one of the two?
Add to the discussion with your thoughts, comments, questions and feedback! Please share Treasury Café with others. Thank you!