Fresh off our multi-part post, Wikifinance, Wikitreasury, where we explored changes our profession may face, it is worth focusing on some aspects of the networked world that are problematic.
One of these areas is the concept of “liking” and “rating”. These were mentioned in separate blogs I ran across. In "The Future of S&P Downgrades: FYI, CYA, or LOL", the authors predict that Standard & Poor's can, and likely will, be made obsolete as people become more networked. Rating debt issuances will be similar to a LinkedIn discussion or Facebook posting: we will “like” items, and by virtue of this the really good debt will be rated higher.
In "Thinking About Curation in the Enterprise", the claim is that “peer-reviewed and peer-ranked expertise” is something web technology is good at.
Statistics Principle – Random Samples from a Population
The field of statistics has methodologies to help us figure out whether things are different. If we pull 10 balls out of bucket A, and some are red and some are blue, and 10 balls out of bucket B, also some red and some blue, there are tests we can employ to determine how likely it is that both buckets are similar and have about the same mix of red balls and blue balls.
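For readers who like to see the mechanics, here is a minimal sketch of one such test, Fisher's exact test, written in plain Python. The ball counts are made up for illustration, and the function name is my own. Other tests could be used just as well.

```python
from math import comb

def fisher_exact_two_sided(red_a, blue_a, red_b, blue_b):
    """Two-sided Fisher exact test p-value for two draws of red/blue balls."""
    n_a = red_a + blue_a              # balls drawn from bucket A
    n_b = red_b + blue_b              # balls drawn from bucket B
    total_red = red_a + red_b
    total = n_a + n_b

    def prob(k):
        # Chance that A's draw holds exactly k red balls if both buckets
        # really have the same mix (hypergeometric probability)
        return comb(total_red, k) * comb(total - total_red, n_a - k) / comb(total, n_a)

    observed = prob(red_a)
    lowest = max(0, n_a - (total - total_red))
    highest = min(n_a, total_red)
    # Add up every outcome that is at least as unlikely as the one we saw
    return sum(p for p in (prob(k) for k in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))

# Hypothetical draws: 7 red / 3 blue from bucket A, 4 red / 6 blue from bucket B
p_value = fisher_exact_two_sided(7, 3, 4, 6)
print(f"p-value: {p_value:.3f}")
```

A large p-value means the two draws are quite consistent with buckets that have the same mix; a small one suggests the buckets differ.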
Underpinning these methodologies is the fact that we need to draw a random sample from the population. This means that any one ball is just as likely to be picked as any other ball. So if we pick every third ball, that is not a random sample, since every first and second ball had no likelihood of being picked. Any results of our statistical tests are not valid in this case.
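A small sketch can make the difference concrete. The bucket of 30 numbered balls is a made-up example, and the fixed seed is there only so the run is repeatable.

```python
import random

balls = list(range(1, 31))  # hypothetical bucket of 30 numbered balls

# Simple random sample: every ball has the same chance of being picked
rng = random.Random(42)     # fixed seed, only for repeatability
random_sample = rng.sample(balls, 10)

# "Every third ball": balls 3, 6, 9, ... are always picked, the rest never are
every_third = balls[2::3]

never_picked = [b for b in balls if b not in every_third]
print("random sample:", sorted(random_sample))
print("every third:  ", every_third)
print("balls with no chance under 'every third':", len(never_picked))
```

Both approaches pull 10 balls, but under the second one 20 of the 30 balls could never show up in our sample.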
What Does a “Five-Star” Rating Tell Us?
So if we are on Amazon, comparing books about blogging, and one is rated 5 stars and another is rated 4 stars, can we be confident that the book with the higher rating is better?
From a statistics point of view, we cannot, since the ratings do not come from a random sample. The ratings do not tell us anything!
Why is this the case? Don’t we like to see polls and votes to see who wins? Don’t we use them in real life for lots of important things – US Presidents, legislation, whether our office is going for Chinese or pizza today? Don’t the book ratings provide that same type of thing?
For one, there is a subtle difference in these examples versus ratings. For a political election, or an opinion survey, we get one vote. When we go onto Amazon, we may choose to make no ratings, one rating, or one hundred ratings.
When Encountering a Rating – Let the Viewer Beware!
What if the 5-star rating came from a total of 3 people and the 4-star rating is from 30? Statistical reasoning here would say that we can be more confident in the 4-star rating since it comes from a larger sample. But again, only if the sample were random.
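To put rough numbers on the sample size point, here is a sketch using the Wilson score interval. It simplifies each rating to "happy" or "not happy", which is my assumption and not how star ratings actually work, and the counts (3 of 3 versus 27 of 30) are invented. It also assumes the raters are a random sample, which is exactly what we cannot count on.

```python
from math import sqrt

def wilson_interval(positive, n, z=1.96):
    """Approximate 95% Wilson score interval for a share of happy raters."""
    p = positive / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical counts: 3 of 3 raters were happy vs. 27 of 30
small = wilson_interval(3, 3)
large = wilson_interval(27, 30)
print(f"3 raters:  {small[0]:.2f} to {small[1]:.2f}")
print(f"30 raters: {large[0]:.2f} to {large[1]:.2f}")
```

The interval for 3 raters comes out much wider than the one for 30, so the perfect score tells us less than it appears to.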
To address this, some folks do not trust a rating unless there is a minimum number of people who have rated it. Even that is not fool-proof. What if we discovered that all of the 5-star book ratings came from friends of the authors? And one of them had a lot of friends, and the other not as many? Does the rating measure the value of the book in this case? No, it measures the number of friends.
Are two books, each with 10 ratings of 5, equally good? What if 1 out of 10 book purchasers who like a book will rate it, while 1 out of 100 of those who do not like a book will rate it? What if we were told one of the books has sold 10,000 copies and the other 100 copies?
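We can work through those numbers with a short sketch. The 1-in-10 and 1-in-100 rating rates and the sales figures come from the questions above. The function names, and the 50/50 split of fans and non-fans in the second part, are hypothetical choices of mine for illustration.

```python
ONE_IN_LIKERS = 10       # 1 in 10 buyers who like a book rate it
ONE_IN_DISLIKERS = 100   # 1 in 100 buyers who dislike a book rate it

def implied_fans(five_star_ratings):
    """Buyers who liked the book, backed out from the number of 5-star ratings."""
    return five_star_ratings * ONE_IN_LIKERS

def share_of_ratings_from_fans(true_fan_share):
    """Share of all ratings that come from fans, given the true share of fans."""
    from_fans = true_fan_share / ONE_IN_LIKERS
    from_others = (1 - true_fan_share) / ONE_IN_DISLIKERS
    return from_fans / (from_fans + from_others)

# Each book has 10 five-star ratings, but very different sales
for copies_sold in (10_000, 100):
    fans = implied_fans(10)
    print(f"{copies_sold} sold: {fans} implied fans, {fans / copies_sold:.0%} of buyers")

# Hypothetical: only half of all buyers actually like the book
print(f"share of ratings from fans: {share_of_ratings_from_fans(0.5):.0%}")
```

Ten 5-star ratings imply about 100 fans either way, which is every buyer of the small seller but a sliver of the big seller's buyers. And when fans are ten times more likely to rate, a book that only half its buyers like still ends up with ratings that come overwhelmingly from fans.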
Be careful out there!
I would love to hear your thoughts about problems with internet rating systems or your stories on this topic if you have them.
Please take the time to subscribe, bookmark, or otherwise note your web presence and support of this blog if you are able.
Thanks for stopping by the Treasury Cafe!