This post on the temptation of data raised an interesting aspect of Big Data: that we run the risk of being overwhelmed by data, and that organisations and investors are looking too hard for the one piece of data that will be the key to success. Drowning in data is a real risk to an enterprise and something that great leaders are aware of (lesson 3 in this awesome leadership slide deck from Colin Powell: “Experts often possess more data than judgement”), but being swamped by the data is not the only problem.
Another problem with casting the net of Big Data this wide is that, if you keep looking hard enough, you will eventually find some sort of pattern, any sort of pattern (there’s an XKCD for this). The human brain is a fantastic pattern recognition system but it is easily fooled (see pareidolia), and it’s far too easy to make mistakes like confusing correlation with causation.
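To make the "keep looking and you will find something" point concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: it generates 1,000 hypothetical "metrics" of pure random noise, correlates each one with an equally random "outcome", and counts how many pass the usual 5% significance threshold. The names, the sample size of 20 and the seed are all assumptions of mine, not anything from a real data set.

```python
import math
import random

N_SAMPLES = 20   # observations per metric (an arbitrary choice for this sketch)
T_CRIT = 2.101   # two-sided 5% critical value of Student's t for 18 degrees of freedom

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def count_spurious_hits(n_metrics=1000, seed=0):
    """Count noise-only metrics that look 'significantly' correlated with a noise outcome."""
    rng = random.Random(seed)
    # Convert the critical t value into the equivalent critical |r| for this sample size.
    r_crit = T_CRIT / math.sqrt(T_CRIT ** 2 + (N_SAMPLES - 2))
    outcome = [rng.gauss(0, 1) for _ in range(N_SAMPLES)]
    hits = 0
    for _ in range(n_metrics):
        metric = [rng.gauss(0, 1) for _ in range(N_SAMPLES)]
        if abs(pearson_r(metric, outcome)) > r_crit:
            hits += 1
    return hits

hits = count_spurious_hits()
print(f"{hits} of 1000 pure-noise metrics look significant at the 5% level")
```

None of these metrics has any real relationship with the outcome, yet roughly one in twenty should clear the bar purely by chance. Trawl through enough columns and a "finding" is all but guaranteed.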
The normal solution to this is to use statistics. At first glance this sounds fair, but how many start-ups have expert statisticians on staff to stop them making mistakes? Even if a business is lucky enough to have experts, does that mean you can trust their predictions? A recent bit of research would say no. An article that looked at how statistics were performed in neuroscience research found a basic error in one type of analysis (the difference in differences) that is common when analysing what are effectively A/B tests. The error was so widespread that it was seen in more than half the papers that performed this sort of analysis! Every scientist will have at least a passing expertise in statistics, given the importance of significance in the field, and will typically have a battery of well-understood statistical tools (chi-squared, Student’s t-test and so on) that they know how to use. That neuroscientists can make such simple errors says a lot about how easy it is to get statistics wrong (and I’m not just saying that because I used to be one!).
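Here is a rough sketch of how this kind of mistake plays out, assuming the error in question is the familiar one of comparing two significance verdicts ("A worked, B didn't") instead of testing the difference between the two effects directly. The effect sizes and standard errors below are invented purely for illustration, and I'm using a simple two-sided z-test on independent groups rather than whatever the papers themselves used.

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided p-value for a z-test of estimate / std_error against zero."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical uplift (and its standard error) measured in two separate A/B tests.
effect_a, se_a = 25.0, 10.0
effect_b, se_b = 10.0, 10.0

p_a = two_sided_p(effect_a, se_a)   # is variant A's effect different from zero?
p_b = two_sided_p(effect_b, se_b)   # is variant B's effect different from zero?

# The comparison that actually matters: is A's effect different from B's?
# For independent estimates the standard errors combine in quadrature.
diff = effect_a - effect_b
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
p_diff = two_sided_p(diff, se_diff)

print(f"A vs zero: p = {p_a:.3f}")
print(f"B vs zero: p = {p_b:.3f}")
print(f"A vs B:    p = {p_diff:.3f}")
```

With these made-up numbers A comes out "significant" and B does not, which tempts you to conclude that A beats B. The direct test of A against B is nowhere near significant, so that conclusion is not supported by the data.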
So what should the business person take from this? Simply that Big Data is both a complex topic and an amazing tool for business, but it is a tool and no more than that. Data should be used to colour business leadership, not as a substitute for it.