Douglas Hubbard is known for saying, “you need less data than you think you do, and you have more data than you think you have.” I generally agree. Simple models that explain simple issues are usually preferable to the more complex black-box models that are all singing and all dancing.
But what happens when we disregard Hubbard’s adage? This is where leveraging happens. Incorrect data is used in inappropriate ways and extrapolated to fill gaps it should not. Let’s use the real-world, topical example of Theranos. The full story is now a book and is being made into a film, and it is well worth looking up, but in brief: Theranos was a health diagnostic technology company. It raised hundreds of millions of dollars from investors to research and develop miniaturized blood testing equipment that was not invasive. Had it worked, it would have revolutionized the treatment of afflictions like diabetes.
Alas, Theranos’ castle was made of sand. Its initial premise was founded on research that was not peer reviewed, and it claimed to use a smaller blood testing machine that could perform multiple tests, including immunoassays, general chemistry assays and haematology assays, with a fraction of the sample its competitors required. The reason competitors’ machines were larger and bulkier was that they needed to hold larger blood samples to perform a single test accurately. Theranos was leveraging spurious research about blood consistency to make medical assumptions, combined with a flawed machine learning method that would fill in prediction gaps. Once peer reviewed, Theranos’ results showed no statistical significance and varied wildly between samples. The company has had a rather dramatic and public downfall that should be a warning to all about the limits of data and research.
In ‘Why Most Published Research Findings Are False’, John Ioannidis examines the causes and effects of incorrect data interpretation and shows how even commonly accepted findings have a high probability of being false. Go read it, it’s great! Here is the takeaway from this post, dear reader: unless you are an academic researcher with a wealth of statistical knowledge, a pot of money to fund your experiment and a robust peer review process, keep it simple and learn to both live with and estimate bias and error. Being honest about what you know and don’t know is not a weakness. It shows a level of insight most people don’t have.
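To make "estimate error" a little more concrete, here is a minimal sketch in Python of reporting a simple average together with a rough margin of error rather than a bare number. The function name `mean_with_error` and the sales figures are made up for illustration, and the 1.96 multiplier assumes roughly normal errors, so with a sample this small the real uncertainty is likely wider than the margin shown.

```python
import math
import statistics


def mean_with_error(sample, z=1.96):
    """Return (mean, margin) for an approximate 95% interval.

    Assumption: errors are roughly normal. For very small samples this
    understates the uncertainty, so treat the margin as a lower bound.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate error")
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return mean, z * std_err


# Hypothetical weekly sales figures, invented for this example.
weekly_sales = [102, 98, 110, 95, 105]
mean, margin = mean_with_error(weekly_sales)
print(f"estimate: {mean:.1f} +/- {margin:.1f}")
```

The point is not the formula but the habit: a figure reported with its uncertainty is harder to leverage into something it cannot support.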
Never leverage what you know into something you don’t know. Instead, offer a potential solution. “If we knew this” or “if we had data on that” is a perfectly fine thing to say. If you keep your business from going down the Theranos track, the people in it will thank you for it. Maybe not now, but they will down the road.