A new article published in this month’s Annual Reviews examines the concern that many published medical findings may actually be false. This concern has been raised before, notably in PLoS in 2005, but the authors here set out to describe problems with current data analysis practices and to “point to tools and behaviors that can be implemented to reduce the problems with published scientific results.”
The authors, Jeffrey T. Leek and Leah R. Jager of Johns Hopkins, begin by defining false discoveries in medical research. They note that scientific publishing was established long before modern computing, statistics, data analysis software, and the Internet. An overabundance of data, compounded by pressure from funding agencies and other stakeholders to produce positive results, has cast a cloud of suspicion over published research. As they put it, it “is possible to confuse correlation with causation, a predictive model may overfit the training data, a study may be underpowered, and results may be overinterpreted or misinterpreted by the scientific press.”
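Overfitting, one of the failure modes the authors list, is easy to demonstrate. The sketch below is purely illustrative and not from the article; the data are simulated, and all names are hypothetical. A high-degree polynomial fit to a small noisy sample achieves near-zero training error while predicting fresh data worse than a simple line:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy linear relationship, with only 10 training points.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.3, size=10)
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test + rng.normal(0, 0.3, size=100)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

for degree in (1, 9):
    train_err, test_err = fit_and_score(degree)
    print(f"degree {degree}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

The degree-9 polynomial passes through all 10 training points, so its training error is essentially zero, yet its error on the held-out points is larger than the straight line's: the model has memorized noise instead of learning the underlying relationship.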
A central question is whether the findings of a scientific study can be reproduced and replicated. If a study cannot be reproduced or replicated, the original report may be a false discovery: a published result driven by incorrect data analysis rather than a real effect. Leek and Jager summarize these factors and the rates of false discoveries across medical research.
The authors offer ideas and tools for improving the scientific literature and data analysis practice: better publication incentives, more open sharing of data, and stronger data analysis skills among researchers. Leek and Jager admit they may be overstating the problem of false research, but conclude that researchers must adapt to ever-increasing amounts of data through best practices, training, and the right tools to improve the accuracy of published results.