A high-profile workshop, Statistical Challenges in Assessing and Fostering the Reproducibility of Scientific Results, was organized by the National Academies of Sciences and the National Science Foundation and took place in Washington, DC, last year. The workshop summary report was recently published by the National Academies Press.
Here is an extract, which lists recommendations from the panel discussion:
- Establish publication requirements for open data and code. Journal editors and referees should confirm that data and code are linked and accessible before a paper is published. (Keith Baggerly)
- Clarify strength of evidence for findings. The strength of evidence should be clearly stated for theories and results (in publications, press releases, etc.) to ensure that initial explorations are not misrepresented as being more conclusive than they actually are. (Keith Baggerly)
- Align incentives. Communities need to examine how to build a culture that rewards researchers who put effort into verifying their own results rather than rushing to publication. (Marcia McNutt)
- Improve training.
  - Institutions need to make extra efforts to instill in students an ethos of care and reproducibility. (Marcia McNutt)
  - Universities need to change the curriculum to incorporate topics such as version control, code review, and general data management, and communities need to revise their incentives to improve the chances of reproducible, trustworthy research in the future. Steps to improve the future workforce are necessary to maintain public trust in science. (Randy LeVeque)
  - Many graduates are well steeped in open-source software norms and ethics, and they are used to this as a normal way of operating. However, they come into scientific research settings where code is not shared, transparent, or open; instead, code is built in a way that feels haphazard to them. This training disconnect can interfere with mentorship and with their continuation in science. Better understanding of these norms is needed at all levels of research. (Victoria Stodden)
  - Prevention and motivation need to be components of instilling the proper ethos. This could be part of National Institutes of Health (NIH)-mandated ethics courses. (Keith Baggerly)
- Clarify terminology. A clearer set of terms is needed, especially for teaching students and creating guidelines and best practices. One example of how to do this comes from the uncertainty quantification community, which successfully clarified the terms "verification" and "validation," which were used almost synonymously 10-15 years ago. (Ronald Boisvert)
The authors of these recommendations are:
- Keith Baggerly, a Professor at UT MD Anderson Cancer Center, best known as a practitioner of “forensic reproducibility” in bioinformatics.
- Ronald Boisvert, the head of the Applied and Computational Mathematics Division at the National Institute of Standards and Technology (NIST).
- Randy LeVeque, a Professor of Applied Mathematics at the University of Washington and the author of Top Ten Reasons To Not Share Your Code (and why you should anyway).
- Marcia McNutt, the editor-in-chief of Science and the president-elect of the National Academy of Sciences.
- Victoria Stodden, Associate Professor at the University of Illinois at Urbana-Champaign and a co-editor of Implementing Reproducible Research.