Why rankings of biomedical image analysis competitions should be interpreted with care

Maier-Hein, Lena; Eisenmann, Matthias; Reinke, Annika; Onogur, Sinan; Stankovic, Marko; Scholz, Patrick; Arbel, Tal; Bogunović, Hrvoje; Bradley, Andrew P.; Carass, Aaron; Feldmann, Carolin; Frangi, Alejandro; Full, Peter M.; van Ginneken, Bram; Hanbury, Allan; Honauer, Katrin; Kozubek, Michal; Landman, Bennett; März, Keno; Maier, Oskar; Maier-Hein, Klaus; Menze, Bjoern H.; Müller, Henning; Neher, Peter F.; Niessen, Wiro; Rajpoot, Nasir; Sharp, Gregory C.; Sirinukunwattana, Korsuk; Speidel, Stefanie; Stock, Christian; Stoyanov, Danail; Taha, Abdel Aziz; van der Sommen, Fons; Wang, Ching-Wei; Weber, Marc-André; Zheng, Guoyan; Jannin, Pierre; Kopp-Schneider, Annette

doi:10.1038/s41467-018-07619-7

Maier-Hein, L. (Lena), Eisenmann, M. (Matthias), Reinke, A. (Annika), Onogur, S. (Sinan), Stankovic, M. (Marko), Scholz, P. (Patrick), Arbel, T. (Tal), Bogunović, H. (Hrvoje), Bradley, A.P. (Andrew P.), Carass, A. (Aaron), et al.

2018-12-01

Nature Communications, Volume 9, Issue 1

International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted up to now. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results are often hampered because only a fraction of the relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables, such as the test data used for validation, the ranking scheme applied, and the observers who make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future.

Additional Metadata
Persistent URL	doi.org/10.1038/s41467-018-07619-7, hdl.handle.net/1765/113081
Journal	Nature Communications
Grant	This work was funded by the European Commission 7th Framework Programme; grant id fp7/318068 - VISual Concept Extraction challenge in RAdioLogy (VISCERAL)
Organisation	Department of Radiology
Citation	Maier-Hein, L. (Lena), Eisenmann, M. (Matthias), Reinke, A. (Annika), Onogur, S. (Sinan), Stankovic, M. (Marko), Scholz, P. (Patrick), … Kopp-Schneider, A. (Annette). (2018). Why rankings of biomedical image analysis competitions should be interpreted with care. Nature Communications, 9(1). doi:10.1038/s41467-018-07619-7
